Properties – A False Sense of Encapsulation
I’m having more and more mixed feelings about properties in .NET, or accessor methods in general, as properties compile to get_xxx/set_xxx methods behind the covers. I’ve never thought about it that much, but now they seem to have become a warning flag of bad design.
Everyone knows by now that public fields are evil. I’m not going to elaborate on that anymore. However, I’ve come to believe that properties are not the answer either. When a class has a particular number of private fields, then the general malpractice is to also include corresponding properties for these private fields. Whether these properties are read-only or include both getters and setters is not important. The main point is that this habit should be questioned. To me, the use of properties should be applied with great care.
But first, let me make a clear distinction between objects and data structures. An object hides its data and exposes methods to operate on that data. These methods provide the behavior of an object. A data structure, also known as a Data Transfer Object (DTO), only exposes its data and has no methods. They can be seen as data carriers that have no behavior whatsoever. Objects are best used in domain models, while DTO’s, as their names imply, are best used for transferring data from one layer to another.
Why this distinction? For starters, because I know a lot of people who talk about DTO’s as if they were full blown objects. They use DTO’s as their primary domain inhabitants which leads to the Anemic Domain Model anti-pattern. And secondly, because the use of properties mentioned in this post obviously applies to objects and not DTO’s.
As I already stated in the definition above, objects hide their data. Relentlessly exposing this data through properties breaks encapsulation of these objects. You might say that this isn’t the case because properties ‘hide’ the private member variables of the class. But properties are just another intricate way of exposing private data.
Let’s go through some examples of bad code, shall we.
Inconsistent objects
Suppose we have a class named Company in our domain. Something we see very often is something like the following:
public class Company { private String _city; private String _country; public String City { get { return _city; } set { _city = value; } } public String Country { get { return _country; } set { _country = value; } } }
Suppose we have the requirement that the user of our system can change the location of our company in case it spontaneously settles itself in a different country. The calling code would look something like this:
company.City = "Erps-Kwerps"; company.Country = "Belgium";
As you can see, this operation is clearly not atomic. It allows you to have inconsistent objects. Suppose the location was previously Hong Kong, China. The object becomes inconsistent after having the City property modified to ‘Erps-Kwerps‘, which definitely a long way from China.
The better approach would be the following:
public class Company { private String _city; private String _country; public void ChangeLocationTo(String city, String country) { _city = city; _country = country; } }
Removing the properties guarantees encapsulation. Providing the ChangeLocationTo method not only makes the requirements explicit in code, but it also ensures that the operation is performed atomically.
The calling code gets really easy now:
company.ChangeLocation("Erps-Kwerps", "Belgium");
The most important thing here is that code that belongs to the Company class has now moved to where it belongs.
Procedural code
This is probably the anti-pattern I see he most: procedural programming in an OO language aka Cobol in C#. Behold the following application service:
public class BeerService { public void OrderDrink(Int64 customerId) { Customer customer = Repository<Customer>.Get(customerId); if(customer.Age > 16) { // Cheers! } } } public class Customer { private Int32 _age; public Int32 Age { get { return _age; } set { _age = value; } } }
This very simple example is quite obvious. The logic in the application service extracts the data from the object. The OO way would be to put the logic where the data is, namely the Customer class:
public class Customer { private Int32 _age; public Boolean IsOlderThanMinimumAge() { return customer.Age > 16; } public Boolean CanHaveBeer { if(IsOlderThanMinimumAge()) { return true; } return false; } }
See, no more properties and most importantly, we now adhere to Tell, Don’t Ask principle as described in The Pragmatic Programmer. We don’t ask for the data, we tell the object that has the data to do the work for us.
Hello Real World
I know these examples are very simplistic and that real-life isn’t always as shiny. There are a lot of cases where adding properties is the right thing to do. The first thing that comes to mind are unit tests on domain objects. Suppose I write a unit test that calls a particular method on an object. The unit test probably wants to verify the state (data) of the domain object. I’ve seen solutions from some developers who let the domain object handle the assertions, which implies that the assemblies of the unit test framework of choice would be deployed in the production environment.
Another problematic scenario would be a mapping class, which is responsible for mapping the data of a domain object to a DTO. One could argue that this could be done by the domain object itself, but what if there are multiple DTO’s (different views) for the same domain object. Also smells like SRP violation to me. Another option would be to use reflection.
A while ago, I was exploring the code of the TimeAndMoney DDD example (yes, Java code). While I was reading the code of the Money class, I found this intention-revealing approach for adding getters to a class: (Mayday, Mayday, Java code coming up 😉 )
/** * How best to handle access to the internals? It is needed for * database mapping, UI presentation, and perhaps a few other * uses. Yet giving public access invites people to do the * real work of the Money object elsewhere. * Here is an experimental approach, giving access with a * warning label of sorts. Let us know how you like it. */ public BigDecimal breachEncapsulationOfAmount() { return amount; } public Currency breachEncapsulationOfCurrency() { return currency; }
Maybe role interfaces described by Martin Fowler and popularized by Udi Dahan can provide a clean solution here. A domain object that implements two types of interfaces:
- interfaces containing nothing but property getters (query).
- interfaces containing a single method (command).
If you, my dear reader, have any thoughts about this, I’m glad to hear them.
Conclusion
As always, there are two camps here: on the left are the people who don’t believe that properties violate encapsulation and on the right are the people who despise properties and never ever use them in their code. I’m currently sitting somewhere in between (I keep seeing gray ;-)). I do believe that we should quit with the habit of providing properties for private member variables by default. We should only provide access to private data when there is no other clean alternative. As always, being dogmatic is not a good thing. Reducing visibility of data as much as possible leads to more reliable and maintainable code. We should beware of code that calls more than a single method on the same object. Properties are not evil, but in a lot of cases inappropriate. We should be able to resist to this temptation.
To round of this long post, let me just point you to an article I found on Java World (yes, again Java) written in 2003 by Allen Holub, named Why getter and setter methods are evil (don’t ask how I found it) which explains the case I’m trying to make with this post in a much clearer way.
Update 09/27/2008: It seems that this topic is also being discussed on the ALT.NET user group.
Shouldn’t you be using a MinimumDrinkingAgeChecker instead of pushing that business logic check onto the customer?
@David: In this simple example, pushing it to a seperate class would make the Customer class anemic and would make properties a necessity again. One should put the behaviour where the data is (the Customer object).
Anemic or not, I have to agree with David. You will already have to modify CanHaveBeer() to accept some sort of CurrentLocation, because your Customer wouldn’t be old enough to drink when visiting my country 🙂
Consider the more plausible example of determining if an order qualifies for free shipping. Are you telling me you would hang that behavior on your Order domain object? Me thinks not.
@Bill: if making Customer culture-aware is not a requirement, leaving the code in Customer makes sense because everything is self-contained.
This isn’t comparable with Free Shipping as that code requires interaction from all sorts of entities, it makes sense to have a Service orchestrate that.
If, through agile design, we’ve uncovered a new requirement [that we’re going to be operating bars all over the world] then we just change the design and promote the age code to a culture-aware service; otherwise, YAGNI.
I think those yelling YAGNI are missing the point — once you *do* need it, one of the premises of this blog post falls down — i.e. that business rules belong inside of domain objects.
He’s not saying ‘never use services,’ and neither am I. So what does that have to do with the premise (that reliance on properties are a warning of bad design’)?
@OAB: There are a lot of scenarios where Domain services are the right thing to do. As you know, they are part of the domain as well. The gist of this post is to avoid getters/setters until you have a valid case for them.
Building an anemic domain model for a complex system is an anti-pattern I’ve seen too much the last couple of years. Such a code base typically has a lot of duplicate code an is very hard to maintain.
An anemic domain is perfectly valid in simple, data viewing applications. This is the active-record scenario where the objects are DTO’s.
All I’m saying is that by putting the behavior where the data is, I’m achieving more orthogonality. This is what OOP is all about.
You can read my view on Data Structures and Encapsulaed Objects here: http://morten.lyhr.dk/2008/10/data-structures-vs-encapsulation.html
Maybe I am to old school on Object design. I always thought of properties as fairly obvious. For example location is a property. It probably should be readonly but it is a property of a person no matter what.
Changing location is a behavior of a person. So by allowing the user to change the location property without invoking a method (in this case) to re locate themselves seems wrong.
Color might be another good example. I can not change my color with out invoking some behavior (golaybythepool(…) – which right now sounds like a good idea).
So I guess I would say that you should expose the properties as read only, since they are properties of the object. Then provide methods to change readonly properties acording to the buisness rules.
Either way properties can include buisness rules as needed. I you want simple poperties you should use the new automatic properties in C# 3.0 and then they are truly simple properties with no buisness rules asssociated with them. By using automatic properties you enforce that restriction. That way if you write the code for a property get/set as above it would imply you have additional logic to enforce beyond setting the local variable..
I have mixed feelings about automatic properties. To me, they like public member variables all over again.
I try to only incorporate a property when I’m with my back against the wall. If I can deliver cleaner code by incorporating a method, than I always choose for this option.
Definitely going to give this a good read before responding but on the Time And Money example, I think this is a special case where you’re breaking the encapsulation provided by a simple value object (Money) to provide access to the “raw” value. Design wise that’s a very different world from using a property to allow you to set a Customers Address vs. using methods for get/set.