Eliminating Comments: the Road to Clarity

April 18th, 2010
I used to think commenting my code was the responsible thing to do. I used to think that I should have a comment for just about every line of code that I wrote. After my first read of Code Complete, my views changed pretty drastically. I began to value good names over comments. As my experience has increased, I have realized more and more that comments are actually bad. They indicate a failure. In Clean Code Robert Martin says
"The proper use of comments is to compensate for our failure to express ourself in code. Note that I used the word failure. I meant it. Comments are always failures."
Those are some pretty strong words. You may be reeling back a bit after reading that. It is a normal reaction, but if you'll drop your defenses for a second, I'll help you understand why.

Let's start with an example

Which is clearer to read? This:
int i = 5;  // Start at 5, since there are 5 widgets in our production system.
//  If the number of widgets is evenly divisible by the time, 
//  and we have exactly 5 sockets we need a new one.
if((i % this.t) == 0  && s.Count() == 5)
{
     i++;
}
Or:
int currentWidgetCount = DEFAULT_PRODUCTION_WIDGET_COUNT;
if(ShouldCreateNewWidget(currentWidgetCount))
{
     currentWidgetCount++;
}

private bool ShouldCreateNewWidget(int count)
{
     return CountEvenlyDivisibleByTime(count) && SocketCountAtThreshold();
}
Even though the second example is more code, the code is self-documenting. The names of the variables and methods themselves tell you what the code is doing. They also abstract out the concepts so that you don’t have to think about the entire piece of logic at one time. This simplifies your code.

In the above example, you don’t have to look at the method ShouldCreateNewWidget. You can deduce that the logic checks whether it should create a new widget based on the count, and if it should, it does. If you want to know how that is determined, you drill into the method, and you can see that if the count is evenly divisible by time and the socket count is at the threshold, then the answer is true; otherwise it is false.

Consider again the first example. What happens if that logic changes, but the comments aren’t updated? When you read the comments, how do you know that they are even correct unless you read through and understand the logic anyway?
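
The second example above leaves its two helper methods unwritten. As a rough sketch only, here is one way the whole thing might look in Java. The class name WidgetPlanner, the time and sockets fields, and the threshold constant of 5 are assumptions made for illustration; they are not taken from any real system.

```java
import java.util.List;

// Hypothetical class: the names, fields, and the threshold value are illustrative assumptions.
class WidgetPlanner {
    static final int DEFAULT_PRODUCTION_WIDGET_COUNT = 5;
    static final int SOCKET_THRESHOLD = 5;

    private final int time;             // assumed to be non-zero
    private final List<String> sockets;

    WidgetPlanner(int time, List<String> sockets) {
        this.time = time;
        this.sockets = sockets;
    }

    int nextWidgetCount(int currentWidgetCount) {
        if (shouldCreateNewWidget(currentWidgetCount)) {
            currentWidgetCount++;
        }
        return currentWidgetCount;
    }

    private boolean shouldCreateNewWidget(int count) {
        return countEvenlyDivisibleByTime(count) && socketCountAtThreshold();
    }

    private boolean countEvenlyDivisibleByTime(int count) {
        return count % time == 0;
    }

    private boolean socketCountAtThreshold() {
        return sockets.size() == SOCKET_THRESHOLD;
    }

    public static void main(String[] args) {
        WidgetPlanner planner = new WidgetPlanner(5, List.of("a", "b", "c", "d", "e"));
        // 5 is evenly divisible by 5 and there are exactly 5 sockets, so this prints 6.
        System.out.println(planner.nextWidgetCount(DEFAULT_PRODUCTION_WIDGET_COUNT));
    }
}
```

Each helper is a single line, so the cost of drilling in stays small, and the magic number 5 for the socket check now has a name as well.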

Why are comments so bad?

It is not that comments in and of themselves are bad.  It’s more that having comments in your code indicates that your code itself is not clear enough to express its intent.  Elegant code should be able to express itself simply and completely.  Comments are not code, they are metadata about the code.  If you want to write elegant code, write elegant code, not elegant prose about code. Here are some of the major reasons why comments are bad:
  • They do not change with the code. If a refactoring tool changes the code, it cannot update the meaning of the comments; it can only change the text where it recognizes keywords.
  • A wrong comment is worse than horribly complex non-commented code. A wrong comment can lead someone completely in the wrong direction.  For that reason, a good developer always assumes comments are wrong.
  • A comment almost always expresses an absolute thing, where a well-named variable or method can express an abstract concept. Comments can actually tightly couple your code to specifics. Take the example above, where int i = 5 had a comment stating that 5 was the number of widgets in production. DEFAULT_PRODUCTION_WIDGET_COUNT communicates the same information, but at a higher level of abstraction.
  • Comments don’t show up in stack traces. I have looked through many stack traces in my career, and I can tell you for a fact that when you are looking at a stack trace, you would much prefer self-documenting method names to good comments.
  • Reading comments is optional. Reading code is mandatory.  You cannot count on anyone reading your comments.  You cannot force them to do so, but they must read the code. If you have something important to say, put it in the code.
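
The stack trace point in the list above can be shown with a small Java sketch. The class and method names here are hypothetical, invented only to illustrate the idea. The comment inside the throwing method never reaches the trace, while the method name does.

```java
class StackTraceNames {
    // Hypothetical method, named for the condition it reports.
    static void rejectWidgetWhenSocketsExhausted() {
        // This comment explains the failure, but no stack trace will ever show it.
        throw new IllegalStateException("no sockets left");
    }

    // Returns the name recorded in the top frame of the captured stack trace.
    static String nameOfFailingMethod() {
        try {
            rejectWidgetWhenSocketsExhausted();
            return "";
        } catch (IllegalStateException e) {
            // The first element is the frame where the exception was created.
            return e.getStackTrace()[0].getMethodName();
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOfFailingMethod());
    }
}
```

Whoever reads the trace in a log sees rejectWidgetWhenSocketsExhausted and gets the intent without opening the source file.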

NDoc and JavaDoc (XML style comments)

I am moving further and further away from even liking comments on public methods using JavaDoc or NDoc. Most of the time when we create NDoc-style documentation, we end up just restating the parameter name. How useful, really, is: cookieCount – the count of cookies? Does that actually help someone using your code? I would much rather write methods with intuitive names and parameters than follow an arbitrary convention that adds no real value. There is even a tool that auto-generates NDoc comments for you. Seriously, I used to think that was cool, but really, how stupid is that? Auto-generating comments? Before you disagree with me here, really think about whether embedding XML into your source code is adding any value. I understand that creating MSDN-style documentation is pretty cool. I’ve used NDoc and SandCastle myself. But wouldn’t the effort be better spent creating more meaningful names and self-explanatory APIs?
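
To make the cookieCount complaint concrete, here is a small hypothetical Java sketch. CookieJar and both of its methods are invented for illustration. The first method carries the kind of Javadoc that only repeats the parameter name; the second does the same work and lets the signature speak for itself.

```java
// Hypothetical class used only to contrast the two styles.
class CookieJar {
    private int cookiesInJar;

    /**
     * Adds cookies.
     *
     * @param cookieCount the count of cookies
     */
    void add(int cookieCount) {
        cookiesInJar += cookieCount;
    }

    // Same behavior as add; the names now carry what the Javadoc above merely repeated.
    void addCookies(int numberOfCookiesToAdd) {
        cookiesInJar += numberOfCookiesToAdd;
    }

    int cookiesInJar() {
        return cookiesInJar;
    }
}
```

Neither version tells the caller anything the other does not, which is the argument: the documentation block costs five lines and adds no information.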

Exceptions

Sometimes, you will need to write comments. But it should be the exception, not the rule. Comments should only be used when they express something that cannot be expressed in code. If you are using some mathematical equation, you might reference that equation in a comment rather than trying to codify it. If you want to write elegant code, strive to eliminate comments and instead write self-documenting code.

Update: Just wanted to add this link to my original post on Elegant Code, and the link to my main personal blog at http://simpleprogrammer.com, since some people were getting confused. Also, you can follow me on Twitter here.

  • Anu

    @Steve : I do agree that high level methods should be written in the controller/dispatcher style as you’ve posted above (I’m sure there’s a name for it). I use it wherever I can and they generally never require comments.

    I guess I took the example in your post too literally. What I meant was, when it comes to the lower level methods that really just “do one thing” (like the example in the blog post) if you split it up into multiple, non-reusable methods just to have explanatory method names, it actually becomes more work for the reader to understand it.

    So yes, the higher level methods should definitely follow a self documenting pattern, but I think there is an obvious limit to how much you split up the lower level methods.

  • Jon

    The first example is clearer, at least if you use a trick—the trick of ignoring all the comments. It remains a better starting point for a real solution:

    int widgetn = 5;
    if(!(widgetn % this.time) && sockets.Count() == 5){
    i++;
    }

    Of course, widgetn could probably be even shorter, maybe even wn, if it was fed from a meaningfully-named function.

    Ultimately, the biggest reason comments obscure code is because they space it out. Giant long names and premature abstraction do precisely the same thing.

  • http://py-sty.blogspot.com Steve Py

    John’s example is a bit confusing because of its simplicity. I’m sure he has more meaningful examples contained in production code but that’s not exactly something that’s prudent to publish on a personal, public blog. :)

    My own comments on the subject hinge on moderation. It’s not to say I never write comments because I accept that writing elegant code is accepting and fostering the evolution of code. The initial forms of the solution of a problem are not ideal, they’re riddled with assumptions, missing bits here, and superfluous bits there. This code is typically heavily commented with notes, assumptions, and lots of “Whys”. (I’m not a TDD, I’m a TSD [Test-Soon-Dev]) As the assumptions are confirmed as facts or discarded, the code is cleaned out and re-factored. Behavioural tests are added to reflect the solidified requirements, and comments are deleted. Maybe I can’t justify the time/cost to re-factor everything so some comments may remain to warn of Dragons. Some would argue that this solidification should happen during a design phase, before meaningful code is written; However I subscribe to the belief that code and tests are the design, the running, working application is the deliverable.

  • http://www.intotheweb.org Steven
  • http://strangelights.com/blog Robert Pickering

    Comments are not inherently bad, but comments that simply describe the code are generally undesirable. I think a good use of comments is to describe why something is the way it is. There’s almost always more than one way to implement an algorithm, so a comment about why you chose that method can be quite useful (especially if you already tried another method and saw it fail).

  • http://applanet.com/wickedleancode Rjae Easton

    Excellent work John. Steve’s comments are well written and thoughtful. Without tests to describe (and document) the evolving whys and wherefores – even greater pressure is put on naming, SRP, and other clean code practices.

    I wrote about this very topic here for anyone interested: http://applanet.com/2007/01/Documenting-System

  • http://www.humblecode.com Jonny

    Great article. Names, names, names – I’m telling you if I had 1 penny for every time I advised someone to change the name of a variable or method because it wasn’t clear, I would be rich.

    Not only do clear names improve the clarity of the code – but they can possibly further develop your ability to generalise and become more abstract – a skill I value highly.

    Working with programmers who aren’t usually English can be interesting though.

  • Anonymous

    Sorry but you FAIL. Method names should be meaningful, with your “little” improvements they aren’t anymore. Your first chunk of code is actually more elegant.