Wednesday, December 19, 2012

Commenting Code is Teaching

You have named your classes, functions and variables in the most expressive and descriptive way possible. All your classes have a single responsibility, functions are short enough to fit onto a single screen (relative as that is) and variables are never reused. The code is formatted according to the standard and it’s aesthetically impeccable (at least according to you). When all this is done, why would you put any real effort into writing comments? Your version control system will reject a check-in if you don’t write comments on your public, internal or protected members, but that’s just for show, as you can always simply put “TODO: Comment” and the system will happily accept it. There, policy satisfied, and who cares: it must have been some clueless apparatchik who came up with it. Your code is so obvious that it doesn't need any comments.

The above description is a caricature, but there is more than a grain of truth in it. The code is obvious at the time of coding only because your mental model at that moment is clear and (after a lot of debugging) matches what the code is doing. What code lacks is a way to communicate the actual intent behind it, and whoever follows you in reading it will lack this crucial information. But more importantly, even if you had a way to perfectly communicate intent through code, what your code will never have, indeed cannot have, is what in the end wasn't coded. Coding is an art of not only what has been written but also what hasn't been written. We constantly decide what to put in and what to leave out. We write, delete and rewrite all the time, searching for the best solution given our constraints. Then the constraints change and we do it all over again. In the end we settle on one solution to the exclusion of all others, and we have the complete rationale for it. This rationale is one of the most important things we can teach others, and this teaching is best done through comments, alongside the code they talk about. But a rationale should explain not only what was left in and why, but also why everything else was excluded.
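To make this concrete, here is a minimal sketch of what such a rationale comment might look like. The function, its callers and the alternatives it rejects are all hypothetical, invented purely for illustration; the point is only that the comment records what was left out and why, which the code itself cannot show.

```python
def unique_in_order(items):
    """Return the distinct items, keeping first-seen order.

    Rationale (hypothetical scenario, for illustration only):
    - Kept: a linear scan with a 'seen' set, because the assumed callers
      show the result to users and rely on the original order.
    - Excluded: list(set(items)). Shorter, but a set does not promise
      to iterate in input order.
    - Excluded: sorted(set(items)). Deterministic, but it reorders the
      data and requires the items to be mutually comparable.
    - Known limit: items must be hashable; unhashable items raise
      TypeError. That was acceptable for the assumed callers.
    """
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
```

A maintainer tempted to "simplify" the loop into a one-liner now knows that the shorter version was considered and why it was rejected.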

Beman Dawes, the founder of boost.org, comments:

Failure to supply contemporaneous rationale for design decisions is a major defect in many software projects. Lack of accurate rationale causes issues to be revisited endlessly, causes maintenance bugs when a maintainer changes something without realizing it was done a certain way for some purpose, and shortens the useful lifetime of software.

Rationale is fairly easy to provide at the time decisions are made, but very hard to accurately recover even a short time later.

The tough part is that even when we know all this, our mental model of what the code does is so clear to us that we struggle to see what needs explaining. We have to step outside of ourselves, of our own thinking, and comment code for others (including our future selves). And so to comment is to put oneself in another’s place and try to teach others about what’s important and what’s not important, about what we decided to leave in and what we decided to leave out and, above all, why these decisions were made. This is why we should write comments in our code: not because of check-in policies, but because our job isn't only what has been coded but so much more, and we should be, if nothing else, honor-bound by our craft to try to teach those who follow us and make their work easier.

I treat all unclear, incomplete or (Turing forbid!) missing comments as defects during code reviews. And I expect to be called out on any lack of clarity and comprehensiveness in my own comments. It’s not perfect, as our coworkers are often steeped in the same issues and can thus understand more than a random future maintainer. But it’s the only way to ensure that the code and comments we write complete each other and that 10 years from now they will still be meaningful.