Okay, several people argue “against” me with my own arguments and I need to quote long pieces of text to maintain context, so this thread seems to have about run its course… anyway, here goes:
@dhromed said:
How can anyone in his/her right mind write a comment that does not summarize what the bit of code does?
Well, I’m coming back to this. Someone (dhromed or someone who agrees with him) explain to me how this doozy from Kasia’s weblog came about:
//select query formed. MEADOW _ID 2067 NEED TO CHANGE IT IF IT DOES NOT EXIST IN DATABASE.
selectQuery = "SELECT LOCATION FROM MEADOWS WHERE MEADOW_ID=" + meadow_id;
//Update Query formed .MEADOW_ID 2067 NEED TO CHANGE IT IF IT DOES NOT EXIST IN DATABASE.
updateQuery = "";
The second comment is obviously lying. How did the programmer manage to do that? Don’t ask me, but clearly, comments that have nothing to do with what the code next to them is doing do exist, in production code no less.
@Chris F said:
@Aristotle Pagaltzis said:
If something involved is going on, regardless of how simply you write it and whether you use an inherently expressive language, the code will always require concentration and careful reading to understand.
I don’t think your conclusion follows from your argument. I think it’s more likely that certain algorithms require concentration and careful reading to understand, and that it is the algorithm – not the code – that is hard to understand. To assert that all code is hard to read because some algorithms are involved is certainly an error. To assert that all code is involved and complex is also an error.
Point granted – it’s mainly the algorithm that’s hard to read.
But I don’t think that really negates what I’m saying. What I’m asserting is that expressing a complex algorithm in code inherently introduces a layer of concreteness that shrouds the intent of the code. I’m not sure I can explain this adequately. By way of an example, take an algorithm such as “go from home to the bus station” – for a computer, you have to express this as a series of concrete instructions, so you end up with something like “starting at the location home, while your location is not the bus station, determine all adjacent locations, select the one which is closer to the bus station, move there, and repeat.” The reader has to deduce the abstract intent from these concrete instructions. That’s what goes on when you read code: you need to deduce, by tracing execution from one concrete instruction to the next, what the code is really intended to do. That’s why code is hard to read – it conceals, by the very virtue of being instruction code, what its purpose is. More expressive languages help (I find well-written Perl very natural to read), but they too cannot excise this.
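To make the bus station example concrete, here is a rough Python sketch of those instructions. The map, the place names, and the coordinates are all made up for illustration, and the greedy rule is deliberately naive (on a map with dead ends it could loop forever). Cover up the docstring and the only way to learn that this means “go to the bus station” is to trace the loop step by step.

```python
from math import dist

# Hypothetical town map: location -> (coordinates, adjacent locations).
TOWN = {
    "home": ((0, 0), ["corner", "park"]),
    "park": ((0, 2), ["home"]),
    "corner": ((2, 0), ["home", "bus station"]),
    "bus station": ((4, 0), ["corner"]),
}


def walk(start, goal):
    """Go from start to goal by always stepping to the neighbour nearest the goal."""
    goal_xy = TOWN[goal][0]
    location = start
    path = [location]
    while location != goal:
        neighbours = TOWN[location][1]
        # Pick the adjacent location with the smallest straight-line distance to the goal.
        location = min(neighbours, key=lambda name: dist(TOWN[name][0], goal_xy))
        path.append(location)
    return path


route = walk("home", "bus station")
```

The abstract intent fits in one line of English; the loop body is four lines of concrete bookkeeping that the reader has to fold back into that one line.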
@Cyresse said:
@Aristotle Pagaltzis said:
A piece of code about 20× as long as your example that actually does something is about the minimum size of a unit of code you need to understand all at once in a typical program that does more than one simple thing. (And I don’t mean in Java – I mean in dense languages like Perl, Python, Ruby, or the likes.)
Dude. Do you code in Python? I can write the SMTP/IMAP handling section of an email client in 30 lines in Python, 5 of which are comments. My GUI runs to about 50 lines, including comments.
Okay, so the example I referred to was 8 LoC, and your mailer is a total of 80 lines, so my estimate was off by a factor of 2. Not all that bad, if you ask me, and the exact factor would depend on the kind of application.
Do you really need me to comment -
imapObj.login(user, pwd)
to tell you that imapObj is an IMAP connection object, and you’re logging in using a username and password? I find that kind of commenting either insulting or annoying.
Whereas, I will comment something like this -
data = imapObj.fetch(msg, 'RFC822.TEXT')[1][0][1]
Because it’s not immediately obvious that an imap object returns a tuple of
(status code, ((some junk, actual data), ')' )
(And no, I have no idea why it returns like that. I’ve been meaning to subclass imaplib.IMAP4 for awhile now.)
I’ll also comment what the hell RFC822.TEXT is as well.
Comments should clarify. If you can’t understand what a function does without every single line being commented, why are you reading code?
Uhm, did you read the thread? Those are my points exactly. :-) I’m the one who has spent all this time arguing that excessive commenting is harmful and that commenting in general should be done with extreme care.
For the record, data structures should be documented thoroughly. Thanks for reminding me of this subtlety: good code means investing a lot of effort in good data structures, so that the code has little work to do. If the data structures are exhaustively documented (not commented), the code will need relatively little in the way of clarification. Cf. Fred Brooks.
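As a small sketch of what I mean, take the fetch result from the example above. Assuming the response really has the shape Cyresse describes, a status plus a list whose first item is a (junk, data) pair, you can document that shape once and let the code read plainly. The FetchResponse class and its payload property are names I made up for this illustration; they are not part of imaplib, and the canned response stands in for a real server.

```python
from typing import NamedTuple


class FetchResponse(NamedTuple):
    """Documented shape of a fetch() result for a single message part.

    status: the server's status code, e.g. "OK"
    data:   a list whose first item is an (envelope, payload) pair,
            followed by a closing b")" item
    """

    status: str
    data: list

    @property
    def payload(self) -> bytes:
        """The actual message data, i.e. what [1][0][1] digs out of the raw tuple."""
        envelope_and_payload = self.data[0]
        return envelope_and_payload[1]


# Canned response in the assumed shape, so the sketch runs without a server.
raw = ("OK", [(b"1 (RFC822.TEXT {5}", b"hello"), b")"])
response = FetchResponse(*raw)
body = response.payload
```

With the structure documented in one place, response.payload needs no comment at the call site, whereas [1][0][1] needs one every time it appears.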
@Jonex said:
@Aristotle Pagaltzis said:
How large is the most complex program you’ve ever written?
Not very large. When I have a complex problem I’ll usually divide it into smaller, less complex pieces. That way I make the actual source easier to read, instead of solving the problem in a complex way and making the comments do the abstraction.
Wrong argument, really. Even with a broken-down codebase, you need to keep enough in your head at once to understand the interactions. And my assertion that a non-trivial unit of code comes to about 150-200 LoC even in a dense language is, I think, hard to deny.
Obviously not all code is easy to read. But my point is that good code should also be easy to read, otherwise it isn’t good code, no matter how many comments you put in it. (Sometimes you have to make exceptions though, for performance reasons for instance.)
I think we have a semantic confusion here. When I say “easy to read” I really mean “easy to understand”. With that in mind I defer to my response to Chris F, above, about this point.
Anyway, this is really all moot, because…
A good general rule is that when you feel the need to add comments, check whether you can make the code clearer by changing variable names, refactoring, or otherwise changing your code instead. That will not only make the code readable but probably more maintainable as well.
Well, like Cyresse, if you actually read the thread through from top to bottom, you would find that I have been arguing precisely this all the time.
Note, though, that I’m coming from a Python programming perspective. Python’s syntax is made with readability in mind, more so than many other languages. So I’m sure there can be a larger need for comments in less readable languages. (Assembler as the most obvious example.)
I’m coming at this from the Perl programmer perspective.
Yes, really. :-) The same syntactic freedom that is frequently abused for clever-looking or obfuscated code can also be used as an effective tool to improve the clarity of code. It just takes discipline. (And I resent tools that presume that I lack discipline; hence Perl, and not Python. If you really want to torture me, make me write Java. But Perl is not without flaws, I just like it better, warts and all. To each his own.)