Effective exception handling

Thuktun

I work for a company that writes relatively expensive, enterprisey Java web applications. I happened to be looking at a bit of some production code we inherited from another organization and found one of the more pointless ways to handle exceptions in Java. Yes, the comments are real.

private static String methodName(String parameter) throws Exception
{
    [do some of the work]
try
{
    <i>[do some of the work]</i>

    try
    {
        <i>[do some of the work]</i>
    }
    catch (Exception e)
    {
        throw e;
        //don't do anything here
    }

    try
    {
        <i>[do the last of the work]</i>
    }
    catch (Exception e)
    {
        throw e;
        //don't do anything here
    }
}
catch (Exception e)
{
    throw e;
    //don't do anything here
}
return result;

}

The sad part is that I was almost happy to see this. This inherited code is not of our own creation and is littered with worse examples of exception handling. These usually involve catching and effectively ignoring exceptions, allowing them to turn into null pointer exceptions or something else downstream, which will in turn be ignored and cause other exceptions, and so on.

VGR

This is what happens when you send a C programmer to do a Java programmer's job.

Mikademus

@VGR said:

This is what happens when you send a C programmer to do a Java programmer's job.

Humor parsing error.

qbolec

@Mikademus said:

@VGR said:
This is what happens when you send a C programmer to do a Java programmer's job.

Humor parsing error.

That's what happens if you send a Java programmer to write a joke.

duckie

and try to make a C programmer read it

Mikademus

@qbolec said:

@Mikademus said:
@VGR said:
This is what happens when you send a C programmer to do a Java programmer's job.

Humor parsing error.

That's what happens if you send a Java programmer to write a joke.

Well, I'm sure that somewhere inside there there's a joke trying to get out, but it's weak, lacking limbs, and is in fact stillborn, so at best it's an undead joke, which would explain why it's so horrible and stinks.

CDarklock

Are you on crack? Rethrowing the exception is EXACTLY the right thing to do. The end result of this code is the same as:

[do some of the work]
[do some of the work]
[do some of the work]
[do the last of the work]
return result;

As soon as an exception gets thrown, it propagates upward in the call stack to be handled elsewhere.

The exception handling in the code you've provided is basically just assistance to the next developer. The thought process goes like this:

"I should write better exception handling, but we're behind schedule and I don't have time; since I know someone else will come along later to add it, I'll help the new guy out a little by stubbing out the areas where it belongs."

Ungrateful jerkass. Your lack of appreciation is the real WTF.

And here's a bonus WTF: since the code is littered with "worse" examples, couldn't you have chosen a worse one? Did you think we just couldn't handle a WTF that big? "Hey, I'll post one of these to TDWTF... but just a small one, I don't want people to get TOO much WTF out of it."

But that's a Java programmer for you: they simply don't live on the same planet with the rest of us. I'll go back to COBOL before I work with Java programmers again.

biziclop

Either you've ever seen proper exception handling or failed to read the code.

What's the point in catching and rethrowing exceptions? And if there is one (there isn't, unless you also do some logging or other things in the catch block), what's the point in doing it twice?

shadowman

It's like in baseball -- you want to throw it to the cutoff man first!

VGR

@Mikademus said:

@qbolec said:
@Mikademus said:
@VGR said:
This is what happens when you send a C programmer to do a Java programmer's job.

Humor parsing error.

That's what happens if you send a Java programmer to write a joke.

Well, I'm sure that somewhere inside there there's a joke trying to get out, but it's weak, lacking limbs, and is in fact stillborn, so at best it's an undead joke, which would explain why it's so horrible and stinks.

You were right the first time. It's a humor parsing error.

This error has occurred because, for some reason, you attempted to parse it as humor. It wasn't a joke. It was never meant to be a joke. It was a barely sardonic metaphor. I didn't think it was very elusive, but evidently it led you to try to look for a joke where there was none.

So I shall now reword it in a manner that hopefully less resembles a joke to you:

This is typical output of someone who is new to Java.

Thuktun

@CDarklock said:

Are you on crack? Rethrowing the exception is EXACTLY the right thing to do. The end result of this code is the same as:

[do some of the work]
[do some of the work]
[do some of the work]
[do the last of the work]
return result;

As soon as an exception gets thrown, it propagates upward in the call stack to be handled elsewhere.

You've just said that the end result of the code is as though none of those try/catch blocks were present. What's the point, then? That's exactly the WTF.

In maintainable, production systems, you do not toss in code that has no purpose and no effect. It increases the size of the executable and adds more paths that need to be tested. A comment would be just as significant and you don't have to worry about it introducing a bug.

And here's a bonus WTF: since the code is littered with "worse" examples, couldn't you have chosen a worse one? Did you think we just couldn't handle a WTF that big? "Hey, I'll post one of these to TDWTF... but just a small one, I don't want people to get TOO much WTF out of it."

I said worse exception handling code, not worse WTFs. Surely even you can perceive the difference.

Mikademus

@VGR said:

@Mikademus said:
@qbolec said:
@Mikademus said:
@VGR said:
This is what happens when you send a C programmer to do a Java programmer's job.

Humor parsing error.

That's what happens if you send a Java programmer to write a joke.

Well, I'm sure that somewhere inside there there's a joke trying to get out, but it's weak, lacking limbs, and is in fact stillborn, so at best it's an undead joke, which would explain why it's so horrible and stinks.

You were right the first time. It's a humor parsing error.

This error has occurred because, for some reason, you attempted to parse it as humor. It wasn't a joke. It was never meant to be a joke. It was a barely sardonic metaphor. I didn't think it was very elusive, but evidently it led you to try to look for a joke where there was none.

So I shall now reword it in a manner that hopefully less resembles a joke to you:

This is typical output of someone who is new to Java.

Well, then simply say so, my language biased friend!

RayS

@qbolec said:

ou send a Java programmer to write a joke.

I thought that writing jokes was the entire PURPOSE of a java programmer?

nonDev

@CDarklock said:

Rethrowing the exception is EXACTLY the right thing to do.

Did you mean "EXACTLY the wrong thing to do"?

danielpitts

@nonDev said:

@CDarklock said:
Rethrowing the exception is EXACTLY the right thing to do.

Did you mean "EXACTLY the wrong thing to do"?

Does ANYBODY get sarcasm any more?

Thuktun

@danielpitts said:

Does ANYBODY get sarcasm any more?

If that's sarcasm, it must be the new Extreme version:
@CDarklock said:

Are you on crack? Rethrowing the exception is EXACTLY the right thing to do. [...] Ungrateful jerkass. Your lack of appreciation is the real WTF.

biziclop

@danielpitts said:

@nonDev said:
@CDarklock said:
Rethrowing the exception is EXACTLY the right thing to do.

Did you mean "EXACTLY the wrong thing to do"?

Does ANYBODY get sarcasm any more?

Not this enterprisey type.

Mikademus

I'm an earlier version and can't upgrade to Sarcasm2.0

wing

Ya know, I was worried when I saw someone post that this was not a WTF.

I'm just a lowly PHP developper, let's leave aside the fact that we (in
my company) have a structured Object Oriented MVC framework.

I program in PHP, I have seen the level of contempt this forum has for
PHP developpers. However: when I saw those try catch blocks, I thought to myself
"Those are a bit pointless. Though perhaps they're filler space for the real programmer to come along and finish." This doesn't fulfil the code sample's comments though. The programmer explicitly says "do nothing here".

MntlChaos

@Graham said:

This is actually REALLY bad practise,

for one good reason, when you throw an exception, the stack trace, etc are populated at the throw. So re-throwing the exception will overwrite the stack trace.

If i remeber correctly, 'throw;' simply rethrows the same exception... Not 100% sure, been a long time since I had to deal with java (thankfully)

Fancy debugging that :-)
I'm getting a null reference exception at line 'throw e;' !!! noooooo!!!

I admit to doing this myself (rethrowing an exception) but I learnt the hard way - in this case I was waiting on a thread and collecting any exceptions it threw, then I would rethrow them on the creation thread. Ops.

Actually, this is wrong, rethrowing an exception does not reset the stack trace. Throwing a new exception does.

[code]public class Test
{
static int x = 0;
public static void main(String[] args) throws Exception
{
x++;

try
{
if(x > 5) throw new Exception();
main(args);
}
catch(Exception e)
{
e.printStackTrace();
throw e;
}
}
}[/code]

CDarklock

@Thuktun said:

You've just said that the end result of the code is as though none of those try/catch blocks were present. What's the point, then? That's exactly the WTF.

The point is that you know there should be exception handling here, here, and here. If you didn't, you would need to figure that out yourself. Granted, it would only take five minutes or so, but five minutes here and there starts to add up.

In maintainable, production systems, you do not toss in code that has no purpose and no effect.

The PURPOSE is to illustrate where you should put your exception handling.

The intended effect is that you will think "how nice that the developer put this here for me", perceiving from the comment that the do-nothing handler is not an oversight, noting from the code itself that it serves exclusively as a placeholder, and observing from its presence that the developer clearly knew what exceptions were and how to use them.

The actual effect is that you thought "what an idiot, to write exception handlers that do nothing!" and held up the code for public ridicule. After all, no good deed should ever go unpunished.

My triage recommendation is "Fix", preferably by removing the incompatible developer. You know, the one who isn't a team player and doesn't respect or appreciate the contributions of others. So what if the code only saves you a few minutes? It saved you something, you ungrateful turd. Or did you have to waste a bunch of time looking up what "throw" does without an argument?

I said worse exception handling code, not worse WTFs. Surely even you can perceive the difference.

Since this code is perfectly reasonable, there's no WTF here at all. Therefore, either there *is* no WTF in the program, or one of the "worse" examples is a WTF which would by definition be a worse WTF than no WTF at all.

nonDev

@CDarklock said:

The PURPOSE is to illustrate where you should put your exception handling.

So WTF code is not WTF if the PURPOSE of the WTF section is to illustrate where non WTF code should go?
If you wanted to make a TODO, you could use comments - that is the PURPOSE of comments. If you wanted to comment on the fact there is no need to catch exceptions you could again use (wait for it!) comments.

I vote WTF.

Thuktun

@CDarklock said:

The PURPOSE is to illustrate where you should put your exception handling.

Until you know what exceptions you're going to handle and how to handle them, how in the world can you reliably know where to put the exception handling code? If a good developer knew that, and he wanted to leave something for future developers, he's put COMMENTS into the exception block. Better, he'd just put in comments and leave out the stubby code.

Curiously, the only comments present appear to indicate otherwise.

The intended effect is that you will think "how nice that the developer put this here for me", perceiving from the comment that the do-nothing handler is not an oversight, noting from the code itself that it serves exclusively as a placeholder, and observing from its presence that the developer clearly knew what exceptions were and how to use them.

And if that were the case, the developer would have left some comments to that effect, not "do nothing here".

Your assertion that this was deliberate is interesting. The code's
comments do not indicate this. I wonder if I've chanced upon the code's author.

My triage recommendation is "Fix", preferably by removing the incompatible developer. You know, the one who isn't a team player and doesn't respect or appreciate the contributions of others.

Right, like knowing that exceptions aren't being properly handled and knowing that someone will need to deal with that in the future, but not tagging the code with a "TODO" or "BUGBUG" or any other comments that might indicate that knowledge. Teamwork involves two-way communication, not one-way mind-reading.

Occam's Razor might be invoked here to determine which is more likely:

That someone would have intended future developers to add appropriate exception handling code, but thought that those future developers would be able to divine what needed to be done when they read "do nothing here". (Or perhaps it was a huge typo. Those keys are all right next to each other, after all.)
That someone understood exceptions only enough to be dangerous and mucked up the code.

CDarklock

@Thuktun said:

Until you know what exceptions you're going to handle and how to handle them, how in the world can you reliably know where to put the exception handling code?

By knowing what constitutes a process transaction. Enumerate communications ports, choose a suitable port, initialise it, and open the port - that's a single process. If any part of it throws an exception, the whole thing is ruined. So it goes in a try() block with a corresponding catch().

Likewise, if you're retrieving data from a remote database: create a connection, select the desired database, send the SQL, retrieve the result. Failure at any step means no result. These two processes in turn may form parts of a larger process at a higher level, which has the potential to throw its own exceptions.

And that is how you identify where exception handling belongs. You should have known that. Particularly if you're going to criticise someone else's exception handling code: before you open your mouth to ridicule someone, you'd better make DAMN sure you understand the subject yourself.

Your assertion that this was deliberate is interesting. The code's comments do not indicate this.

Nobody accidentally puts in try/catch blocks in their code because they don't know any better. The developer clearly understood what exception handling was and how it worked. He implemented handlers which did not affect the code's execution, commented to the effect that they did not do anything, and now here you come laughing about what an idiot he was.

Why? He didn't hurt anything. At worst, he has a few dozen lines of code that do nothing and are in fact commented to indicate that they do nothing. Any competent compiler will optimise them out, so they don't even take up CPU cycles. Exactly what harm has been done?

Occam's Razor might be invoked here to determine which is more likely

That's true.

1. Your predecessor was an idiot who doesn't understand exceptions, but just *happened* to write exception handlers that have zero impact through sheer dumb luck.

2. YOU are an idiot who doesn't understand exceptions, and think this zero-impact scenario "mucked up the code".

Evidence? For one, it's "Ockham", after William of Ockham. But you already knew which side I was on.

nonDev

@CDarklock said:

So it goes in a try() block with a corresponding catch().

No it doesn't. You either catch and handle exceptions or you don't. Is your code filled with calls to DoNothing() methods? Why not put an if(1=1) now and then. You know - just to show where the process might be split in the future. If I may borrow your words:

@CDarklock said:

2. YOU are an idiot who doesn't understand exceptions...

Mikademus

Before this Silly Internet Fight continues and escalates --entertaining as it might be to watch-- until where I may have to invoke Godwin's Law on ye, might I be as bold as to quote some of the original post?:
@Thuktun said:

This inherited code is not of our own creation and is littered with
worse examples of exception handling. These usually involve catching
and effectively ignoring exceptions, allowing them to turn into null
pointer exceptions or something else downstream, which will in turn be
ignored and cause other exceptions, and so on.

Given that the code snippet was, according to the author, the best of the bunch, and exception handling anderswere gave rise to terminal failures, isn't it reasonable to assume that the provided code too was in fact asinine rather than divine?

CDarklock

@nonDev said:

You either catch and handle exceptions or you don't.

And he does! He has caught an exception, and he handles it by passing it up the chain. But if he ever decides he has to do something else, the infrastructure is there. And if someone else needs to do something, the infrastructure is there.

Is your code filled with calls to DoNothing() methods?

No, but it has exception handling in places where exceptions aren't normally possible.

Why? Because things change. When someone like, say, YOU gets told to add support for some fancy new feature RIGHT NOW... well, you might be in a hurry. You might not immediately think about exception handling. But since the catch() is right there, well, maybe you'll toss in a quick handler that does something vaguely meaningful. It will be better than nothing, and it may save you a lot of headache down the road.

If you don't think that will ever happen, you cannot possibly have been in this business for very long, unless of course you have just never worked on anything of import.

CDarklock

@Mikademus said:

isn't it reasonable to assume that the provided code too was in fact asinine rather than divine?

Given that the person who wrote it has done no harm and provided a potentially helpful service to future maintainers of the code, I prefer to view it in a more charitable light. Largely because I do the exact same thing, and I know for a fact that it ultimately has value.

Of course, I'm just some random dork on the internet... but so is the original poster. So you might want to consider who's saying what here.

Mikademus

@CDarklock said:

Given that the person who wrote it has done no harm and provided a potentially helpful service to future maintainers of the code, I prefer to view it in a more charitable light. Largely because I do the exact same thing, and I know for a fact that it ultimately has value.

Well, unknown as you might be, by posting here I will assume you have at least acceptable coding skills. As such I expect you to not let exceptions slip unhandled, and especially not let system-terminal exceptions, like null pointers, slip through. According to the orginal poster this was exactly what the author of the supplied code did, though, apparently several times to boot. Given the veraity of that claim I would be inclined to assume that the coder is struggling with the very fundaments of exceptions, and that, as such, he's not trying to be helpful to his successors as much as trying to get his head around the structuring of exception handling in the first place. Then again, he seems like a n3wblar, and newbie code is seldom pretty, but we've all been there, so I'm not going to call down flames on the poor sod. Unless he fails to develop from said n3wblar state, that is.

Thuktun

@Mikademus said:

Given that the code snippet was, according to the author, the best of the bunch, and exception handling anderswere gave rise to terminal failures, isn't it reasonable to assume that the provided code too was in fact asinine rather than divine?

How about some other fun examples, less WTF-worthy but indicative of the general quality of the code?

Like trying to find a child DOM Element node without resorting to one of those pesky "if" statements all the kids are into:

NodeList nodeList = parentNode.getChildNodes ();
int numNodes = nodeList.getLength ();

for (int i = 0; i < numNodes; i++)
{
    try
    {
        result = (Element) nodeList.item (i);
        break;
    }
    catch (ClassCastException ex) { /* NoOp / }
    catch (NullPointerException ex) { / NoOp */ }
}

Or finding all sub-elements in a DOM subtree matching a given name, then re-finding them once again for each match:

int itemCount = rootElement.getElementsByTagName(itemPrefix).getLength();
for (int idx = 0; idx < itemCount; idx++)
{
    Element currentItem = (Element) rootElement.getElementsByTagName(itemPrefix).item(idx);
    //...
}

Or the bit where the
servlet handling code placed all of its transient request context data into the session
object, where multiple requests against the same session would trample
each other's data, crashing both of them. This meant that you could not, by any means, click on the same link twice before the first click finished processing, or you'd end up with a page broken in some random way.

Or the thousands of instances where a method can't effectively deal with an exception, yet it handles it anyway, logs it, then returns from the method. The caller sees the invoked method saying, "Yup, I'm good," when in fact, something is very wrong. This very often interacted with exceptions in the servlet by causing processing to continue limping along, not unlike a zombie that doesn't know it's dead, until the servlet returns some simple HTML that display a white page with the word "null" as the lone contents.

This was indeed left for future developers to address, but not in a good way.

I also really wish this forum software had a PLONK button.

CDarklock

@Mikademus said:

As such I expect you to not let exceptions slip unhandled, and especially not let system-terminal exceptions, like null pointers, slip through. According to the orginal poster this was exactly what the author of the supplied code did, though, apparently several times to boot.

It's questionable whether the original poster fully understood what he was seeing. Here's a VERY VERY BAD exception handler:

catch() { }

That's the kind of thing that turns into null pointer exceptions and God only knows what else later, although it is occasionally useful. Sometimes you rewrite parts of a legacy system. Sometimes that legacy system says "this function returns TRUE on success and FALSE on failure". If you initialise the return value to FALSE and then simply ignore all exceptions, the function returns FALSE and meets the specification. However, if you throw an exception, you break a whole stack of related code that has never expected any exceptions from this code.

Personally, I don't write that kind of exception handler because I think only specific and expected exceptions should be ignored. So here's a slightly less bad exception handler, which I WOULD use:

catch(SpecificVarietyOfException e) { }

That would be much better if it had a comment - one such handler I wrote is commented with "// leave system in default-failure state" - but it's certainly useful. I have one piece of code that specifically expects over a dozen different varieties of exception, and that makes a big ladder of catch(). I don't like the way that looks, and it would be much easier to just write one empty catch(), but I don't do that because the exception may NOT be one I expected *here*... yet still be expected farther upstream. I could give an example, but it would be rather involved.

But to get back to the point, it is something of a subtlety to understand why this next exception handler IS NOT bad:

catch() { throw; }

It only *slightly* differs from the original very very bad exception, and it certainly doesn't have any positive effect. But "not good" isn't the same as "bad". That's the fallacy of the undistributed middle. (Look it up.) And while it is certainly possible that someone might write this code because he doesn't fully understand exceptions, he does at the very least understand enough not to hurt anything.

It's possible that the previous author was honestly trying to help future maintainers. And yes, it's possible that he was just not quite sure how to write exception handling. But he did, at least, avoid doing any harm. So I'll give him the benefit of the doubt; I would rather give credit where it is not deserved than assign blame where it is not deserved. And I find it highly offensive that the original poster and several commenters actually PREFER to assign blame.

It bothers me every time I come in here and see people laughing not at the code, but at the person who wrote that code. As you quite rightly say, we ALL start out writing ugly and sometimes stupid code. If at all possible, you leave the PERSON out of the equation. The code should speak for itself. There are certainly times that the person really IS the big WTF about a scenario - the guy writing MFC code for UNIX, for example - but those cases are the exception to the rule.

I frequently have to write code which isn't the best, because I only have so much time. Sometimes I have to write a new report generator in twenty minutes, and the fastest way to do that is to copy the old report generator and pull the fastest possible hack job on it. I'd like to go back and spend a couple days fixing it later, but sometimes I never get that couple of days. Anyone out there with a real job writing production code on a tight schedule knows how that happens. And I will always be the first one to point at the hack job I did, shake my head, and say "that was terrible".

And it has always been my firm belief that only the young and the stupid blame the code on the coder. If you just have *ambition* enough to sit down and try to write code in the first place, you deserve a certain amount of respect. Writing bad code doesn't make you a bad coder. It may seem counterintuitive, but there it is.

CDarklock

@Thuktun said:

trying to find a child DOM Element node without resorting to one of those pesky "if" statements all the kids are into:

This implementation is actually sort of elegant, and I'm forced to wonder if it has advantages. It looks like it could be a performance optimisation.

But it most definitely does NOT look like someone who doesn't know what he is doing!

That's the thing that really keeps hitting me on this one: the code just doesn't look like a new guy who doesn't know his job, it looks more like a competent professional with limited time.

Or finding all sub-elements in a DOM subtree matching a given name, then re-finding them once again for each match:

This one needs some infrastructure knowledge, which I do not have because I hardly ever use Java (I'm a PHP and ASP guy). My question is this: does getElementsByTagName() cache the results?

My intuition says yes, with the cache regenerated whenever a new prefix is queried. In that case, this would be a perfectly adequate solution, because the second call to getElementsByTagName() is just indexing into the cache without wasting time or memory on making a copy.

Again, this really doesn't look like someone who doesn't know what he's doing.

Or the bit where the servlet handling code placed all of its transient request context data into the session object

That doesn't necessarily have to be bad, and without a better indication of how this changing data "breaks" the servlets (doesn't the second one get exactly what it expected?), I can't really pass judgment on it. But judging from the three examples I've seen so far, I think it's *probable* that you just don't understand something.

Or the thousands of instances where a method can't effectively deal with an exception, yet it handles it anyway, logs it, then returns from the method. The caller sees the invoked method saying, "Yup, I'm good," when in fact, something is very wrong.

Thousands? Is that literal, or superlative? I can imagine this being something in-progress... but not if there are literally thousands.

I also really wish this forum software had a PLONK button.

Ignoring those who disagree with you is a great way to learn nothing. It would be a lot easier for me to just think "you're an idiot" and never read or reply in this thread again, but I wouldn't learn anything from that, would I?

@CDarklock said:

That's the thing that really keeps hitting me on this one: the code just doesn't look like a new guy who doesn't know his job, it looks more like a competent professional with limited time.

What I find curious, though, is the comment "// do nothing here" AFTER each and every "throw e;" line. It's almost like the guy figured out five minutes ago that any code he kept trying to add AFTER a "throw e;" statement (and by that I mean inside the same catch() block, of course) somehow never gets executed...

(Or else he might know several co-workers who are not aware of that, in which case it might indeed have been a service to the next guy touching that code, as has been suggested in this thread.)

@CDarklock said:

My question is this: does getElementsByTagName() cache the results?

My intuition says yes, with the cache regenerated whenever a new prefix is queried. In that case, this would be a perfectly adequate solution, because the second call to getElementsByTagName() is just indexing into the cache without wasting time or memory on making a copy.

Yes, but... if a function, such as getElementsByTagName(), returns an array, why don't you just assign its result to some variable, after which you can index into the array later as often as you like (or need to)? In that way, you don't even have to think about whether or not the function you call might somehow internally cache the results before returning them -- remember the usual advice about not relying on internal implementation?

By the way, is it common practice in Java to write functions that internally cache their results? If so, what would be the point? If the function returns a reference to the entire array of results it found, you just need to store the returned reference in a variable, and all those results are safe from garbage collection. So caching the results internally would just seem to amount to making an unnecessary copy.

Thuktun

@CDarklock said:

@Thuktun said:
trying to find a child DOM Element node without resorting to one of those pesky "if" statements all the kids are into:

This implementation is actually sort of elegant, and I'm forced to wonder if it has advantages. It looks like it could be a performance optimisation.

Justify this statement, please. The values being compared in an if statement would already be compared in the code throwing the exception, so those comparisons have to be done anyway. Doing them twice, though, would rationally seem to be faster than doing them once and throwing an exception if either fails.

A quick test on my end seems to support this, showing that catching the exceptions and ignoring is two orders of magnitude (meaning at least 100 times) slower than pre-testing those conditions. That is, around 150 milliseconds to perform a million null and instanceof checks and around 16 seconds to perform a million ignored exceptions.

It looks like you're talking out of your ass, so I'm curious where you get this odd notion.

Thuktun

@CDarklock said:

This one needs some infrastructure knowledge, which I do not have because I hardly ever use Java (I'm a PHP and ASP guy). My question is this: does getElementsByTagName() cache the results?

My intuition says yes, with the cache regenerated whenever a new prefix is queried. In that case, this would be a perfectly adequate solution, because the second call to getElementsByTagName() is just indexing into the cache without wasting time or memory on making a copy.

It's not part of the documentation for that method, which means that anything that does cache these results would need to do so transparently for the caller. You certainly couldn't guarantee that this would occur, so you have to assume the worst. "Intuition" isn't how you code to an interface.

Also, a good programmer saves the results of method calls, since (in real languages) they could result in high-latency operations like I/O and remote invocations. Calling something multiple times when you can save the result in a variable and reuse it is always a good idea when the method doesn't involve side effects, since you don't know what the component is doing to fill your request.

In this case, many XML implementations will search the DOM tree each time you call that method. The current version of Xerces appears to use a NodeList derivative that does just-in-time searching for the element you request, but this means that you'll still end up searching increasingly more of the tree, iteratively. This implies an O(n^2) operation rather than an O(n) operation, which is NEVER something to dismiss by hand-waving unless it's non-production or involves a trivial amount of data.

That doesn't necessarily have to be bad, and without a better indication of how this changing data "breaks" the servlets (doesn't the second one get exactly what it expected?), I can't really pass judgment on it. But judging from the three examples I've seen so far, I think it's *probable* that you just don't understand something.

Do you understand why a multithreaded or multiprocess system does not store data in the same locations without some sort of synchronization? Hint, it involves data corruption.

This isn't even for the purposes of communicating between the servlet threads, this is for *temporary* data constructing web pages for each servlet request. It's pointless to store this in a shared, unpartitioned space that's always used by all threads, since it's not something you want to share with other threads and it's going to get corrupted.

Thousands? Is that literal, or superlative? I can imagine this being something in-progress... but not if there are literally thousands.

I'm not going to hand-count through 1.5 million lines of code for you, but on the order of a thousand instances of this is my estimate.

Ignoring those who disagree with you is a great way to learn nothing. It would be a lot easier for me to just think "you're an idiot" and never read or reply in this thread again, but I wouldn't learn anything from that, would I?

You've already admitted you don't develop in Java, yet you're making sweeping, authoritative statements about things you don't know well. I seriously doubt I'm going to learn anything from you, and you certainly haven't listened to any of the arguments against what you're saying, from multiple individuals. Filtering out the willfully clueless just reduces stress in the long run, and automated filtering prevents being goaded (one might even say trolled) into replying, like I obviously have.

Thuktun

@Phil the ruler of heck said:

By the way, is it common practice in Java to write functions that internally cache their results? If so, what would be the point? If the function returns a reference to the entire array of results it found, you just need to store the returned reference in a variable, and all those results are safe from garbage collection. So caching the results internally would just seem to amount to making an unnecessary copy.

Indeed, and it would also have to know when not to cache. If the tree changes, it needs to know that so the cache can be flushed. That's a lot of work just to support someone calling that method mutliple times to get the same information repeatedly and is, as you said, pointless.

CDarklock

@Phil the ruler of heck said:

It's almost like the guy figured out five minutes ago that any code he kept trying to add AFTER a "throw e;" statement (and by that I mean inside the same catch() block, of course) somehow never gets executed...

I never considered that. If that's what he felt needed to be commented... well, yeah, that's evidence of idiocy somewhere.

There's a cultural difference here; I would consider the comment to be perfectly equivalent whether it occurred before the throw (where I would put it), on the same line (where many people would put it), or after the throw (where it was placed in this example). However, that might betray a completely different thought process that I just don't quite "get" because I'm not familiar with how idiot programmers think. (I've certainly worked with a few, but I've been their boss, so I could fire them without learning their culture.)

So it may not be fair to say that this code IS an effort to help simply because that is POSSIBLE... but it's just as unfair to say that it ISN'T an effort to help because that is IMPROBABLE.

if a function, such as getElementsByTagName(), returns an array, why don't you just assign its result to some variable, after which you can index into the array later as often as you like (or need to)?

That's a good question. One potential answer is that you may not want to hold a reference to the results until the variable in question goes out of scope and is garbage collected; GC is notorious for nondeterministic memory reclamation processes.

Another answer is that you want your loop to screw up if the document's altered in mid-processing. If a node is added to the document during the loop, you never process the last node in the document. (I don't claim the developer who wrote this code was perfect, just that he was making a reasonable effort.) Why it's always the last node regardless of where the new node was added is left as an exercise for the reader.

By the way, is it common practice in Java to write functions that internally cache their results?

No, but in document processing it's extremely common to generate the same list repeatedly. Between call A and call B, the document *might* have changed - so the developer has to call the method again. However, if the document *hasn't* changed, generating the list from scratch can be very time-consuming. It's therefore common for processing engines to cache with varying complexity for this purpose; when each search for the nodes takes forty seconds, and you need to do it twenty times, but the chance of a document modification is something like 0.00001%... well, caching that result crams a 13 minute process into less than one minute if nothing changes, and just over one minute if only one change happens.

CDarklock

@Thuktun said:

Justify this statement, please.

Sometimes, if you do weird things, it's faster. The only way to be sure is to benchmark the actual code you're running under the actual conditions it is expected to encounter. (Never profile code on a virtual machine, for example. No, you can't compensate by creating a skew factor equation using the 25% figure advertised in the VM software's marketing materials. Go set up a real physical machine and profile it again.)

The values being compared in an if statement would already be compared in the code throwing the exception, so those comparisons have to be done anyway. Doing them twice, though, would rationally seem to be faster than doing them once and throwing an exception if either fails.

That's true. Now let's ask about success. If your tests *succeed*, is it faster to do them ONCE, or TWICE?

It looks like you're talking out of your ass, so I'm curious where you get this odd notion.

If you expect both tests to succeed 99.5% of the time, it's okay for your failure case to take 100 times as long... *if* your success cases take half as much time.

Just do the math. Assume you have failure code that takes 10 ms, and success code that takes 10 ms. This takes ten seconds to run a thousand times: 995 times 10 ms, and 5 times 10 ms. But if you have failure code that takes 1000 ms and success code that takes 5 ms, the same thousand runs takes 9.975 seconds (5 times 1000 ms = 5 seconds, 995 times 5 ms = 4.975 seconds).

If 1/40 of a second isn't worth the corresponding hit in code readability, a position which I would probably argue myself, then yank it out and rewrite it. But you need to have a clear picture of what the impact really is before you do that. So you need to go get your success and failure timings for both approaches, and then figure out roughly what percentage of real-world situations will use each timing. It looks to me like the code you've got here is trying to handle a case where the document is modified in mid-process (but I could be wrong); that will happen every how many documents? Two hundred? Five hundred? A thousand? A million? Make your best estimate, then do the math yourself.

And always, always, ALWAYS optimise for normal conditions - not for exceptional ones.

No pun intended.

CDarklock

@Thuktun said:

You certainly couldn't guarantee that this would occur, so you have to assume the worst.

No, you don't. You can KNOW. You can look at the implementation YOU use, and know for sure what it does, and take advantage of it.

This is a tradeoff, because it ties you to one implementation. Whether that is acceptable depends on a lot of factors.

Also, a good programmer saves the results of method calls

I would argue that a good programmer saves nothing he doesn't need. In general, if you don't know whether you'll need something, saving it is prudent. But a good programmer ought to *know* whether he needs it, and therefore whether to save it.

Do you understand why a multithreaded or multiprocess system does not store data in the same locations without some sort of synchronization? Hint, it involves data corruption.

Really?! Wow. And here I thought all those sync mechanisms were just cool accessories you added to your code, like piercings and tattoos, so other developers could see what a manly stud you were.

"You used a monitor?! Wow! That's like, *twice* as hard as a semaphore. And I'll bet that line of code is eight inches long!"

Hey, if we're going to be sarcastic, let's raise the bar.

I'm not going to hand-count through 1.5 million lines of code for you, but on the order of a thousand instances of this is my estimate.

So... you don't know how to use grep?

You've already admitted you don't develop in Java, yet you're making sweeping, authoritative statements about things you don't know well.

Hardly ever using Java isn't the same as not using it at all, and my use of Java is largely irrelevant to the fundamental principles of real-world software development. It is possible that your predecessor had real reasons to do things the way he did them. I've seen three examples of things *you* consider evidence of unbridled stupidity, and I can see a potential justification for each of them.

Whether you agree with the decisions or conditions that caused those examples is irrelevant. What is relevant, however, is that you couldn't see any justification at all for them. Your failure to account for these possibilities is a very serious flaw in your attitude toward the code you are expected to fix.

I seriously doubt I'm going to learn anything from you, and you certainly haven't listened to any of the arguments against what you're saying

Actually, I have. Earlier today, I outright admitted that someone else in this thread brought up a perspective I hadn't considered. I learned something from that, and I have no problem saying that out loud.

Furthermore, in my last comment, I addressed a serious oversight in your benchmarking methodology: you failed to account for the success case. If you don't learn from that, you have some pretty serious problems that aren't *in* the code.

And since this is a public forum, OTHER people might ALSO learn things from this discussion. Your refusal to learn does not absolve me of the responsibility to teach.

Thuktun

@CDarklock said:

If you expect both tests to succeed 99.5% of the time, it's okay for your failure case to take 100 times as long... if your success cases take half as much time.

That's why an exception is called an exception, it's something that should only occur when there's an exception to normal processing. That is, something outside the normal scope of processing. In the case in question, it was relying on the exception for normal processing. Bad mojo.

In this case, you're very likely to encounter something other than an Element as a child of an Element, namely Attribute and Text nodes. Rephrased, seeing something other than an Element as a chlid is not an exception, it's common.

I've just built a test harness that traverses all the Elements of a DOM tree from a reasonably-sized XML document (24 KB) with a number of "value" Elements with only text nodes and a number of "aggregate" Elements with only child Elements. Repeating this traversal 1000 times only took a fraction of a second using an if-guard, about a fifth of a second. Using the exception-handling method, it took 26 seconds. Again, two orders of magnitude slower. (I could post the code if someone cares, but I doubt anyone else cares about this discussion and I somehow don't think you're interested in facts.)

Your intuition appears to be deceiving you, young padawan.

If 1/40 of a second isn't worth the corresponding hit in code readability, a position which I would probably argue myself, then yank it out and rewrite it. But you need to have a clear picture of what the impact really is before you do that. So you need to go get your success and failure timings for both approaches, and then figure out roughly what percentage of real-world situations will use each timing.

You have it ass-backwards. Use if-guards unless you can demonstrate a burning need to do otherwise. Using an exception only makes sense if you need to backtrack through a large number of method calls quickly, rather than returning an error code at each level. Curiously, that's what they were designed for. You don't use it to replace an if-statement around a single method call.

How many years have you developed code that used exceptions? PHP has only had them recently, likewise with ASP[.net]. Your understanding of them seems flawed.

CDarklock

@Thuktun said:

Indeed, and it would also have to know when *not* to cache. If the tree changes, it needs to know that so the cache can be flushed. That's a lot of work

A basic caching system in a document object is a lot of work?

WTF?

Thuktun

Each time I bring up the flaws in your argumentation, you change the topic or ignore what I said. This time, you even started raving about being me being a "manly stud". I give up.

You're unreachable, so the GC should be calling soon.

CDarklock

@Thuktun said:

That's why an exception is called an _exception_, it's something that should only occur when there's an exception to normal processing.

That's what I said. What's the problem?

In this case, you're very likely to encounter something other than an Element as a child of an Element, namely Attribute and Text nodes.

That depends on your DTD, which you did not provide or describe. There do exist elements which, by definition, simply do not have attributes or text nodes. In those cases, anything other than an element as a child would in fact be an exceptional occurrence. Which is exactly why I *asked* how frequently these exceptions would get thrown in *normal* operations. If you throw any random XML document into this code, sure, it's a WTF. But what is the code designed to process and when?

You have it ass-backwards. Use if-guards unless you can demonstrate a burning need to do otherwise.

No. Use WHAT YOU HAVE unless you can demonstrate a need to do otherwise.

I'm still not entirely convinced this is a WTF, because I don't have the full context. The code only processes the first level of the heirarchy, not the entire tree. So consider an HTML document; if parentNode in your code example was the HTML element, it most definitely would be exceptional for it to have attributes or text nodes. (Technically the HTML element can have a lang or dir attribute, but that's still exceptional.) If you're processing a long list, whether a UL or an OL, that list should consist of LI elements and at most a couple attributes. The TR tag is similar, although more likely to have attributes. There are a *lot* of cases in basic HTML where it is, indeed, exceptional to encounter text nodes and attributes. The TR tag especially might be part of a massive table with hundreds of data cells in each row.

So the question comes down to the same thing: how many attribute or text nodes will the processed element have, in normal usage? How many element nodes will it have?

Are you learning anything yet? Perhaps that you are making assumptions? Perhaps that assumptions are Bad Things you should exterminate at every opportunity? Perhaps that you aren't saying everything you *should* be saying to communicate your point?

And when you come down to it, if you can process a million exceptions in 16 seconds - WHO CARES? 16 microseconds was a big deal on 3070 mainframes, but today you might as well be complaining about how somebody capitalised a function name.

Without performance benefits, however, the code is essentially in the project so other programmers will see it and go "ooh, that's clever"... which is something of a WTF. Not like I don't occasionally go "while(*s) i=i*0xA+*s++-060;" here and there, but that's much cooler. If you're going to stick WTF code in your project on purpose, it should at least be interesting.

How many years have you developed code that used exceptions?

I honestly don't know, and if I did you wouldn't believe me.

CDarklock

@Thuktun said:

Each time I bring up the flaws in your argumentation, you change the topic or ignore what I said.

Well, boo-frickin'-hoo. There are valid reasons to do everything you've shown us, and you can't explain why those reasons don't apply, so you're going to complain that I'm somehow unreasonable and refusing to listen to you.

Perhaps you should go study "argumentation" somewhere.

Oh, wait, that's not a word.

@CDarklock said:

16 microseconds was a big deal on 3070 mainframes, but today you might as well be complaining about how somebody capitalised a function name.

slower computers don't make microseconds more of a big deal. something that takes 16 microseconds on a turring machine takes the same amount of time as something that takes 16 microseconds on big blue.

Not like I don't occasionally go "while(s) i=i0xA+*s++-060;" here
and there, but that's much cooler.

that's not cool, that's stupid.

Mikademus

@tster said:

@CDarklock said:
Not like I don't occasionally go "while(s) i=i0xA+*s++-060;" here
and there, but that's much cooler.

that's not cool, that's stupid.

Nah, it looks like good code. A clearly deliminated while-loop that does it job while saving clutter. I do react against the magic numbers, but in context, they might be transparent enough. Point is, compound expressions are perfectly alright as long as they're used judiciously, If you're a Java programmer then you probably will think the above usage frightening, but on the other hand, newbies to Java find all those casts horrible. Script kiddies can turn any language into a quagmire, good programmers use the tools at their disposal in good ways.

@Mikademus said:

@tster said:
@CDarklock said:
Not like I don't occasionally go "while(s) i=i0xA+*s++-060;" here
and there, but that's much cooler.

that's not cool, that's stupid.

Nah, it looks like good code. A clearly deliminated while-loop that does it job while saving clutter. I do react against the magic numbers, but in context, they might be transparent enough. Point is, compound expressions are perfectly alright as long as they're used judiciously, If you're a Java programmer then you probably will think the above usage frightening, but on the other hand, newbies to Java find all those casts horrible. Script kiddies can turn any language into a quagmire, good programmers use the tools at their disposal in good ways.

if could at least be:

while(*s) {
(*s)++;
i = i * 10 + (*s) - 60;
}

at least that is readable.

CDarklock

@tster said:

at least that is readable.

Unfortunately, it's wrong, for three reasons.

1. I do not execute (*s)++. The postincrement operator binds more tightly than the pointer dereferencing operator, so I am actually executing *(s++).

2. I do not increment the pointer until after the calculation is done. You increment before, skipping the first round of calculations.

3. The numeric constant 060 is an *octal* constant equal to decimal 48. Virtually nobody remembers that.

Readability is not as important as correct operation. If you're trying to replace someone else's code, you should probably have a good idea of what it is doing first:

i=atoi(s);

Bonus points if you remember to record the side effect:

s+=strlen(s);

Of course, I *designed* this particular line of code to be difficult. It's the best balance I've ever found among brevity, utility, and incomprehensibility: short enough to type from memory, useful enough to put into almost any program, and very difficult for most people to figure out.

And if you already know your string is numeric, e.g. it comes out of a dialog's properly filtered text box, it outperforms atoi() by a substantial margin. Not that it really matters; both are sufficiently fast that they may as well take zero time.

WRT your observation that 16 microseconds is still 16 microseconds on a slower machine, you've somewhat missed the point. In the mainframe days, you were talking about 16 microseconds on the one and only computer in your organisation that cost millions of dollars a year to operate. Today, you're talking about 16 microseconds on one of several thousand computers that cost about as much to run as the halogen lamp in your living room. It's not that the machines are faster, it's that the machines are more readily available. The SETI @Home project would have been considered *insane* twenty years ago; the idea that computing resources might be sitting idle was a fear that actually kept MIS directors up nights.

Thuktun

@tster said:

while(s) i=i0xA+*s++-060;
if could at least be:

while(*s) {
(*s)++;
i = i * 10 + (*s) - 60;
}

at least that is readable.

Unfortunately, that isn't equivalent. The post-increment is now occurring prior to the spot where the pointed-to value is used rather than after. Also 060 is octal for decimal 48, ASCII for a '0', not a decimal 60.

The original code is pretty obviously converting a number in ASCII digits to its integral value. However, it doesn't deal well with anything that falls outside the intended range of inputs. Worse, it seems to be using hex and octal in the same expression just to obfuscate. It's converting decimal values, why use a hex value for the multiplier rather than the more natural decimal? It's converting characters to integers, why not just use the more natural character literal than its ASCII representation?

<font size="2">
int parseInt( String intString ) {
    int result = 0;
    for ( int i = 0, count = intString.length(); i < count; ++i ) {
         // convert each character to its decimal value
        int digit = intString.charAt( i ) - '0';
        if ( digit < 0 || digit > 10 )
            throw new InvalidArgumentException(
                "Not a decimal digit: " + intString.charAt( i ) );

         // shift the result left in decimal and add that digit
        result = result * 10 + digit;
    }
    return result;
}
</font>

Mikademus

@Thuktun said:

why not just use the more natural character literal than its ASCII representation?

<font size="2">
int parseInt( String intString ) {
    int result = 0;
    for ( int i = 0, count = intString.length(); i < count; ++i ) {
         // convert each character to its decimal value
        int digit = intString.charAt( i ) - '0';
        if ( digit < 0 || digit > 10 )
            throw new InvalidArgumentException(
                "Not a decimal digit: " + intString.charAt( i ) );

         // shift the result left in decimal and add that digit
        result = result * 10 + digit;
    }
    return result;
}
</font>

Well, CDarkLock already replied above, making this reply perhaps somewhat
redundant, but in your code example you are in fact doing exactly what the
original code sought to avoid (ignoring, of course, that the original code was
also designed to be confusing): you are incuring overheads. Even if the
function is expanded inline you will still have several stack operations from
setting up local variables not to talk about the implicit and explicit conditionals
and the throw. You gain readability and code stability, but in a limited environment
and a clearly defined situation, your code will execute at a mere fraction of the speed
of the original code. Also, your code example is Java (throw new), the original code is
C99 or C++.