Variable Failure

keelerm84

I have to make changes to some code my boss wrote. I figured I would share this little gem which I've found peppered throughout.

Not3 = "3"
strSQL = "SELECT * FROM Table WHERE column <> 'Not3'"

dtech

Well, I think its brillant. I mean, this way you only have to recreate the table with a different column ordering if you want your app to do something else!

I feel for you.

here's something to make you feel better:

IF @strBasedOn = 'SYSTEM'
    BEGIN
        IF @strBuildOrVersion = 'BUILD'
            BEGIN
                IF @strUcode = ''
                    BEGIN
                        IF @strIncludeUnavailableTests = 'YES'
                            BEGIN

there are 3 values of strBasedOn, 2 forstrBuildOrVersion, either ucode empty or notand obviouslyYES, NO for the last on. For each one of these possible combinations there is a 17 line block of code where about 1 thing is changed. The total length of the section which is basically an enormous set of if-else ladders is slightly over 700 lines.

keelerm84

He has code like that in there as well. There is one portion where he's building the SQL query based off the values of four different controls. Two controls have three options, one has two, and one has five.

Instead of building the SQL the smart way, he re-writes the entire query for every possible option. That equates to 1,115 lines of hideousness.

snoofle

@keelerm84 said:

...hideousness.

That hideousness is what keeps you and I employed!

jreasons68

@keelerm84 said:

strSQL = "SELECT * FROM Table WHERE column <> 'Not3'"

So, did he ever figure out that he was getting every row?

keelerm84

Unfortunately, the code behaves exactly as he intended it. It returns every row that isn't equal to the string "3". Yeah, that's right -- he stores numbers as text.

DOA

@snoofle said:

@keelerm84 said:

...hideousness.
That hideousness is what keeps you and I employed!

... and bitter

DaveK1

@dtech said:

Well, I think its brillant. I mean, this way you only have to recreate the table with a different column ordering if you want your app to do something else!

Plus, if the value of 3 ever changes, they're ready!

Wolftaur

@DaveK said:

@dtech said:
Well, I think its brillant. I mean, this way you only have to recreate the table with a different column ordering if you want your app to do something else!
Plus, if the value of 3 ever changes, they're ready!

But then they'd have to change Not3 to Not2 or Not4 or Not42 or Not0xff. They should have thought ahead and named it NotTheNumber!

fyjham

@Wolftaur said:

@DaveK said:
@dtech said:
Well, I think its brillant. I mean, this way you only have to recreate the table with a different column ordering if you want your app to do something else!
Plus, if the value of 3 ever changes, they're ready!

But then they'd have to change Not3 to Not2 or Not4 or Not42 or Not0xff. They should have thought ahead and named it NotTheNumber!

Now why do that when you can just set not3 = 4? Bonus points if the declaration is hidden in an obscure reference file far from the actual use.

In a way it actually makes more sense than not3 = 3 when you think about it seeing as 4 is actually not 3...

Wolftaur

@fyjham said:

@Wolftaur said:
@DaveK said:
@dtech said:
Well, I think its brillant. I mean, this way you only have to recreate the table with a different column ordering if you want your app to do something else!
Plus, if the value of 3 ever changes, they're ready!

But then they'd have to change Not3 to Not2 or Not4 or Not42 or Not0xff. They should have thought ahead and named it NotTheNumber!

Now why do that when you can just set not3 = 4? Bonus points if the declaration is hidden in an obscure reference file far from the actual use.
In a way it actually makes more sense than not3 = 3 when you think about it seeing as 4 is actually not 3...

Variables should be named after how they're used, even when the use is completely idiotic. Sure, you can make Not3 contain 4, and then technically the variable name is sensible in a vacuum, except now, use of Not3 may result in getting a bunch of 3s. The code would be three times as bad if it was changed to 4 but the variable name wasn't changed. Of course, this is the sort of thing for which I often use, say. A hash if I'm working in Perl or Ruby. $status{past_due}, for example, makes more sense than $not17, in all universes.

fyjham

@Wolftaur said:

Variables should be named after how they're used, even when the use is completely idiotic. Sure, you can make Not3 contain 4, and then technically the variable name is sensible in a vacuum, except now, use of Not3 may result in getting a bunch of 3s. The code would be three times as bad if it was changed to 4 but the variable name wasn't changed. Of course, this is the sort of thing for which I often use, say. A hash if I'm working in Perl or Ruby. $status{past_due}, for example, makes more sense than $not17, in all universes.

I see the need for sarcasm tags in HTML is rising by the day... I thought the final comment about putting it as far away from the code where it's used as possible would be a giveaway :P

My point was "not3" containing 3 just makes no sense because the part that makes it not 3 is the fact in the SQL you add a <> rather than the variable, so "<> not3" to me implies "anything except not 3", so after you cross out the double-negative it should return everything with 3 (Though how you'd put such a thing in a variable is beyond me, but directly off what the name implies). I guess the main point I'm making is yes, variable names should make sense in a vacuum (Well, we can keep the context of the function, and hopefully the comment on the declaration, but when I see a variable name I don't want to have to hunt down every time it gets used to know what it's for).

PS: I'd argue that setting $status{past_due} = 17 is misleading unless you have 17 states of being due, as 17 does not really represent whether the status is past due but instead presumably when it becomes past due, so I'd probably more recommend recording a due_date or having a constant somewhere for the due_period. That said, unlike the original code I wouldn't break down and cry reading that, just flinch slightly :P

Wolftaur

@fyjham said:

I thought the final comment about putting it as far away from the code where it's used as possible would be a giveaway :P

I was tired!

@fyjham said:

PS: I'd argue that setting $status{past_due} = 17 is misleading unless you have 17 states of being due, as 17 does not really represent whether the status is past due but instead presumably when it becomes past due, so I'd probably more recommend recording a due_date or having a constant somewhere for the due_period. That said, unlike the original code I wouldn't break down and cry reading that, just flinch slightly :P

That example was out of something I inherited. Apparently booleans are bad and integers are very expensive to businesses, so there was just one "status" column that represented a lot of data. That was the example that came to mind though because it meant a lot of little snips of selecting WHERE status=7 or WHERE status <> 12, etc. But it went something like this:

New lead, not contacted.
Contacted once.
Contacted more than once.
We were asked never to contact them again.
They have requested information.
They are a client but we have not yet cashed their setup fee check.
They are a client and paid setup fee via credit card.
Their phone number changed and we need to snail-mail them to ask for a new one.

You see how that ended up at "17" rather quickly. It actually went up to 28... Actually, that in and of itself was a pretty big WTF.

fyjham

@Wolftaur said:

That example was out of something I inherited. Apparently booleans are bad and integers are very expensive to businesses, so there was just one "status" column that represented a lot of data. That was the example that came to mind though because it meant a lot of little snips of selecting WHERE status=7 or WHERE status <> 12, etc. But it went something like this:
New lead, not contacted.

Contacted once.

Contacted more than once.

We were asked never to contact them again.

They have requested information.

They are a client but we have not yet cashed their setup fee check.

They are a client and paid setup fee via credit card.

Their phone number changed and we need to snail-mail them to ask for a new one.

You see how that ended up at "17" rather quickly. It actually went up to 28... Actually, that in and of itself was a pretty big WTF.

Heh, yeah, I don't mind status enum's (Use them plenty myself - though rarely over 4 values) but that's just crazy. I mean, there's like 2-3 potential columns to space out the data in there to make it more usable and that's only with the first 8... When I saw 17 I just instinctively thought "that can't be an emum"...

I did have a client who had a 2000+ list of status's we could report to their exising with. Our side had 12 status's, and they actually ended up adding 12 new status's cause they weren't sure what'd happen if we reused the old ones (Which sorta explains how they got to 2000) =P

Wolftaur

@fyjham said:

Heh, yeah, I don't mind status enum's (Use them plenty myself - though rarely over 4 values) but that's just crazy. I mean, there's like 2-3 potential columns to space out the data in there to make it more usable and that's only with the first 8... When I saw 17 I just instinctively thought "that can't be an emum"...
I did have a client who had a 2000+ list of status's we could report to their exising with. Our side had 12 status's, and they actually ended up adding 12 new status's cause they weren't sure what'd happen if we reused the old ones (Which sorta explains how they got to 2000) =P

I personally tend to really, really hate the use of numbers for statuses that way in most cases. It ranks up there with composite primary keys as habits that need to be beaten out of common use. I like to try and use multiple booleans where that makes sense, but if there are too many senseless or potentially troublesome combinations and I really need to use just one field, I use a string... With good SQL databases this can even be fine index-wise, especially if your DB supports predicate indexes. And it's pretty language neutral, and a bit saner for incoming developers or contractors than a big list of "What the numbers mean." And if you ever have to change the scheme it's usually less error-prone, especially when said change is being done at 4AM and you're chugging Frappuccinos 3 at a time and your boss is pacing a hole in the carpet outside your office...

I once told another programmer that, though, and he took it completely the wrong way. He ends up printing out a big ASCII chart and geometrically arranging statuses on the chart such that he can categorize them by such things as whether it's in the range A-Z or a-z, or by whether the upper nybble or lower nybble is a certain range, or whether the least significant bit is set... Was worse than the 18-element enumeration it replaced. But he proudly shows me and says, "It's strings now!"

fyjham

@Wolftaur said:

I personally tend to really, really hate the use of numbers for statuses that way in most cases. It ranks up there with composite primary keys as habits that need to be beaten out of common use. I like to try and use multiple booleans where that makes sense, but if there are too many senseless or potentially troublesome combinations and I really need to use just one field, I use a string...

Composite keys... I've actually had debates on that one at work as recently as last week (I really don't like them either).

Booleans, agreed, but there's not always a boolean fact that makes sense to cover all status's. If the situation is solvable with just a handful of booleans there's no reason not to...

String's for status's though... whenever I've seen this done it's been done badly. Slightly different case, misspelt entries, same thing worded another way... with enough constraints I guess it would work. Personally I normally use a status ID, but I always make sure to add another database table which contains a list of ID's and their names and foreign key it across, so any developer who goes "What the hell are these numbers" can go "select id, description from status... oh so that's what they are!". I guess with the right constraints in place your approach is essentially the same thing denormalized.

Wolftaur

@fyjham said:

Composite keys... I've actually had debates on that one at work as recently as last week (I really don't like them either).
Booleans, agreed, but there's not always a boolean fact that makes sense to cover all status's. If the situation is solvable with just a handful of booleans there's no reason not to...
String's for status's though... whenever I've seen this done it's been done badly. Slightly different case, misspelt entries, same thing worded another way... with enough constraints I guess it would work. Personally I normally use a status ID, but I always make sure to add another database table which contains a list of ID's and their names and foreign key it across, so any developer who goes "What the hell are these numbers" can go "select id, description from status... oh so that's what they are!". I guess with the right constraints in place your approach is essentially the same thing denormalized.

The tactic I always go with for strings is that I don't type them 'directly' into code, but I'll make references to them. In Ruby or Perl, I use a hash, for example. In many other languages I'll actually create excessively simple functions that just return a static string. If I typo any the compiler/interpreter will tend to notice for me when I use that approach. If using a good SQL database I'll also add constraints to catch any big "uhoh" moments.

Composite keys... I just look at the huge amount of work for both foreign references into such a table AND the maintaining of the indexes, especially if you're talking about some of the CPK stuff I've seen: nine long strings, for example... and I'm hard pressed to consider it acceptable. And granted, unique enforcement will still be needed if those nine long strings are actually needed to be unique, but selecting via a numeric ID is always going to be a lot less work than selecting by six columns...

fyjham

@Wolftaur said:

The tactic I always go with for strings is that I don't type them 'directly' into code, but I'll make references to them. In Ruby or Perl, I use a hash, for example. In many other languages I'll actually create excessively simple functions that just return a static string. If I typo any the compiler/interpreter will tend to notice for me when I use that approach. If using a good SQL database I'll also add constraints to catch any big "uhoh" moments.

Mmm, that works fine until someone else writes a module in another language to write to your database and writes "active" instead of "Active". Imo data integrity should always be enforced by the database as much as possible, because as the guy designing the database if someone else spews random crap all over it guess who gets to clean it up :P

@Wolftaur said:

Composite keys... I just look at the huge amount of work for both foreign references into such a table AND the maintaining of the indexes, especially if you're talking about some of the CPK stuff I've seen: nine long strings, for example... and I'm hard pressed to consider it acceptable. And granted, unique enforcement will still be needed if those nine long strings are actually needed to be unique, but selecting via a numeric ID is always going to be a lot less work than selecting by six columns...

Oh yeah, nobody at my work is that bad, it was more the whole mapping table whether the unique key should be across the two foreign key'd tables or on a seperate index. I always go for giving it it's own index because if you ever extend the functionality it's a royal pain to join on multiple keys constantly.

Wolftaur

@fyjham said:

@Wolftaur said:
The tactic I always go with for strings is that I don't type them 'directly' into code, but I'll make references to them. In Ruby or Perl, I use a hash, for example. In many other languages I'll actually create excessively simple functions that just return a static string. If I typo any the compiler/interpreter will tend to notice for me when I use that approach. If using a good SQL database I'll also add constraints to catch any big "uhoh" moments.
Mmm, that works fine until someone else writes a module in another language to write to your database and writes "active" instead of "Active". Imo data integrity should always be enforced by the database as much as possible, because as the guy designing the database if someone else spews random crap all over it guess who gets to clean it up :P

Hence the comment about using constraints if I'm using a good SQL database. :)