Pretty Aggressive Code



  • So my first submission is pretty meta, but it just has to be asked...

    Have you ever come across some pretty aggressive code? Not code that is destructive, or buggy, or even poor performing, but rather where there is no explanation other than the developer had to specifically code a certain behaviour into the application? I'll share my latest encounter of this...

    A colleague of mine had to create a SQL view of client data for another set of developers, who had prescribed the columns they wanted present in the view. Standard stuff. He wrote the view, and sent it to the other devs. During testing, it quickly came about that they needed to see what type of client each record represents. No big deal, there's a ClientType field, and my colleague offered to add it to the view.

    Their response: "No. Don't add another column to the view. It will break our process."
    "Process" being the hereto unknown jargon for "routine that consumes the view".

    Really? Break the process? I can't help but think that this "process" includes code which:

    • Queries the view (potentially a Select * From ClientView)
    • Count the number of columns returned (Bonus points if it compares the names of the columns to what it expects)
    • If the above yields a number different to the actual amount of columns it wants, then throw an exception.

    The logic is sound for when too few columns are returned, but too many?!

    So the workaround to this (based on the other dev's suggestion): Prepend the ClientId column's value with the Client Type. But not the integer value stored in the field. Nope, join to the lookup table and get the first character of the display value. I haven't even mentioned the fact that the ClientId is a Guid.

    So in short, because they coded an aggressive routine in their application, we can't add an additional column, but we also now have to butcher a Guid to look something like P-29ADE5D6-3EEB-43AC-8D5A-0F0F3EFDCB7D

    At this point, I am without words...



  • Why would this be called an "aggressive code"? Never heard of that phrase before.

    Also, why the hell couldn't they just change the number of expected columns in the result set? The only thing I can think of is, they have something like TTwentyColumnResult deep in the bowels of some precompiled dll, but not TTwentyOneColumnResult.



  • I consider it aggressive because one actually has to write this kind of code. It doesn't just happen by accident :smile:



  • I wouldn't call it aggressive.

    Aggressive is when you check for a given condition (for example, that required parameters are present in a given dictionary) using all possible ways the developer could think up:

    1. Check for every key in the dictionary
    2. Concatenate all keys, sorted, and check against the hardcoded string that represents the same
    3. Check if the number of keys is not less than the required one
    4. Given the string in (2), run it against every known message digest algorigthm and see if it matches
    5. Reverse the string in (2) and check against another predefined constant
    6. In case of any mismatch, explode the facility, expose BDSM photos of CEO, launch a nuclear missile

    This is aggressive as I see it.


  • Notification Spam Recipient

    [code]aggressive[/code]


  • BINNED

    Not pretty enough. Needs more cornify


  • Notification Spam Recipient


    [code]aggressive[/code]

    FTFY



  • That would be aggressive against the users, but I feel my code is aggressive against other developers:

    /**
     * Authenticates a user.
     * Sam, if you change username for email again I will hunt your family 
     * and eat their skin after its dry like a dorito.
     */
    function login(username, password) {
        ...
    }
    

  • Fake News

    That's not agressive code, it's an agressive comment.


    Filed under: Also, now I'm hungry for dorito's



  • @wft said:

    I wouldn't call it aggressive.

    Maybe if it called the user a BITCH or tried to SPANK you when you added the column to the view?


  • Winner of the 2016 Presidential Election

    [code]
    function login(username, password){
    var Sam;
    // Lets hope this never has to be called!
    if(Sam == changeCodeAgain()){
    function(){ var I_Will_Eat_Your_Family = "After they are dry, also like Doritos"; }
    }
    validate_User(username, password);
    }
    [/code]

    Filed Under: FTFY?



  • Correct me if I'm wrong, but you're saying there's a view, and their process should be outputting the view in some simple format (tabular, one assumes) by just looping rows and columns.

    You mentioned they said the process would be broken with an extra field, indicating that the process expects a static list of fields.

    If I understood correctly, you're asking whether this WTF categorises as 'Aggressive'.

    @cartman82 said:

    Never heard of that phrase before.

    Neither have I, but if we're going to categorise WTFs (which might well be a constructive exercise for preventing them), let's consider the following:

    • The process sounds generally not fit for purpose, having apparently relied on data it should just be able to get from the view.
    • Maintainability is obviously lower than optimum, which automatically makes it bad.
    • One infers the process has been developed without much thought, but no time constraints have been indicated.

    Could it be that no one thing being wrong with it (by my count, everything is wrong with it, assuming I've understood correctly) makes it 'aggressive'?

    I invite your thoughts.



  • @Shoreline said:

    ... their process should be outputting the view in some simple format (tabular, one assumes) by just looping rows and columns.

    I'm still trying to find out exactly what it is that the process does with the data. The amount of pushback given to the extra column seems to indicate that it is more than just a simple case of presenting the data...

    @Shoreline said:

    One infers the process has been developed without much thought, but no time constraints have been indicated.

    Time constraints around here are often just a load of WTF... Business usually spends months devising how they want to redesign their operational ways, but give us only a fraction of the time to implement. But I'm too jaded already to be bothered by that.

    @Shoreline said:

    Could it be that no one thing being wrong with it (by my count, everything is wrong with it, assuming I've understood correctly) makes it 'aggressive'?

    The aggressive portion is that this doesn't happen by accident. If you only want columns x, y, and z, then your query should be Select x, y, z From ClientView. What I've inferred is that they are doing a Select * From ClientView and interrogating the results.

    In hindsight, perhaps this isn't so much "aggressive" as it is overly defensive. Although this does remind me of that quote "The best defense is often a good offense"...

    Also, and this is the part that hurts me professionally, the way that Guid columns have to be cast to string, to accommodate the extra column's info. So not only do we lose type safety, there is an additional burden on the process to break it up into 2 different pieces of data.

    It baffles me how that is preferable to adding support for an extra column in the process. If I do find out what exactly is the root cause of this WTF, I will be sure to share it... :smile:


  • BINNED

    @AgentDenton said:

    overly defensive. Although this does remind me of that quote "The best defense is often a good offense"

    And this is certainly offensive.



  • @AgentDenton said:

    The aggressive portion is that this doesn't happen by accident. If you only want columns x, y, and z, then your query should be Select x, y, z From ClientView. What I've inferred is that they are doing a Select * From ClientView and interrogating the results.

    I don't see what the big deal is. At worst, stick the column at the end so they only have to read a new value, assuming they're checking by column position.

    @AgentDenton said:

    It baffles me how that is preferable to adding support for an extra column in the process. If I do find out what exactly is the root cause of this WTF, I will be sure to share it...

    It seems like they had to write code or something to do this extra work, so why not the little bit of extra work (presumably a lot less) by having someone else update the view.



  • I don't get the term "aggressive" here. Code doesn't have motives.

    It seems like you're just asking about dumb workarounds you've had to do to make morons with shitty products happy?



  • This is another good example of why the senior engineer at one of my old jobs didn't allow SELECT * in our code.



  • @AgentDenton said:

    Also, and this is the part that hurts me professionally, the way that Guid columns have to be cast to string, to accommodate the extra column's info. So not only do we lose type safety, there is an additional burden on the process to break it up into 2 different pieces of data.

    I think that you've made a wrong assumption that they even care about the contents of the ClientId column at all. Since they are too lazy/incompetent/whatever to change their code for the new column, it is very likely that they simply take the string contents of ClientId and use that as-is to store into their local database. As far as they are concerned, they need no changes because the data will just go into a text field and use string compare in their code (probably with case sensitivity too!).

    In other words, if the client type changes in that compound string, I'm willing to bet that the client will be treated as a brand new client on their end.

    I have seen something similar happen before, and now when I see an idea like the one proposed here the alarm bells go off. That, after having had to clean up the mess when another group didn't understand that an internal user ID needs to be unique and immutable for as long as the user exists (at least in our system).



  • The default behavior of SSIS is to validate that the schema it is accessing hasn't changed in any way. If you change nullability or field lengths, it completely stops transferring data with a validation error.

    So, this "aggressive validation" you describe is standard practice for at least one major commercial ETL tool. Hardly strange enough to leave someone speechless. However, the fact that you are going forward with a workaround instead of modifying the process is a WTF



  • @Shoreline said:

    Correct me if I'm wrong

    You don't need to ask for that around here :) We'll correct you even if you're right.

    @AgentDenton said:

    A colleague of mine had to create a SQL view of client data for another set of developers, who had prescribed the columns they wanted present in the view. Standard stuff. He wrote the view, and sent it to the other devs. During testing, it quickly came about that they needed to see what type of client each record represents. No big deal, there's a ClientType field, and my colleague offered to add it to the view.

    Their response: "No. Don't add another column to the view. It will break our process.""Process" being the hereto unknown jargon for "routine that consumes the view".

    Here's what confuses me. How could they have a process that requires a set of parameters (columns), that doesn't provide them sufficient information (ClientType)? That process must be new, because it couldn't possibly have worked before if it wasn't getting the ClientType somehow. So have they always done this mess with the ClientType, or are they just reluctant to change their new interface for some obscure reason?



  • @Jaime said:

    The default behavior of SSIS is to.....

    Bingo! [and a WTF to anyone who regularly works with SQL databases and did not think of this]


  • Discourse touched me in a no-no place

    @cartman82 said:

    Also, why the hell couldn't they just change the number of expected columns in the result set?

    Maybe they export it as CSV and then import that back into some spreadsheet that does calculations with the results, with that being driven by cells in the next column of the spreadsheet. All on a wooden table, natch.



  • @TheCPUWizard said:

    Bingo! [and a WTF to anyone who regularly works with SQL databases and did not think of this]

    There are more SQL databases than MS-SQL, you know. I didn't even know what SSIS was until I googled it, but I work with Oracle databases so that's not too surprising.



  • @Scarlet_Manuka said:

    SQL databases than MS-SQL

    I did mean to say SqlServer..... can I blame it on Discourse?? :smiley:



  • @Scarlet_Manuka said:

    I didn't even know what SSIS was until I googled it

    For the first few years of being a TDWTF member, I just assumed it was a misspelling of SSDS.



  • @AgentDenton said:

    Their response: "No. Don't add another column to the view. It will break our process.""Process" being the hereto unknown jargon for "routine that consumes the view".

    Really? Break the process? I can't help but think that this "process" includes code which:

    Queries the view (potentially a Select * From ClientView)
    Count the number of columns returned (Bonus points if it compares the names of the columns to what it expects)
    If the above yields a number different to the actual amount of columns it wants, then throw an exception.

    I've seen third party providers do that BS: We had a badge printing application that required a simple one-row-per-badge table that only permitted views with pre-designated columns. They didn't care about the names though, only position/content. My guess was it was coded to use SELECT * FROM view, but the fetch was apparently hard-coded: we couldn't provide a column list, we could only name the view and database connection to be used as a source. Since it was third party, we couldn't make it better.



  • Not allowing SELECT * in any circumstances is as bad as allowing it freely to my eyes.

    I mean, the idea is good and I'm considering doing a few ctrl+f's in my codebase to have a talk with the junior developers, but pretty much every project that doesn't has a big focus on performance might have a pair of SELECT * to simplify code.

    Or I might be completely wrong and people will derail this topic because of me.


  • Discourse touched me in a no-no place

    @Husky said:

    Not allowing SELECT * in any circumstances is as bad as allowing it freely to my eyes.

    Well, it's a pretty good flag for bad practice, especially when not coupled to a good WHERE clause. A sign that someone isn't understanding what a database is or why they should use one.



  • I think the primary reason that he didn't like it is that only looking at the query doesn't tell you what fields the query returns. Obviously that can be solved with an IDE that supports reading the schema from a database, but this was almost 10 years ago, and I don't know if any IDEs at the time could do that. Maybe Visual Studio could, but we weren't a Microsoft shop.



  • @Husky said:

    Not allowing SELECT * in any circumstances is as bad as allowing it freely to my eyes.

    Aside from quick and dirty ad hoc stuff, the only time I use it is when I have a series of CTEs and I'm filtering those for various purposes, and so I've defined the column list in one place already, so I don't need to keep multiple lists in sync.



  • @TheCPUWizard said:

    Bingo! [and a WTF to anyone who regularly works with MSSQL databases and did not think of this]

    Some of us use other dialects of SQL which SSIS does not grok...



  • @Dragnslcr said:

    I think the primary reason that he didn't like it is that only looking at the query doesn't tell you what fields the query returns.

    This and the code smell are the main problems I see.

    How frequently do you need all columns in the table, though? In all my projects, that ratio was around 20% of all queries, up to nearly 90% in reports. We're probably doing something very wrong in the database design according to this topic.

    Boomzilla's method makes a lot of sense though.



  • The main application for SELECT * I see is applications that are introspecting table structure -- this can be used to good effect to allow your application to adapt dynamically to schema changes.


  • Discourse touched me in a no-no place

    @tarunik said:

    The main application for SELECT * I see is applications that are introspecting table structure -- this can be used to good effect to allow your application to adapt dynamically to schema changes.

    Fair enough, but you'd definitely want to have a suitable WHERE or limit clause there (limit clauses vary in syntax between SQL dialects, alas) because you really don't want to haul every row of a multi-million-row table back into your application by accident. Not that that doesn't stop some idiots from trying (and that was my point…)



  • Yeah:

    1. my app doesn't have multi-million-row tables, thankfully
    2. we have to do some operations row-by-agonizing-row, sadly :/

  • Discourse touched me in a no-no place

    @tarunik said:

    multi-million-row tables

    Better than multi-million column tables! (Some users…)



  • @dkf said:

    @tarunik said:
    multi-million-row tables

    Better than multi-million column tables! (Some users…)

    Ugh, thanks for triggering the flashbacks. 500+ columns in a table because the original developer didn't know about normalization. There was person1_name, person1_phone, person2_name, person2_phone ... person20_name, person20_phone.



  • @Dragnslcr said:

    Ugh, thanks for triggering the flashbacks. 500+ columns in a table

    I'm using one (in Filemaker) that is close to 1000. Calling "SELECT * ..." from some C++ code across ODBC results in an out of memory crash. But the program does what it needs to, so I honestly don't care what the db looks like. (and my c++ code just gets exactly what it needs instead of being lazy)


  • :belt_onion:

    Except they fixed it by prepending shit to a GUID, which makes it a varchar of some sort, which would break the SSIS package as well. So I doubt that is the issue...



  • @darkmatter said:

    Except they fixed it by prepending shit to a GUID, which makes it a varchar of some sort, which would break the SSIS package as well. So I doubt that is the issue...

    And herein lies the bedrock of my WTF post. So even if we can't increase the number of columns, we end up changing the meaning of others, and changing their data type as well.

    Oh, and yes, this is SSIS. Sure, SSIS falls over if you change the schema. This was confirmed by my buddy at the office, since my SSIS experience is limited.

    So instead of only refreshing the schema to use a new column (which they would need to do anyway if you change a UniqueIdentifier to VarChar), they now have to add some work to break the ClientType-ClientGuid hybrid apart. Very unnecessary IMO...

    And this is brand spanking new stuff, so the other WTF is why they could not just change it to accommodate their request to receive more columns...

    :confused:


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.