Once again



  •   // OP: 400+ million rows in the table get returned - across a long distance network
    Map<Long, SomeComplexObject> map = someService.getAllOfTheseComplexObjects(...);
    if (map.size() > 0) {
        long firstId = map.keySet().toArray(new Long[0])[0];
        ...
    }
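
    For reference, a minimal sketch of the saner version, assuming plain JDBC and invented table/column names (the post doesn't show the real service or schema). Note too that if that map is a HashMap, keySet() order is unspecified, so "first" isn't even well-defined:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Hypothetical fix: ask the database for one id instead of the whole table.
    // Table and column names are invented; LIMIT syntax varies by database.
    static long fetchFirstId(Connection conn) throws SQLException {
        String sql = "SELECT id FROM some_complex_table ORDER BY id LIMIT 1";
        try (PreparedStatement ps = conn.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            if (!rs.next()) {
                throw new SQLException("table is empty");
            }
            return rs.getLong("id"); // 8 bytes of payload instead of terabytes
        }
    }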


  • Please tell me that they iterate over the values in the map looking for specific ones and only acting on those entries?

    Those are my favorite WTFs ... "DATABASE! Give me everything! I'll figure out which ones I really need!"



  • @zelmak said:

    Please tell me that they iterate over the values in the map looking for specific ones and only acting on those entries?

    I think the WTF is going to be that they fetched 400+ million rows and then discarded everything except the ID of the first one.

     



  • Geeze, are you sitting next to me or something?  The code base I inherited has this shit sprinkled throughout.

    One such example was the, "iterate through the entire collection populating a *single* object for each row in the data reader, and return only the last instance" pattern. Looked something like this:

    var o = new MyObject(); // First WTF, initialize before checking if anything was returned? Really?
    while (reader.Read()) {
        o = new MyObject(); // Uh, I think you already did this, no?
        o.Property = reader["column"].ToString();
        // Other properties here too
    }
    return o;
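
    If every row really is needed, here's a sketch of the loop done on purpose, translated to Java/JDBC to match the OP's snippet; MyObject, setProperty, and the column name are placeholders carried over from the quote:

    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    // Collect every row instead of overwriting one object N times and
    // keeping only the last. MyObject/setProperty are placeholder names.
    static List<MyObject> readAll(ResultSet reader) throws SQLException {
        List<MyObject> results = new ArrayList<>();
        while (reader.next()) {
            MyObject o = new MyObject();
            o.setProperty(reader.getString("column"));
            // Other properties here too
            results.add(o);
        }
        return results;
    }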



  • @DaveK said:

    @zelmak said:

    Please tell me that they iterate over the values in the map looking for specific ones and only acting on those entries?

    I think the WTF is going to be that they fetched 400+ million rows and then discarded everything except the ID of the first one.

     

    Somewhere in my codebase I replaced two blocks of code: the first block read the database with a "SELECT *" and stored just the ids in an array, and the second block would then iterate over that array and fetch each row back out, only to populate some collection object. Good times had by all!



  • @dohpaz42 said:

    @DaveK said:

    @zelmak said:

    Please tell me that they iterate over the values in the map looking for specific ones and only acting on those entries?

    I think the WTF is going to be that they fetched 400+ million rows and then discarded everything except the ID of the first one.
    Somewhere in my codebase I replaced two blocks of code: the first block read the database with a "SELECT *" and stored just the ids in an array, and the second block would then iterate over that array and fetch each row back out, only to populate some collection object. Good times had by all!
    The original developer was just optimizing it...  It's not like there is any mechanism for indexing IDs for quick and easy retrieval, right?



  • @C-Octothorpe said:

    Geeze, are you sitting next to me or something?  The code base I inherited has this shit sprinkled throughout.

    One such example was the, "iterate through the entire collection populating a *single* object for each row in the data reader, and return only the last instance" pattern. Looked something like this:

    var o = new MyObject(); // First WTF, initialize before checking if anything was returned? Really?
    while (reader.Read()) {
        o = new MyObject(); // Uh, I think you already did this, no?
        o.Property = reader["column"].ToString();
        // Other properties here too
    }
    return o;

    That's all over the code I maintain... but generally only if they already expected the database to return one row.  In that case, it's not so much a WTF, just unclear.


  • All: The table IS indexed, and the id is a guaranteed-unique PK.

    And yes, the guy was just using the id from the first row in the table, so roughly 400+MM rows * 3K each = 1.2+TB serialized, transported over the backbone and countless routers, deserialized, garbage collected, and all but the first 8 bytes never even looked at. Why? Because the routine to load the table was already there, and he didn't want to write a new routine just for this.
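
    (Back of the envelope: 4 x 10^8 rows at 3 x 10^3 bytes per row is about 1.2 x 10^12 bytes, i.e. 1.2 terabytes on the wire, of which 8 bytes were used.)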

    Sigh.


  • ♿ (Parody)

    @snoofle said:

    Why? Because the routine to load the table was already there, and he didn't want to write a new routine just for this.

    OK, so TRWTF is your desire for him to perform premature optimization, eh?



  • @boomzilla said:

    @snoofle said:
    Why? Because the routine to load the table was already there, and he didn't want to write a new routine just for this.
    OK, so TRWTF is your desire for him to perform premature optimization, eh?
    I hear there's a cream for that...


  • Trolleybus Mechanic

    @dohpaz42 said:

    Somewhere in my codebase I replaced two blocks of code: the first block read the database with a "SELECT *" and stored just the ids in an array, and the second block would then iterate over that array and fetch each row back out, only to populate some collection object. Good times had by all!
     

    I can do you one better. Somewhere in the codebase is a select statement that selects all the IDs, then iterates over the collection to find the largest ID, and then does an insert statement with that id + 1.

    If anyone remembers my "Max(id) + 1" wtf from a while back-- that was the improved version.
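
    (For the record, max(id)+1 also races: two concurrent inserts can read the same max and collide. A minimal JDBC sketch of letting the database assign the id instead, with invented table/column names:)

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    // Let the auto-increment column hand out the id, then read it back.
    static long insertThing(Connection conn, String name) throws SQLException {
        String sql = "INSERT INTO things (name) VALUES (?)"; // hypothetical table
        try (PreparedStatement ps =
                 conn.prepareStatement(sql, Statement.RETURN_GENERATED_KEYS)) {
            ps.setString(1, name);
            ps.executeUpdate();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                keys.next();
                return keys.getLong(1); // the id the database actually assigned
            }
        }
    }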


  • Discourse touched me in a no-no place

    @C-Octothorpe said:

    Geeze, are you sitting next to me or something?  The code base I inherited has this shit sprinkled throughout.

    One such example was the, "iterate through the entire collection populating a *single* object for each row in the data reader, and return only the last instance" pattern. Looked something like this:

    var o = new MyObject(); // First WTF, initialize before checking if anything was returned? Really?
    while (reader.Read()) {
        o = new MyObject(); // Uh, I think you already did this, no?
        o.Property = reader["column"].ToString();
        // Other properties here too
    }
    return o;

    This looks like someone ran through FxCop, and got complaints about o not being initialized before use, and just put in a superfluous instantiation to shut the thing up. After all, the gc will clean it up, right?



  • @FrostCat said:

    @C-Octothorpe said:

    Geeze, are you sitting next to me or something?  The code base I inherited has this shit sprinkled throughout.

    One such example was the, "iterate through the entire collection populating a *single* object for each row in the data reader, and return only the last instance" pattern. Looked something like this:

    var o = new MyObject(); // First WTF, initialize before checking if anything was returned? Really?
    while (reader.Read()) {
        o = new MyObject(); // Uh, I think you already did this, no?
        o.Property = reader["column"].ToString();
        // Other properties here too
    }
    return o;

    This looks like someone ran through FxCop, and got complaints about o not being initialized before use, and just put in a superfluous instantiation to shut the thing up. After all, the gc will clean it up, right?
    FxCop? Sadly, no... This is a pattern of bad technique and simple lack of understanding that permeates the (old) codebase.


  • @snoofle said:

    And yes, the guy was just using the id from the first row in the table, so roughly 400+MM rows * 3K each = 1.2+TB serialized, transported over the backbone and countless routers, deserialized, garbage collected, and all but the first 8 bytes never even looked at. Why? Because the routine to load the table was already there, and he didn't want to write a new routine just for this.
    Does this guy still work there? I've got this blunt rusty knife I was going to throw out, but I could always send it to you.


     



  • @Lorne Kates said:

    @dohpaz42 said:

    Somewhere in my codebase I replaced two blocks of code: the first block read the database with a "SELECT *" and stored just the ids in an array, and the second block would then iterate over that array and fetch each row back out, only to populate some collection object. Good times had by all!
     

    I can do you one better. Somewhere in the codebase is a select statement that selects all the IDs, then iterates over the collection to find the largest ID, and then does an insert statement with that id + 1.

    If anyone remembers my "Max(id) + 1" wtf from a while back-- that was the improved version.

    You should have rewritten it to introduce a custom bubble sort algorithm, as well as a reverse ordering once that finishes.

    Our system can do donation processing for user-specific fundraising. The fundraising table has only one unique key - the auto-increment id - and no GIGO checking. The query to find a specific user's fundraising page checks a naturally non-unique value that could legitimately collide between distinct users (or even for the same user), does no limiting of the return values, and assumes that only one row could ever possibly be returned, no matter what.

    The database layer has a function called 'fetchRow' that only expects to get back one single row, and for some reason will throw an exception if a query is passed that returns more than one row. So if somehow more than one record shares the lookup value (which is a concat of a person's first and last name), it just dies a horrible, ugly death.

    It's not like the database could be set to have a unique non-primary key, or the program could actually do some checking for uniqueness, or even that 'fetchRow' could just simply ignore every row except the first. It's somehow an unrecoverable situation that must halt all processing.
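
    (A fetchRow that tolerates duplicates is only a few lines. A sketch, assuming a JDBC-ish layer; MyRow and mapRow are placeholders for whatever the real database layer returns:)

    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Take the first row and merely complain about extras instead of throwing.
    // MyRow and mapRow(...) are hypothetical stand-ins.
    static MyRow fetchRow(PreparedStatement ps) throws SQLException {
        try (ResultSet rs = ps.executeQuery()) {
            if (!rs.next()) {
                return null; // no match at all
            }
            MyRow row = mapRow(rs);
            if (rs.next()) {
                System.err.println("fetchRow: query matched more than one row; using the first");
            }
            return row;
        }
    }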



  • @FrostCat said:

    This looks like someone ran through FxCop, and got complaints about o not being initialized before use, and just put in a superfluous instantiation to shut the thing up. After all, the gc will clean it up, right?
    FxCop? No, it's the compiler complaining that o might not have been initialised.
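
    (Java's definite-assignment rule gives the same complaint if the only assignment is inside the loop; names borrowed from the quoted snippet:)

    MyObject o;                // declared, never definitely assigned
    while (reader.next()) {
        o = new MyObject();    // only assigned if the loop body runs
    }
    return o;                  // javac: "variable o might not have been initialized"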


  • Trolleybus Mechanic

    @dohpaz42 said:

    The fundraising table has only one unique key - the auto-increment id - ... So if somehow more than one record shares the lookup value (which is a concat of a person's first and last name),
     

     

    Easy change. Concat the auto-increment id to the concat of first_name + last_name, and you'll have a guaranteed unique identifier!



  • @DOA said:

    Does this guy still work there? I've got this blunt rusty knife  [ . . . ]

    Quick, put it away, don't let the testosterone-crazed macho gun nuts from the front-page thread catch sight of that!  They're just desperately looking for someone they could blow away and claim it was genuine self-defense rather than the impotent-rage-driven sadistic self-gratification it would appear at first sight to be...




  • @Lorne Kates said:

    @dohpaz42 said:

    The fundraising table has only one unique key - the auto-increment id - ... So if somehow more than one record shares the lookup value (which is a concat of a person's first and last name),
     

     

    Easy change. Concat the auto-increment id to the concat of first_name + last_name, and you'll have a guaranteed unique identifier!

    That would assume that I knew the id of the row in that table in the first place, which I don't. The non-unique "unique" name is supposed to be my identifier. What gets me is that this concatted name is stored in the user table - it's generated when the user is created (and, btw, their personal info is stored in a different table than the main user table). And for some reason when the fundraising table was created, it never occurred to the original developer to use the id from the user table as a foreign key... maybe they were prematurely optimizing by reducing the need for a join? Yes, because their way is SO MUCH better.
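
    (A sketch of the obvious schema fix, with invented names: give fundraising a real, unique foreign key to users, and the lookup can't collide:)

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Assumed DDL, names invented:
    //   ALTER TABLE fundraising
    //     ADD COLUMN user_id BIGINT NOT NULL UNIQUE REFERENCES users(id);
    static long findFundraisingPageId(Connection conn, long userId) throws SQLException {
        String sql = "SELECT id FROM fundraising WHERE user_id = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, userId);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    throw new SQLException("no fundraising page for user " + userId);
                }
                return rs.getLong("id"); // at most one row, guaranteed by the unique key
            }
        }
    }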

