Major change in zero time



  • New job, new team, new project, lunch with new coworkers and folks from the partner team. Discussion topic: the partner team needs to figure out how to implement major changes (new columns, splitting existing columns across multiple columns, etc.) to an existing database table with *lots* of rows.

    Issues:

    - too much data to slurp into RAM at once
    - not enough time in the change window to bulk export to a file, stream-edit, and bulk import
    - can't use a stored proc because of conflicting internal policies about auditability of data transformations (a plethora of wtf'ery in and of itself)

    What to do?

    Proposed solution:

    Since they don't have time to export it, and can't crunch it in place (the rule is that transformations must involve: extract row, transform row, log old row, log new row, replace old with new, and you are explicitly precluded from doing it within a stored proc as that's considered doing it "under the table"), they will:

    - update the table to add the new columns, plus a temporary column, dataHasBeenTransformed, initialized to 'N'

    - change the code so that it checks the "dataHasBeenTransformed" flag: if "Y", run the new logic; if "N", read the old format, convert, log, write the row back to the db with the flag set to "Y", then run the new logic

    - they can tell management that the transformation will take less than one minute (alter table to add new columns)

    - eventually, once all the rows have been rewritten, the special logic and the flag column can be removed (a rough sketch of the per-row check follows below)
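
    In rough Python, the per-row check they're describing looks something like this. The table, column, and file names are all mine, and the real audit step would go through whatever channel compliance has blessed, not a flat file:

        # Lazy "migrate on read": check the flag, transform once, log old and new,
        # write back with the flag set. sqlite3 is used only to make this runnable;
        # the table "widgets" and its columns are hypothetical.
        import json
        import sqlite3

        def load_widget(conn: sqlite3.Connection, widget_id: int) -> dict:
            conn.row_factory = sqlite3.Row
            row = dict(conn.execute(
                "SELECT * FROM widgets WHERE id = ?", (widget_id,)).fetchone())

            if row["dataHasBeenTransformed"] == "Y":
                return row                                  # already converted; just use the new logic

            old_row = dict(row)                             # keep a copy for the audit log

            # transform row: e.g. split the old full_name column into two new columns
            first, _, last = row["full_name"].partition(" ")
            row["first_name"], row["last_name"] = first, last
            row["dataHasBeenTransformed"] = "Y"

            # log old row, log new row (the mandated steps), then replace old with new
            with open("transform_audit.log", "a") as audit:
                audit.write(json.dumps({"old": old_row, "new": row}) + "\n")
            conn.execute(
                "UPDATE widgets SET first_name = ?, last_name = ?, "
                "dataHasBeenTransformed = 'Y' WHERE id = ?",
                (first, last, widget_id))
            conn.commit()
            return row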

    Mind you, these programmers know the right way to do it, but the bureaucracy is precluding all avenues of common sense.

    Even when programmers know how and want to do it right, management forces wtf'ery upon us. *sighs*

     



  • @snoofle said:

    - too much data to slurp into RAM at once
    - not enough time in the change window to bulk export to a file, stream-edit, and bulk import
    - can't use a stored proc because of conflicting internal policies about auditability of data transformations (a plethora of wtf'ery in and of itself)

     

    Agreed on that third point. I had a client that went through SOX recently, and even they weren't anywhere near that ridiculous. In particular, why are they claiming that a stored proc is any less auditable than a straight query or series thereof (hello, what do you think stored procs contain)?

     



  • I'm not sure whether to admire the ingenuity, or vomit.



  • @emurphy said:

    .... In particular, why are they claiming that a stored proc is any less auditable than a straight query or series thereof (hello, what do you think stored procs contain)?

    Personally, I think transaction-log == audit-log (they're kept for one year, as opposed to text log files that are tossed monthly), but I'm not in charge of compliance. I'm too new here to understand that sort of thinking.


  • Considered Harmful

    It's a slippery slope. They may ask you to make more and more of these changes, in which case you would need a dataVersion column instead of a dataHasBeenTransformed column. Then if you hit a row with version 2, and the current version is 5, you can run the transformations 2->3, 3->4, 4->5 behind the scenes.
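
    Roughly like this; the migration functions and version numbers are invented, it's just the shape of the thing:

        # Chained on-the-fly upgrade keyed off a dataVersion column. Each step only
        # knows how to take a row from version n to n+1; the loop walks it forward.
        CURRENT_VERSION = 5

        def _v2_to_v3(row):
            row["name"] = row.pop("fullname", "")       # e.g. a column rename
            return row

        def _v3_to_v4(row):
            row["active"] = (row.get("status") == "A")  # e.g. recode a status flag
            return row

        def _v4_to_v5(row):
            row.setdefault("country", "US")             # e.g. backfill a new column
            return row

        MIGRATIONS = {2: _v2_to_v3, 3: _v3_to_v4, 4: _v4_to_v5}

        def upgrade(row: dict) -> dict:
            """Walk a row from whatever dataVersion it has up to CURRENT_VERSION."""
            while row["dataVersion"] < CURRENT_VERSION:
                row = MIGRATIONS[row["dataVersion"]](row)
                row["dataVersion"] += 1
            return row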

    Shooting yourself in the head is also an option.
     



  • /me runs to hug and kiss management, who listens to us as the experts.



    I like the solution; it's great when you don't have the time, or even the information, to batch-convert everything up front.

    Like, for example, switching password-hashing algorithms: you can't batch-convert those at all, because you only get the plaintext back when the user logs in.
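
    A bare sketch of the rehash-on-login check, using hashlib; the prefix format and iteration count are just illustrative:

        # On a successful login, verify against whichever scheme the stored hash uses;
        # if it's the legacy one, return an upgraded hash for the caller to persist.
        import binascii, hashlib, hmac, os

        def _pbkdf2(password: str, salt: bytes) -> str:
            dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
            return "pbkdf2$%s$%s" % (binascii.hexlify(salt).decode(),
                                     binascii.hexlify(dk).decode())

        def check_and_upgrade(password: str, stored: str):
            """Return (ok, new_stored_or_None); the caller writes new_stored back if set."""
            if stored.startswith("pbkdf2$"):
                _, salt_hex, dk_hex = stored.split("$")
                dk = hashlib.pbkdf2_hmac("sha256", password.encode(),
                                         binascii.unhexlify(salt_hex), 100_000)
                return hmac.compare_digest(dk, binascii.unhexlify(dk_hex)), None
            # legacy unsalted md5 row: verify the old way, then upgrade in place
            ok = hmac.compare_digest(hashlib.md5(password.encode()).hexdigest(), stored)
            return ok, (_pbkdf2(password, os.urandom(16)) if ok else None)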



  • @joe.edwards@imaginuity.com said:

    It's a slippery slope. They may ask you to make more and more of these changes, in which case you would need a dataVersion column instead of a dataHasBeenTransformed column. Then if you hit a row with version 2, and the current version is 5, you can run the transformations 2->3, 3->4, 4->5 behind the scenes.

    Shooting yourself in the head is also an option.
     

    Funnily enough, one of the tables in the LiveJournal database has a dversion column which does more or less what you'd expect. (Of course, it doesn't upgrade the data on the fly; instead, it keeps the code to handle all the past versions and has a separate program to do the transformations when the admin gets round to it.)



  • @snoofle said:

    ... you are explicitly precluded from doing it within a stored proc as that's considered doing it "under the table" ...

    Why don't you write a stored procedure to print the necessary queries out to a text file... circumvent their attempt to keep the task from being automated.
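
    Something in that spirit, even without the stored proc, could just be a throwaway script that reads the old rows and spits the UPDATEs (plus the old/new audit copies) into text files for someone to review and run. The names below are invented:

        # Generate the migration statements instead of executing them. sqlite3 is only
        # here to make the sketch runnable; "widgets" and its columns are hypothetical.
        import json
        import sqlite3

        def emit_updates(conn: sqlite3.Connection) -> None:
            conn.row_factory = sqlite3.Row
            with open("apply_changes.sql", "w") as sql_out, open("audit.jsonl", "w") as audit_out:
                for r in conn.execute("SELECT * FROM widgets WHERE dataHasBeenTransformed = 'N'"):
                    old = dict(r)
                    first, _, last = old["full_name"].partition(" ")
                    sql_out.write(
                        "UPDATE widgets SET first_name = '%s', last_name = '%s', "
                        "dataHasBeenTransformed = 'Y' WHERE id = %d;\n"
                        % (first.replace("'", "''"), last.replace("'", "''"), old["id"]))
                    audit_out.write(json.dumps({"old": old,
                                                "new": {**old, "first_name": first,
                                                        "last_name": last,
                                                        "dataHasBeenTransformed": "Y"}}) + "\n")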

