I've inherited a WTFworthy project - help me with keeping two sets in sync



  • Hi all,

    This week I started working on a new project at work. After successfully completing a different project, management decided to give me something that was a lost cause for them.

     
    Basically,  the want me to add editing feature to their customer list... through a web interface (think of it as CRUD). Problem is the data structures that the original dev used.  There are two different database entries (mediumblobs in MySQL). First one contains THE WHOLE list of CUSTOMER_IDs (delineated by CR/LF) and the second one contains the list of customer names, addresses, phone numbers etc. For reasons unknown to me, consultant decided on this crazy structure. Graphically, it looks like this:

    List of Customer IDs (CUST_ID)
    -----------------------------

    0898789
    1287777
    4747777
    7166666
    ...

     

    and the second mediumblob contains a giant list of names, addresses, phone numbers etc:

    Second list - CUST_INFO
    -------------------------------------

    Joe Smith; 789 Avenue; 873-555-5555
    Jane Smith; 87 Elm; 872-555-5555
    ....

     

    Now, to find the customer ID of Jane Smith, you count the line where her name appears in CUST_INFO list (in this case 2nd) and then you find the second entry in CUST_ID list... in this case it's: 1287777

     

    WHY THEY EVEN USE A DATABASE FOR THIS IS BEYOND ME! Obviously, this was implemented by someone who wanted to get it done as quickly as possible so what they did was to take the original list and just dump it as a single entry in the database.

     
    Anyway, my question is following... since I cannot change the schema (it would break too many things), how should I implement editing? They want me to allow someone to  open this list in a page and for example insert a new customer ID into CUST_ID list. To make this work, I would then have to go back to CUST_INFO list and modify it there as well. But how to figure out where they made an edit is beyond me...

     Anyway, if anyone wants to offer some advice, please do. I hope I don't have to quit my job over this nonsense :(




     


     



  • If you cannot change the schema, get booze, a new job, whatever. There is no use in trying to somehow work with this truely WTFy schema.



  • Change the schema, and fuck anyone who gets in your way.  Sure, it might look hard or tedious now; but if you don't change it now you are just going to get fucked even harder down the road.  Or maybe you will get lucky and get hit by a bus, and the next schmuck in line will be "promoted" to this illustrious project.  If people would just do it right the first time we wouldn't have to suffer through this non-sense.  So do yourself and your successors a favor and fix your predecessors' mistakes sooner rather than later.  If the powers that be won't let you fix it, threaten to quit.  If that doesn't work, burn the building down.  I am not joking.



  • @luke727 said:

    Change the schema, and fuck anyone who gets in your way.  Sure, it might look hard or tedious now; but if you don't change it now you are just going to get fucked even harder down the road.  Or maybe you will get lucky and get hit by a bus, and the next schmuck in line will be "promoted" to this illustrious project.  If people would just do it right the first time we wouldn't have to suffer through this non-sense.  So do yourself and your successors a favor and fix your predecessors' mistakes sooner rather than later.  If the powers that be won't let you fix it, threaten to quit.

    I would see why this is a lost cause...and I tend to agree with the statement here. If you don't change it now you are almost certainly apart of the problem. Get yourself a test db and test app, change what you ought to, fix what breaks, repeat. You will spend less time doing that than trying to get a proper insert/update/delete to operate properly on that dataset. Especially when you think of a NL getting into that dataset as either appended to a name, or prepended some how...or worse, something after that NL (I know, proper data cleansing and checking). But if you don't ditch this now, you'll be seen as supporting it, forever!

    @luke727 said:

    If that doesn't work, burn the building down.  I am not joking.

    BTW...Don't forget your Swingline

     

    Final thoughts:

    If they really won't let you fix it, write a procedure that will rebuild that original (crap) dataset upon proper insert/update/delete of a real dataset...

    i.e.

    Original App -> Original Data

    Insert/Update/Delete customer enhancements -> New (proper) Data -> Trigger (either actual DB depending on version, or in code is fine) Update old crap dataset

    Then, rewrite each portion to use the proper dataset until the old is not used any longer.



  • @rewind said:

    Final thoughts:

    If they really won't let you fix it, write a procedure that will rebuild that original (crap) dataset upon proper insert/update/delete of a real dataset...

    i.e.

    Original App -> Original Data

    Insert/Update/Delete customer enhancements -> New (proper) Data -> Trigger (either actual DB depending on version, or in code is fine) Update old crap dataset

    Then, rewrite each portion to use the proper dataset until the old is not used any longer.

    Yes, that seems like a sensible solution. 

    Create new table(s) to replace the two blobs.
    Write a simple application to update, insert or delete  customers into the new table(s).
    After each change to the new table(s), perform two batch "insert from select" of all records and rebuild the two blobs completely.

    If there are a large number of customers and it takes too long to do this after each operation, do the synchronization offline when no one is updating the database.
     



  • umm report back tht after your initial analysis it is not possible to create the added features they requested and that if those features need to be implemented a partial rewrite of the system will be needed.

    Then let them decide if it''s worth it. 



  • @stratos said:

    umm report back tht after your initial analysis it is not possible to create the added features they requested and that if those features need to be implemented a partial rewrite of the system will be needed.

    Then let them decide if it''s worth it. 

    No, tell a whole rewrite of the system is needed, which will increase the chance of you been taken off such project, and if they do keep you that project, they should allot more time for you and give you more credit for getting it done sooner than expected ;)



  • Can't you build your own schema and do some CSV'ing or something?

    Keeping this madness is crazy ofcourse...



  • This is probably more of a communication issue than an engineering issue(despite how awful the latter is).  You should probably feel out management's perception of what's in this project.  For example, they may be fully aware that the implementation is a WTF from Hell.  If they're aware of that, and they just want you to crank out the old college try, then go with a simple solution:  read the value, do a Split() on CRLF, run your array manipulations, and Join() by CRLF to update.  Yeah, it has concurrency issues, but if there's only going to be one person using it, then resist the urge to over-engineer.  Think of bigger problems, like world peace, hunger, third world debt, and what's going to happen on the next episode of "American Idol".

    The other case may be that management is unaware of why the change they want would be difficult to do right.  Do some explaining, make sure they're fully aware of the caveats, and ask for a judgement call.  That's what they're there for - you're just supposed to make sure they get what they ask for and make sure they're clear on you not getting whipped for any poor risk decisions they make.


Log in to reply