Nulls and auto-incrementing IDs are for pansies



  • I was recently asked to link the application I am writing with a backend order processing system which was written before I joined. This order processing system is actually a series of .net executables executed from the start-bar (Services?! Get out) and one .net website that share a database.

    The part of the order processing system that I care about is an exe that reads in XML files from a directory and populated a database table or two with the information. How hard could that be?

    Firstly, when this system was written it was only processing XML files from one other external system that had already been implemented. The spec (written by a non-techy) basically boiled down to "Here is the XML file, stick that in this database". Oh yes, the development was oursourced.

    XML

    The first WTF is the XML file that is produced by the first system. It is huge, full of cruft that isn't needed, uses attributes for everything instead of elements and has lots of duplication.  A quick representative snipped sample from the XML file should suffice

    <order id="1" order_id="1" orderid="1" orderID="1">

    ..children elements (Each full of attributes)..

    </order>

    Yes, the order element has 4 different ID fields (among the 30 or so other attributes) and this happens all over the place along with inconsistent naming convensions (CamelCase and snake_case and Ihaventacluecase)

    Database NULLs

    Now, onto the database. The order table has about 130 columns, one for each XML attribute in the order and some of the children elements (Shipping address, Billing address etc.)..

    However in the original spec the author forgot to mention which fields in the XML file should be mandatory in the database. So instead of the outsourced developers asking for clarification they just made every single column NOT NULL.

    However they must have seemed to realise that having everything NOT NULL when the example XML file was full of empty attributes might be a problem but they found a great solution! I am not a .net developer but here is the pseudocode

    order = New OrderRecord

    order.foo = nullToBlank( xml_file.order.element.element.foo, 20 )

    order.bar = nullToBlank( xml_file.order.element2.element2.bar, 10 )

    Repeat that 130 times (Just for the order table, there are also 2 other tables with about 100 columns between them)

    order.save

    What this code does is look for the attribute in the XML file. If it is null then the the order attribute value is set to X blank spaces (the maximum length of the field) which is specified as the second parameter in nullToBlank. What about numbers?

    Don't worry, every field is a string. Oh yes, if the XML attribute cannot be found (Because, say, I wrote a small XML file with only the needed info) then the import silently fails. This meant I had to copy the huge crufty XML file full of junk and interpolate what I needed to.

    Database Key

    Now, I am not a database expert but I honestly cannot fathom the thought processes behind the primary key for the order table.

    The order table has a composite primary key, the first part of the key is the order ID number from the XML file. The second is...well

    There is a table X001 (Good name) that has two columns. One is today's date as a text field (YYYYMMDD) and another is a text field with a single number in it. Again I need to resort to pseudocode for this. When an order record is created the application does the following.

    key = X001.find(date == today.format("YYYYMMDD"))

    if key

      key.counter += 1

      composite_part_2 = date + counter

    else

      create record with todays date and set counter to 0

      composite_part_2 = date + counter

    end

    So the second part of the composite key is in the format YYYYMMDDX where X is a normal string field incremented by the program and if the record is not found by the (non-pk) date string, create one record and set the counter to 0. Race conditions anyone?

    I honestly cannot see how they thought that was a better idea than a simple auto-incrementing primary key

    Like I said, all this is .net and I am not a .net dev and all this is from (painful) memory so please forgive the hideous pseudocode

     

     

     



  •  @mrgaijin said:

    Like I said, all this is .net and I am not a .net dev

     That's okay, neither was the guy who wrote this in the first place.



  • @mrgaijin said:

    I honestly cannot see how they thought that was a better idea than a simple auto-incrementing primary key
     

    Well, they didn't think that, of course. They didn't know you could have auto-incrementing primary keys.

    Which means they need to be fired.



  •  Good, Cheap, Fast: pick  two.

    I'd have liked to wager a sizable quantity of monies pertaining to which ones they picked, but I doubt there's anyone willing to bet against me.



  • Wow, that's a steaming pile of fail. I wonder if many of these offshore "development" firms take pride in writing code as horribly as possible as a practical joke, because they know that cheap-ass companies will still outsource to them regardless of the quality of their output.

    @mrgaijin said:

    Yes, the order element has 4 different ID fields (among the 30 or so other attributes) and this happens all over the place along with inconsistent naming convensions (CamelCase and snake_case and Ihaventacluecase)

    This is #1 on my list of WTFs that make me feel homicidal. I think I've mentioned in some other thread that the XML for our internal application process contains such gems as <BankRuptcyfiled /> and <RelationShip />. In addition to random capitalization, we have random abbreviations. Whether these are meant to save keystrokes or disk space I can't say. We have things like <AlasCrbFsh /> for "Are you an Alaskan crab fisherman?" and <ChebiWps /> for "Do you work with chemical or biological weapons?". I love it.



  • It's more fun to solve problems that were solved 30 years ago than to solve new ones.

    If a project will take you 3 weeks of design, 4 weeks of development, and 3 weeks of testing and you outsource the development, it will take 6 weeks of design, 1 week of development, and 6 weeks of testing.  See, you save 75% of schedule and 90% of cost in the development phase.  Your design also better look like pseudocode, or you'll spend all of your time explaining the design to the developer.

    Also, intelligent keys aren't.



  • @dhromed said:

    Filed under: HINT: they didn't pick "good"

    I'm not sure they picked "fast", either.



  • @toth said:

    @dhromed said:
    Filed under: HINT: they didn't pick "good"

    I'm not sure they picked "fast", either.

    And after they've fixed all the bugs, it won't have been cheap.



  • @dcardani said:

    @toth said:
    @dhromed said:
    Filed under: HINT: they didn't pick "good"
    I'm not sure they picked "fast", either.
    And after they've fixed all the bugs, it won't have been cheap.
    The only thing left:  "two", which is what the spec told them to pick.


  • BINNED

    @mrgaijin said:

    uses attributes for everything instead of elements


    Mildly OT: I'm no XML expert but what's the problem with that?

    <order id="1"/>
    looks more efficient to me than:
    <order>
    <id>
    1
    </id>
    </order>



  • Designing XML for efficiency is like choosing a color for a machine gun to optimize friendlyness.

    Elements are preferred because they allow a lot of room for future growth without breaking existing functionality.  For example, if you wanted to add what system an order id comes from, and the order id were an element, you simply add an attribute or child element to it.  If order id is already an attribute, then it is hard to add the data in a sane way without creating incompatibility with existing code that consumes the XML.



  • @Jaime said:

    Designing XML for efficiency is like choosing a color for a machine gun to optimize friendlyness.
     

    Hrm, not bad, but it could be a bit more snappy, sort of like "...is like fucking for virginity".

    Is like designing machine guns for colors?



  • @b_redeker said:

    @Jaime said:

    Designing XML for efficiency is like choosing a color for a machine gun to optimize friendlyness.
     

    Hrm, not bad, but it could be a bit more snappy, sort of like "...is like fucking for virginity".

    Is like designing machine guns for colors?

    ... is like picking the friendliest colour for your machine gun.


  • @Jaime said:

    choosing a color for a machine gun to optimize friendlyness.
     

    Army of 2!


Log in to reply