"It never did that before!"



  • So, three years ago I started on a WCF service, which is consumed by user-facing applications which are developed by other teams. Said service went into production a few months later, second version came out months after that, and since then it has only experienced minor updates. Until a few months ago.

    After a new version was deployed to the testing environments, one of the client app testers informed us of a "new" bug, where a list of items in the response was not coming back in a predictable order. We said, "the specs don't dictate the response should be ordered in any particular way. Use XPath to find the element you want." "But...but..but... It's always come back in the order we expect it to!"

    We relented, to avoid causing a big fuss over it, and modified the only part of the code we had touched for the new release so that it would order items (in that layer, at least) in a specified way. Turns out this didn't fix it.

    Today, I started looking at some of the other layers, which we didn't touch at all, and found the culprit... A piece of code I initially checked in on August 8, 2008. Before our first version went live.

    So... Is TRWTF that I wrote code that resulted in things being ordered unpredictably (it was threading-related), or that their application was probably broken for three years without them knowing it?



  • Definitely the latter.  Happens all the time here...

    "There's a problem with your service!"

    "What's the problem and how long has it been happening?"

    "Our logs show at least 5 months.../We don't know, we just started logging/We don't know, we don't have logs"



  • TRWTF is that the other team didn't put in a request to have the behavior clarified or made predictable.


  • Discourse touched me in a no-no place

    @ShatteredArm said:

    Is TRWTF that I wrote code that resulted in things being ordered unpredictably (it was threading-related), or that their application was probably broken for three years without them knowing it?
    Neither. It's this:
    @ShatteredArm said:
    the specs don't dictate the response should be ordered in any particular way
    both in that it wasn't explicitly stated (nor was there even a statement to the effect of "that which is not explicitly stated is undefined") and in that the other team relied on undocumented behaviour.





    I've had similar problems in the past, which have resulted in the following snippet in the specification I'm currently writing regarding the configuration of some new software I'll be writing soon:

    • It's not an error to have a different number of [names of things using tunnels] and tunnels, but to do so will result in undefined behaviour. Do not rely on perceived behaviour under these circumstances.
    • It's not an error to cause IP addresses or ports to overlap/occur more than once either within a pool or across a pool, but, again, it will result in undefined behaviour. Do not rely on perceived behaviour under these circumstances.
    • No assumptions should be made about the order of assignation between the tunnels and the [names of things using tunnels].


  • @ShatteredArm said:

    So... Is TRWTF that I wrote code that resulted in things being ordered unpredictably (it was threading-related), or that their application was probably broken for three years without them knowing it?
     

    For trivial stuff like this, you, as implementer, always add some sort of ordering, even if it's just A-Z or id desc, even if it's not explicitly in the spec. You are human, yes?

    For slightly less trivial stuff, you may decide to let the user determine sequence, even if it's not explicitly in the spec, as long as it makes sense.

    For nontrivial stuff, you let it slide.
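
    As an illustration of the "trivial" case, here is a minimal Java sketch, assuming the response is an in-memory list of items with an id and a name. The `Item` class and both method names are hypothetical and not taken from the service discussed in this thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class DefaultOrderingDemo {

    // Hypothetical response item; only the fields needed for sorting.
    static final class Item {
        final int id;
        final String name;

        Item(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Returns a copy sorted by id, highest first ("id desc").
    static List<Item> byIdDescending(List<Item> items) {
        List<Item> sorted = new ArrayList<>(items);
        sorted.sort(Comparator.comparingInt((Item i) -> i.id).reversed());
        return sorted;
    }

    // Returns a copy sorted by name ("A-Z").
    static List<Item> byName(List<Item> items) {
        List<Item> sorted = new ArrayList<>(items);
        sorted.sort(Comparator.comparing((Item i) -> i.name));
        return sorted;
    }

    public static void main(String[] args) {
        List<Item> items = List.of(new Item(2, "pear"), new Item(3, "apple"), new Item(1, "fig"));
        for (Item i : byIdDescending(items)) {
            System.out.println(i.id + " " + i.name);
        }
    }
}
```

    Both methods sort a copy, so the caller's list is left untouched.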

    I really, really, really wish our CMS' structured content section had active/inactive and item ordering by default. Now I basically keep adding it myself all the time, which is quite dull. This type of content is always user-editable, and for nearly everything, one can conceive a client wanting to deactivate an item, or put item X before item Y.

    It doesn't always make sense, though, but it would be awesome if our CMS supported it by default.



  • @dhromed said:

    For trivial stuff like this, you, as implementer, always add some sort of ordering, even if it's just A-Z or id desc, even if it's not explicitly in the spec. You are human, yes?
    SQL Server will tell you it's a lot faster when you do not use `order by`. In fact, in some situations `order by` can seriously slow down a query.


  • Fake News

    True, if you don't use ORDER BY, then SQL Server will give you records in clustered order. However, if you need to use ORDER BY, then you should make damn sure to index the ORDER BY fields, especially on tables with lots of rows. That's just fundamental database schema shit.



  • @dhromed said:

    For trivial stuff like this, you, as implementer, always add some sort of ordering, even if it's just A-Z or id desc, even if it's not explicitly in the spec.
     

    Oh, dear.  I guess I'm just too used to a class of industries where there are serious ramifications for adding features - even sensible ones - that aren't in a requirements specification. If you need a feature you better go add it to the requirements documents, then design documents, then you can put it into code.  Of course, in these industries if you don't have a requirements spec you're in trouble to start...lawyers would have a field day.

    This is similar to the philosophy, "Code cannot be requirements because code only tells you what it does, not what it's supposed to do."



  • @Dorus said:

    @dhromed said:

    For trivial stuff like this, you, as implementer, always add some sort of ordering, even if it's just A-Z or id desc, even if it's not explicitly in the spec. You are human, yes?
    SQL Server will tell you it's a lot faster when you do not use `order by`. In fact, in some situations `order by` can seriously slow down a query.

    In this case, the order was off because of multithreaded processing of the data. For all we know, he was using "order by" in the query that fetched the data.

    In other words, adding a specific order to this data could have been a significant amount of code, as he would have to wait for the multithreaded processing to complete, then re-order it from scratch without relying on the database engine to do it for him. I agree that in this case, you shouldn't bother unless it's in the specs.



  • @blakeyrat said:

    @Dorus said:

    @dhromed said:

    For trivial stuff like this, you, as implementer, always add some sort of ordering, even if it's just A-Z or id desc, even if it's not explicitly in the spec. You are human, yes?
    SQL Server will tell you it's a lot faster when you do not use `order by`. In fact, in some situations `order by` can seriously slow down a query.

    In this case, the order was off because of multithreaded processing of the data. For all we know, he was using "order by" in the query that fetched the data.

    In other words, adding a specific order to this data could have been a significant amount of code, as he would have to wait for the multithreaded processing to complete, then re-order it from scratch without relying on the database engine to do it for him. I agree that in this case, you shouldn't bother unless it's in the specs.

     

    Well, in this particular case, ordering the items was a trivial exercise, as it was just a list.  The threading was introduced three years ago because it originally had to go make an expensive web service call for each element, and there were constraints on total execution time, so the list was built asynchronously.  Later the web service call was removed, but not the asynchronous code.

    Perhaps the list should have been ordered after all the calls were complete, but it wasn't in the specs, and we figured the clients would actually interrogate the list to find the data they wanted, rather than assuming they were in a certain order.
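
    For what it's worth, keeping the asynchronous fan-out and still returning a stable order need not be much code. A minimal Java sketch follows. It is not the original WCF code: `processInOrder` and `slowLookup` are made-up names, and the sleep merely stands in for the expensive per-element call. The idea is to start one task per element, keep the futures in input order, and join them in that same order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

class OrderedFanOut {

    // Runs `work` on every input concurrently, then returns the results in
    // the same order as `inputs`, regardless of which task finished first.
    static <T, R> List<R> processInOrder(List<T> inputs, Function<T, R> work, ExecutorService pool) {
        List<CompletableFuture<R>> futures = new ArrayList<>();
        for (T input : inputs) {
            futures.add(CompletableFuture.supplyAsync(() -> work.apply(input), pool));
        }
        List<R> results = new ArrayList<>();
        for (CompletableFuture<R> future : futures) {
            results.add(future.join()); // waits until this element is ready
        }
        return results;
    }

    // Hypothetical stand-in for the expensive per-item call.
    static String slowLookup(int id) {
        try {
            Thread.sleep(50L - id * 10L); // later ids finish sooner
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "item-" + id;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Prints [item-1, item-2, item-3, item-4] even though item-4 finishes first.
            System.out.println(processInOrder(List.of(1, 2, 3, 4), OrderedFanOut::slowLookup, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

    The tasks still run concurrently; only the collection step walks the list in order, so the ordering itself adds no extra thread or queue.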



  •  Q: Did they specify what order this time, and add it to any design? Or do they just want "the random order we had for the last 3 years"?



  • @blakeyrat said:

    In this case, the order was off because of multithreaded processing of the data. For all we know, he was using "order by" in the query that fetched the data.

    In other words, adding a specific order to this data could have been a significant amount of code, as he would have to wait for the multithreaded processing to complete, then re-order it from scratch without relying on the database engine to do it for him. I agree that in this case, you shouldn't bother unless it's in the specs.

     

    100% agree.  It probably would have involved an additional OrderedQueue type of thing running on yet another thread.  Imposing order in concurrent tasks is non-trivial, and we've been in a concurrency-based world since the prior millennium.  Ergo, if the client needs ordering, it needs to be in the requirements.

     



  • @lolwhat said:

    True, if you don't use ORDER BY, then SQL Server will give you records in clustered order.

    Not true.

    If two sessions are executing the same query, SQL server can start to return the first query's progress to the second one if it makes sense to do so. 

    The fact you can never, ever assume order unless you specify it is as true now as it's always been.



  • @LoztInSpace said:

    @lolwhat said:

    True, if you don't use ORDER BY, then SQL Server will give you records in clustered order.

    Not true.

    If two sessions are executing the same query, SQL server can start to return the first query's progress to the second one if it makes sense to do so. 

    The fact you can never, ever assume order unless you specify it is as true now as it's always been.

    +1. I missed that from lolwhat. It would be better to say that if you don't use ORDER BY the order is undefined.

    However! Most SQL implementations still "accidentally" end up with the data ordered by insertion time, which tricks people into thinking that will happen every time and isn't just an accident. SQL Server should have an option where if you're querying small (< 400 records) tables, and no ORDER BY is specified, every 5th query it does ORDER BY NEWID() to randomize the order. Then idiots building database apps wouldn't assume, they'd look up the actual spec.



  • @blakeyrat said:

    However! Most SQL implementations still "accidentally" end up with the data ordered by insertion time, which tricks people into thinking that will happen every time and isn't just an accident. SQL Server should have an option where if you're querying small (< 400 records) tables, and no ORDER BY is specified, every 5th query it does ORDER BY NEWID() to randomize the order. Then idiots building database apps wouldn't assume, they'd look up the actual spec.

    You're assuming that a future developer might test something more than 4 times.  I'll test any edge cases I can think of, but no more than once each, before handing over to the user for "testing."

     

    Note: "testing" is equivalent to me saying "Please test it and get back to me" but they take this as "It's finished, don't bother me for 5 months until you've left the company and someone else actually clicks on it to discover that it's in the wrong order and calls me to ask for a new report to be created and gets 2 weeks into specifying the new report before saying 'Well actually we just need this report in a different order.'"



  • Also, outside the database, you have to just be aware of what kind of data structures you're populating. Just a few days ago I ran into an issue when I was stuffing data into a plain hashtable (Java's HashMap) instead of a hashtable backed by a linked list to guarantee that iteration order was the same as insertion order (Java's LinkedHashMap). I'm not sure why I didn't type the Linked prefix - probably out of habit/muscle memory.

    The thing was, during unit tests, the data sets were sufficiently small and boring that the iteration order was indeed coincidentally the same as insertion order. Whoops. At least it's an easy one to fix.
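
    A small Java sketch of the difference (the `iterationOrder` helper and the sample keys are made up for illustration). The same keys go into both map types, and each map reports the order in which it iterates them:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class IterationOrderDemo {

    // Inserts the keys in the given order and returns the order in which
    // the map hands them back during iteration.
    static List<String> iterationOrder(Map<String, Integer> map, List<String> keys) {
        for (int i = 0; i < keys.size(); i++) {
            map.put(keys.get(i), i);
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        List<String> keys = List.of("zebra", "apple", "mango", "kiwi", "banana");
        // LinkedHashMap: iteration follows insertion order.
        System.out.println(iterationOrder(new LinkedHashMap<>(), keys));
        // HashMap: whatever order the hash buckets produce; it may or may
        // not match insertion order, and nothing in its contract says it will.
        System.out.println(iterationOrder(new HashMap<>(), keys));
    }
}
```

    With a small data set the HashMap line can happen to match the insertion order, which is exactly how a unit test ends up passing by coincidence.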



  •  If you use the LinkedHashMap only to make your test cases happy, something seems wrong. I've used mocking before to prevent these problems (after all, you want to test your class; we trust HashMap to work correctly in Java).

     I might be nitpicking, but I do not accept code changes for the sake of adding test cases, unless my code actually improves from these changes (which it does 90% of the time; for the other 10%, PowerMock to the rescue).


  • ♿ (Parody)

    @Dorus said:

     If you use the LinkedHashMap only to make your test cases happy, something seems wrong. I've used mocking before to prevent these problems (after all, you want to test your class; we trust HashMap to work correctly in Java).

     I might be nitpicking, but I do not accept code changes for the sake of adding test cases, unless my code actually improves from these changes (which it does 90% of the time; for the other 10%, PowerMock to the rescue).


    Did you read what he wrote? The test cases were passing with an incorrect implementation (i.e., the test cases were not sufficient).



  • I don't read, I write. Besides, he wrote that his test cases did not test his (incorrect) implementation thoroughly enough for the difference between HashMap and LinkedHashMap to show up. In this case he couldn't trust HashMap to work correctly in Java. Oops.



  • @Dorus said:

    I don't read, I write. Besides, he wrote that his test cases did not test his (incorrect) implementation thoroughly enough for the difference between HashMap and LinkedHashMap to show up. In this case he couldn't trust HashMap to work correctly in Java. Oops.


    ???

    The function I was working with takes some Collection of ObjectXs as input, does some stepwise processes, and finally outputs a Collection of ObjectYs for each ObjectX that was correctly processed. For reasons you'll just have to trust me on, a plain old list of Pairs wouldn't be a good idea because the intermediate steps occasionally needed to search through the Collection for effectively random entries to decide on its ultimate behavior. Rather than linear searches, I just used a hashtable instead, because that's what they're there for. Further, the intermediate steps needed to be well modular and de-coupled, so rather than doing all the searches up front and streaming one object at a time, a step would work on the group, pass the group to the next step, and so on, since upstream wouldn't know what downstream needs. Now, the order of the input needed to be the same as the order of the output for presentational purposes. If I wanted to use a hashtable for constant-time searches and still keep that order, I'd have to use the LinkedHashMap implementation. It's not really a big deal, I was just musing on a similar aspect as the OP but outside of the database scene. Are you okay with reading my words yet?



  •  Reading.. reading. I'm reading now. I'm working on understanding. Give me some time on that.

     

    Now I'm wondering why there is no interface in Java that guarantees iteration order by insertion order. Would be nice to be able to write <? extends InsertionSortedIterable> to prevent these problems by default. Seems crazy you can make a function misbehave if you enter a HashMap into it instead of a LinkedHashMap. I know there is a SortedSet, but that one is based on the set sorting the data instead of keeping the insertion order.

     

    See, i thought about it.


  • ♿ (Parody)

    @Dorus said:

    Seems crazy you can make a function misbehave if you enter a hashmap into it instead of a linkedhashmap.

    Why does this seem crazy? He was assuming the behavior of one thing, but used another. You might as well say, "Seems crazy you can make a function misbehave if you use addition instead of subtraction."



  •  I'm talking about Java here, not his function. I'm missing an OrderedMap or something like that. Some way to enforce in the contract of a method that you expect a map with an order.

    I just realized there are ordered collections, namely List<E>. ListMap<K,V>, however, does not exist.

     Edit: Looks like I wasn't the first one who thought of an OrderedMap; I wonder why the IterableMap didn't make it into Java 1.6.


  • ♿ (Parody)

    @Dorus said:


    I'm talking about Java here, not his function. I'm missing an OrderedMap or something like that. Some way to enforce in the contract of a method that you expect a map with an order.

    I just realized there are ordered collections, namely List<E>. ListMap<K,V>, however, does not exist.

    Then you're still not reading very well. @TFM said:
    Hash table and linked list implementation of the Map interface, with predictable iteration order.
    And your writing is pretty bad, too.


  •  LinkedHashMap is an implementation of a Map<K,V>.

    Some map implementations, like the TreeMap class, make specific guarantees as to their order; others, like the HashMap class, do not.

    I know this specific implementation does guarantee the order. But there is no Map interface that enforces this. There is one that enforces some iteration, there is one that enforces a sorted iteration, there is none that enforces a stable iterator.
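
    Lacking such an interface, one workaround is to put the concrete class in the method signature. A minimal Java sketch, where `render` and the sample fields are hypothetical: declaring the parameter as LinkedHashMap (insertion-ordered by default) rather than Map makes the compiler reject a plain HashMap. Later Java releases (21 onward) added a SequencedMap interface, which promises a well-defined encounter order, though not specifically insertion order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OrderedContractDemo {

    // The parameter type is the concrete LinkedHashMap, so callers cannot
    // pass a HashMap; the cost is tying them to one implementation class.
    static List<String> render(LinkedHashMap<String, String> fields) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            lines.add(e.getKey() + "=" + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "widget");
        fields.put("colour", "red");
        fields.put("size", "large");
        System.out.println(render(fields)); // [name=widget, colour=red, size=large]
        // render(new java.util.HashMap<String, String>()); // would not compile
    }
}
```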

     


  • ♿ (Parody)

    @Dorus said:

    I know this specific implementation does guarantee the order. But there is no Map interface that enforces this. There is one that enforces some iteration, there is one that enforces a sorted iteration, there is none that enforces a stable iterator.

    Is that your final answer? That it's crazy that there's no interface for this?



  • Yeah, I think I'm going to stick to that as my final answer.

