Oh, how I despise hashcodes..



  • Continuing the discussion from There should be a "programmers license"...:

    @presidentsdaughter said:

    Please don't! Hashcodes. I'll write about them later...

    So, I already mentioned that our "embedded" system talks to outside world using SNMP protocol, and the convoluted ways of my predecessors.

    One of the things the system had to do was translate SNMP OIDs (quick summary: bunch of numbers meaning tables, atributes, and indexes, like this 1.3.6.1.1.0.2.1.3.4.5 -> table: 1.3.6.1.1.0.2.1, atribute/column 3, row/index 4.5), in device internal representation addresses.

    A bunch of rules exist, automatically extracted from appropriate definition files and they, while being very different from what I present below, mostly consist of the following info :

         Atribute:IndexFormat     DeviceType  Idx Memory Offsets         Ranges
    
         1.3.6.1.1.0.2.1.3:x.y    DeviceTypeA  1  addr1=x*1000 addr2=y-1 x:[1..3] y:[5, 6]
         1.3.6.1.1.0.2.1.3:x.y    DeviceTypeA  2  addr1=y*2000+x addr2=0 x:[4..8] y:[1..8]
         ....
    

    ###The Storage

    On system startup, or when DeviceTypeA becomes available, its related file is read and parsed (code which was another clusterfuck of its own), and the info is stored in two HashTables (which in turn are inside other HashTables...): oidToDeviceIdx and deviceIdxToOid

    • deviceIdxToOid stored the key (Idx) as Integer, and the value as a pair (Attribute:IndexFormat, [Formulas]
    • oidToDeviceIdx stored the key as Integer, and the value also a triple (Idx, [Formulas], [Ranges])

    I give you a couple seconds to process this and to wonder why I don't like hashcodes.

    Yes, oidToDeviceIdx key was:  "1.3.6.1.1.0.2.1.3:x.y".hashCode()
    ####Handling collisions
    We got that! We post-process the rules files, and add a extra tag so no repeated strings exist.

      1.3.6.1.1.0.2.1.3:x.y!S1    DeviceTypeA  1  addr1=x*1000 addr2=y-1 x:[1..3] y:[5, 6]
      1.3.6.1.1.0.2.1.3:x.y!S2    DeviceTypeA  2  addr1=y*2000+x addr2=0 x:[4..8] y:[1..8]
    

    But but, how then would you find anything?

      for(i=1; i < MAX_TAGS; i++) {
        String key = oid + "!S" + i;
        Object[] val = oidToDeviceIdx.get(key.hashCode());
        if(val != null) {
          return val;
        }
      }
      return null;
    

    And then, as the system grew a little, completely unrelated Attibute:IndexFormat entries were resulting in hash collisions also.

    Then the code became:

      String key = oid + "!S" + i + getExtraInfoFor(oid);
    

    Extra info meant the DeviceTypes loaded and related with this oid.

    #####TL;DR
    So-called developers use a string hashcode as the key, instead of the original object.
    I deleted the whole package, and started from scratch. Even boss man saw the mess we had and green-lighted me to do it.

    As the time goes by, I'm starting to have a more comprehensive view of the system, I start to understand @snoofle and his legendary Side Bar stories.. At least we don't have SQL databases.



  • Just because you're using hashes wrong doesn't mean hashes are bad.


  • Discourse touched me in a no-no place

    There's a lot of misunderstanding about hash codes, their calculation, etc. Even most of the people who write hash functions don't grok things well. You can actually get away with some really nasty tricks in many common cases (e.g., where you're only hashing between things in the same class) and get good performance.

    As long as you're using linked lists on the back end of your hash table (e.g., as happens in java.util.HashMap) then the only consequences of getting it wrong are a performance drag.

    Make sure you get equality right.


  • BINNED

    @presidentsdaughter said:

    Even boss man saw the mess we had and green-lighted me to do it.

    Obligatory TDWTF derailment: I can no longer read the phrase "boss man" without thinking of that TDWTF radio drama and want to give it another listen.

    You know what? I officially support @apapadimoulis to put ads on the site. I will not run any ad blockers, and I will click something at least once a day. All I want is an assurance that at least 1% of the proceedings will go toward making more of those.


  • BINNED

    @Onyx said:

    will click something at least once a day.

    All he has to do is to hide it under a hart icon or a 'Like it too' link and it will get some clicks when dropped in a certain topic.



  • I didn't see a deer, elk, moose or antelope icon, nothing even remotely close to a hart.


  • ♿ (Parody)

    @Onyx said:

    that TDWTF radio drama

    That was so fun, and turned out really well. I hope @Lorne_Kates is up for doing another one.



  • Provided he doesn't have to use Discourse, I suspect he'll probably be up for it.



  • java.util.HashMap actually fails over to a tree now, if possible, for overburdened buckets.

    Source: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8-b132/java/util/HashMap.java#1785


  • Discourse touched me in a no-no place

    Interesting, that's a change in Java 8 (which I've not upgraded to yet). It also makes the code much more complicated, and increases the overhead of the data structures hanging off the back of the hash table itself.



  • Sounds like your predecessors really made a hash of things.


Log in to reply