Magic Number Hell



  • So, this app I'm maintaining does some analysis on input data and, as part of this analysis, returns condition flags regarding whether or not some part of the analysis triggers a given condition.

    When the original developers started out in this, they did it in a bitfield style (they were C programmers doing Java development) such that the first condition was bit 1, the second condition was bit 2, and so on. Bit-wise logic easily handled whether a given condition flag was set.
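    For anyone who hasn't worked in this style, here is a minimal sketch of the bitfield idea in Java. The class name, constant names, and helper methods are made up for illustration; only the "condition N is bit N" layout comes from the real system.

```java
// Hypothetical names; only the bit layout (condition N = bit N) is from the post.
final class BitFlags {
    static final int CONDITION1 = 1 << 0; // 0x01
    static final int CONDITION2 = 1 << 1; // 0x02
    static final int CONDITION3 = 1 << 2; // 0x04

    private BitFlags() {}

    /** Returns true when every bit of flag is set in value. */
    static boolean isSet(int value, int flag) {
        return (value & flag) == flag;
    }

    /** Returns value with the bits of flag turned on. */
    static int set(int value, int flag) {
        return value | flag;
    }

    public static void main(String[] args) {
        int result = set(set(0, CONDITION1), CONDITION3); // binary 101
        System.out.println(isSet(result, CONDITION1)); // true
        System.out.println(isSet(result, CONDITION2)); // false
    }
}
```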

    Somewhere along the line, additional conditions were added. Whoever added them apparently either didn't understand the bitfield implementation or liked prettier numbers, and decided to extend the functionality by simply adding a larger decimal number to the existing bits. So, for example, if the first flag and one of the new flags were both set, the decimal value would be 17001 ... each new flag bumped that decimal number up.

    Confounding this further, they decided to make certain magic numbers mean something NOT related to the bits in the condition flags. For example, 16386 means no condition flags were set and 16384 means bad data was found (although, since this is a power of two, it may be a condition flag itself.) Oh, and for good measure, this combined conditional flag value was stored in the analysis results table ... along with a string/varchar version of what the code means, looking like any of the following:

    Condition1/Condition2/Condition3
    Condition1_AND_Condition2_AND_Condition3
    Condition1+Condition2+Condition3
    Condition1_OR_Condition2_OR_Condition3

    Since there were so many possible combinations, someone built a spreadsheet to keep track of all this mess, enumerating all possible values and their corresponding 'user-readable' values. Which means that we now have the definition for the values existing in TWO places -- in code and in this manually built spreadsheet.

    Recently, the users of this system gave us a requirement to be able to search by individual condition flags instead of using the combined conditional flag magic numbers to look stuff up -- it's hard to find all possible combinations of magic numbers where the first condition is turned on.

    The lead developer's solution? Break out all the possible combinations of the combined conditional flags into their separate parts and list them in a table in the database. For example, if we have:

      int Condition1 = 0x01;
      int Condition2 = 0x02;
      int Condition3 = 0x04;
      .
      .
      .
      int Condition24 = 17000;
      int Condition25 = 18000;
      .
      .
      .

    the corresponding table would look like:

    CFLAG  CNAME
    1      Condition1
    2      Condition2
    3      Condition1
    3      Condition2
    4      Condition3
    5      Condition1
    5      Condition3
    6      Condition2
    6      Condition3
    7      Condition1
    7      Condition2
    7      Condition3
    .
    .
    .
    17000  Condition24
    17001  Condition1
    17001  Condition24
    .
    .
    .

    Reasoning? Now the values are more data-oriented ... the system is now more flexible. The query GUI now has a drop-down chooser for the unique values of CNAME in the table above. The original query can now be joined against this table, using the choice from the dropdown, to look up the analysis records whose CFLAG contains that CNAME.
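    To show how mechanical (and how large) this table is, here is a rough sketch that generates the rows for the three bit-style flags only. The class and method names are invented for illustration, and the decimal-offset flags (17000 and up) are left out because their exact encoding isn't spelled out here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; reproduces only the CFLAG 1..7 rows of the table.
final class ExplodedTable {
    // Names in bit order: index 0 is bit 0x01, index 1 is bit 0x02, ...
    static final String[] NAMES = {"Condition1", "Condition2", "Condition3"};

    private ExplodedTable() {}

    /** One "CFLAG CNAME" row per set bit, for every non-zero combination. */
    static List<String> rows() {
        List<String> out = new ArrayList<>();
        int combos = 1 << NAMES.length; // 2^n combined values
        for (int cflag = 1; cflag < combos; cflag++) {
            for (int bit = 0; bit < NAMES.length; bit++) {
                if ((cflag & (1 << bit)) != 0) {
                    out.add(cflag + " " + NAMES[bit]);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        rows().forEach(System.out::println);
    }
}
```

    Three flags give 12 rows. In general n bit flags give n * 2^(n-1) rows, whereas a bit-wise test needs no lookup table at all.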

    So, now we have THREE places where the values for these condition flags are defined: code, spreadsheet, and database.

    And they're all authoritative ... except when they aren't ... ultimately, code wins because the combined conditional flag values are, of course, hard-coded in the analysis code.

    That's a heckuva WTF, you say? That's only the beginning...

    I've been tasked with fixing it (going back to the bit-wise representation of individual condition flags) and unscrewing this system.

    I've already decided to use an Enum and Sets of that Enum's values to determine the value of the condition flags.
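    Roughly what I have in mind, as a sketch rather than the final design. The constant names and the fromBits/toBits helpers are placeholders, and only three flags are shown. Each constant carries its bit explicitly so that reordering the enum can't silently change stored values.

```java
import java.util.EnumSet;
import java.util.Set;

// Placeholder names; each constant carries its own bit value.
enum ConditionFlag {
    CONDITION1(0x01),
    CONDITION2(0x02),
    CONDITION3(0x04);

    private final int bit;

    ConditionFlag(int bit) {
        this.bit = bit;
    }

    int bit() {
        return bit;
    }

    /** Decodes a pure bitmask into the set of flags whose bits are on. */
    static EnumSet<ConditionFlag> fromBits(int bits) {
        EnumSet<ConditionFlag> flags = EnumSet.noneOf(ConditionFlag.class);
        for (ConditionFlag f : values()) {
            if ((bits & f.bit) != 0) {
                flags.add(f);
            }
        }
        return flags;
    }

    /** Encodes a set of flags back into a single bitmask. */
    static int toBits(Set<ConditionFlag> flags) {
        int bits = 0;
        for (ConditionFlag f : flags) {
            bits |= f.bit;
        }
        return bits;
    }
}
```

    With this, "is Condition1 set?" becomes flags.contains(ConditionFlag.CONDITION1), and the stored number is just toBits(flags).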

    The hard part is converting every analysis tool from their old number (some of the analysis tools are written in C and called from Java in a 'command-line' like 'exec' environment) and wrapping them with this new number.

    How would you write an 'adapter' or converter to convert these magic numbers from 'old style' to 'new style'? A huge mapping table in code? A config file with the mapped values to be read in at runtime?
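    To make the "mapped values read in at runtime" option concrete, here is a sketch of what such an adapter might look like. Everything in it is an assumption: the class name, the three-flag enum, and the "old number,Name1/Name2" line layout are invented, not taken from the real spreadsheet. It refuses unknown numbers and rejects a number that shows up twice with different meanings.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical adapter; the line layout "<old number>,<names joined by '/'>" is assumed.
final class LegacyFlagAdapter {
    enum Flag { CONDITION1, CONDITION2, CONDITION24 } // illustrative subset only

    private final Map<Integer, EnumSet<Flag>> table = new HashMap<>();

    LegacyFlagAdapter(List<String> csvLines) {
        for (String line : csvLines) {
            String[] cols = line.split(",", -1); // -1 keeps an empty names column
            if (cols.length != 2) {
                throw new IllegalArgumentException("Bad line: " + line);
            }
            int oldValue = Integer.parseInt(cols[0].trim());
            EnumSet<Flag> flags = EnumSet.noneOf(Flag.class);
            for (String name : cols[1].split("/")) {
                if (!name.trim().isEmpty()) {
                    flags.add(Flag.valueOf(name.trim()));
                }
            }
            // The same number listed twice must mean the same set of flags.
            EnumSet<Flag> previous = table.put(oldValue, flags);
            if (previous != null && !previous.equals(flags)) {
                throw new IllegalStateException("Conflicting entries for " + oldValue);
            }
        }
    }

    /** Converts an old magic number, failing loudly on anything unmapped. */
    EnumSet<Flag> convert(int oldValue) {
        EnumSet<Flag> flags = table.get(oldValue);
        if (flags == null) {
            throw new IllegalArgumentException("Unknown legacy value: " + oldValue);
        }
        return EnumSet.copyOf(flags); // defensive copy
    }
}
```

    A line such as "16386," (empty names column) would map the "no condition flags were set" value to an empty set. The same Map could just as well be filled from a hard-coded table; the lookup side doesn't change.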


  • Discourse touched me in a no-no place

    @zelmak said:

    How would you write an 'adapter' or converter to convert these magic numbers
    from 'old style' to 'new style'?

    Personally? I wouldn't. Break every dependent tool in one go and see what/who complains about the result of GIGO. This smells of 'depending on undefined behaviour.'


    Of course, you're not to be granted that option.




    Under the current 'system' is it possible for any single number to have two representations? (i.e. instead of the one-one relationship that there should be, is there at least one one-many/many-one?)



  • What is this I don't even



  • Has anyone started calling these "tragic numbers" instead of "magic numbers" yet?

    My favorite broken variant is when people take the traditional admonition against magic numbers and do stuff like

     

    /* initialize some values */
    float value = F_ZERO;
    

    Like they couldn't get enough of the SNES game or something. Makes me want to cringe.



  • @PJH said:

    @zelmak said:
    How would you write an 'adapter' or converter to convert these magic numbers from 'old style' to 'new style'?
    Personally? I wouldn't. Break every dependent tool in one go and see what/who complains about the result of GIGO. This smells of 'depending on undefined behaviour.'

    Of course, you're not to be granted that option.


    Under the current 'system' is it possible for any single number to have two representations? (i.e. instead of the one-one relationship that there should be, is there at least one one-many/many-one?)

    The spreadsheet had something like 4,500 entries. I converted it to a .csv so I could read it more directly. When I read it, I get 1050 unique magic numbers. I'm not sure how dangerous it is to ignore the string/varchar version of the magic though (even though I've normalized it):

    public class ConditionFlags implements Set<ConditionFlag> {

    .
    .
    .

    public static ConditionFlags fromString(String oldName) {
        ConditionFlags f = new ConditionFlags();

        // Collapse the four separator styles to "/". String.replace() is
        // literal; replaceAll("+", "/") throws PatternSyntaxException, and a
        // bare "AND"/"OR" would also match inside a condition name.
        String normalized = oldName.toUpperCase()
                .replace("_AND_", "/")
                .replace("_OR_", "/")
                .replace("+", "/");

        // Split on runs of "/" and skip any empty tokens.
        for (String s : normalized.split("/+")) {
            if (!s.isEmpty()) {
                // Assumes the enum constants are upper-case, e.g. CONDITION1.
                f.add(ConditionFlag.valueOf(s));
            }
        }
        return f;
    }

    }

    Note that the spreadsheet has entries like Condition1/Condition2 and Condition2_AND_Condition1 ... that is, the order in which they appear isn't standardized.
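    One way to take the ordering out of the picture, sketched with invented names (NameNormalizer, parts, canonical): reduce each string to a sorted set of names, so that two spellings of the same combination come out identical. It assumes the only separators are the four styles seen in the table (/, _AND_, +, _OR_).

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for comparing spreadsheet names regardless of order.
final class NameNormalizer {
    private NameNormalizer() {}

    /** Splits an old-style name into its upper-cased parts, sorted and de-duplicated. */
    static Set<String> parts(String oldName) {
        String s = oldName.toUpperCase()
                .replace("_AND_", "/")
                .replace("_OR_", "/")
                .replace("+", "/");
        Set<String> out = new TreeSet<>();
        for (String p : s.split("/")) {
            if (!p.isEmpty()) {
                out.add(p);
            }
        }
        return out;
    }

    /** Canonical spelling: sorted parts joined with "/". */
    static String canonical(String oldName) {
        return String.join("/", parts(oldName));
    }

    public static void main(String[] args) {
        System.out.println(canonical("Condition2_AND_Condition1")); // CONDITION1/CONDITION2
        System.out.println(canonical("Condition1/Condition2"));     // CONDITION1/CONDITION2
    }
}
```

    Grouping the spreadsheet rows by magic number and comparing canonical names would flag any number that really does carry two different meanings.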

    Yay.

    I don't want to update all the codez ... I just want to write a wrapper that intercepts the input from the analyzer tools and converts it to the new number. If further tools REQUIRE the old number, then hand that off to them ... somehow ...

    Feels like an all-or-nothing thing to me ...



  • Can you post the different conditions that exist and their values?



  • @gurqnvyljgs said:

    Can you post the different conditions that exist and their values?

    All 17,000 of them! (Make sure you anonymize every single one.)


  • Discourse touched me in a no-no place

    @zelmak said:

    I don't want to update all the codez

    But someone has to sometime down the line...

    @zelmak said:

    I just want to write a wrapper that intercepts the input from the analyzer
    tools and converts it to the new number

    We're back to either a quick sharp shock and change the backend and see what complains then fix them sharpish, or maintaining a filter that is yet another piece of software that will, itself, need maintaining, thus 10* years down the line old software is still using the old numbers.



  • @PJH said:

    ... or maintaining a filter that is yet another piece of software that will, itself, need maintaining, thus 10* years down the line old software is still using the old numbers....
     

    And would effectively be a FOURTH place where the meanings of flags are defined. And will, inevitably, be tweaked to handle various special cases.

    And thus would zelmak join the line of anonymous developers cursed by their successors for creating the whole mess.



  • Oops ... I forgot about the OTHER place the magic numbers are defined ... in C header files for the older analysis code.

