Configuration files are hard



  • Our group's most senior engineer is the smartest guy in the company. Unfortunately, that doesn't mean he can write good code. This is how his application reads its configuration file:

    /* Must re-load config since setup could have changed */
    config = fopen (filename, "r");     /* Check for config file */
    if (config == NULL)
    {
        printf ("\nConfiguration file not found, exiting...\n");
        getchar();
        exit (0);
    }
    fscanf (config, "%s %d %d %d %d %x %x %d %d %d %d %d %s %d %d %d %d %d %d %d %d %x %d %x %x %x %d %d %d",port,&baud,&footype,&ftype,&retrycount,&mask,&rfl,&xyzsize,xyztype,&passcount,&verbo,&portnum,station,&keyenab,&abc_delay,&footype2,&ftype2,&prdydelay,&ppagedelay,&rst_tim,&spdabc,&(unsigned short) lptport,&silence, &abc_read, &resetable, &customid, &obit, &disptype, &hardtype);
    port[5] = '\0';
    fclose (config);
    

    First, I'd like to point out the variables footype and ftype. The 'f' in ftype stands for 'foo'. No one who has ever dealt with this code knows the difference between footype and ftype. Same goes for footype2 and ftype2.

    Since the config file is one long text string of unnamed values, it is too easy to screw up if you try to edit it by hand. So the engineer wrote another program to make it easier to change the values. You can probably guess that there's a massive corresponding fprintf in there.

    His main application actually runs in multiple processes simultaneously (basically he launches his exe multiple times). So you have multiple processes that need this configuration, and a separate process that writes the configuration while the other processes are still running. This has never worked very well: processes running with different configurations, inconsistent configurations, race conditions, etc.

    The engineer's "solution" to this problem was to re-read the configuration file every time he needs any value from the file, hence the comment "Must re-load config since setup could have changed". This block of code was copy-pasted almost two dozen times throughout his code. Need to add a new setting to the configuration file? Search-and-replace every instance of the massive fscanf and hope you don't miss one. At least all of his code is in one file (several thousand lines long).

    The one time I got involved in this mess, I suggested to the engineer that he should at least put all of this in a function so he wouldn't have to search-and-replace. He thought that was a good idea, but he didn't really get what I was saying. Now this block of code only appears 7 times.


  • I survived the hour long Uno hand

    Configuration files are Software Engineering is hard

    FTFY

    (also, holy fuck shit.)



  • Count the wtfs...

    I don't like sscanf. I prefer using our strsplit() helper function or the SkipSpaces/SkipNonSpaces macros when dealing with whitespace-delimited files.



  • At least he didn't use strtok.



  • strtok() is best avoided, as using strtok_r() is little extra effort. I do prefer the strsep() semantics, but that doesn't skip multiple adjacent separators which can be nice sometimes.

    These, of course, have the additional downside of writing to your input buffer so you may need to make a copy (like if you need to show the whole line in an error message later).

    So many ways to solve the same problem.



  • I've always thought it was kind of annoying that there are so many similar-but-slightly-different formats for configuration files. I didn't expect to read about someone who has never seen any of them.


  • Discourse touched me in a no-no place

    @NedFodder said:

    ```c
    fscanf (config, "%s %d %d %d %d %x %x %d %d %d %d %d %s %d %d %d %d %d %d %d %d %x %d %x %x %x %d %d %d",port,&baud,&footype,&ftype,&retrycount,&mask,&rfl,&xyzsize,xyztype,&passcount,&verbo,&portnum,station,&keyenab,&abc_delay,&footype2,&ftype2,&prdydelay,&ppagedelay,&rst_tim,&spdabc,&(unsigned short) lptport,&silence, &abc_read, &resetable, &customid, &obit, &disptype, &hardtype);

    
    http://cdn.meme.am/instances/500x/43316944.jpg
    
    It's not even checking the result of the `fscanf`…


  • I'd rather see him using the should-not-develop-new-things-on-it-now GetPrivateProfile* APIs.

    Btw, seems these * PrivateProfile * APIs are in interesting state. The MSDN documentation just say "These functions are provided only for compatibility with 16-bit versions of Windows. New applications should use the registry." but the functions themselves are neither in deprecated or obsoluted status.

    Btw again, why don't he write code that'll post a "config reload" application message to the child processes to signal the configuration change?



  • @NedFodder said:

    Search-and-replace every instance of the massive fscanf and hope you don't miss one.
    ....

        "%s %d %d %d %d %x %x %d %d %d %d %d %s %d %d %d %d %d %d %d %d %x %d %x %x %x %d %d %d"
    ```</blockquote>
    
    At least it's an easy string to search for...


  • @NedFodder said:

    &(unsigned short) lptport

    I'm struggling to figure out how this compiles.



  • @jnz said:

    @NedFodder said:
    &(unsigned short) lptport

    I'm struggling to figure out how this compiles.

    Maybe it is just a clever way to extract the lower 16 bits of lptport?

    Edit: I had to ponder a few seconds about whether it is more probable that the code is running on an architecture with little endian or big endian integer representation.



  • @Watson said:

    @NedFodder said:
    Search-and-replace every instance of the massive fscanf and hope you don't miss one.
    ....

        "%s %d %d %d %d %x %x %d %d %d %d %d %s %d %d %d %d %d %d %d %d %x %d %x %x %x %d %d %d"
    ```</blockquote>
    
    At least it's an easy string to search for...</blockquote>
    
    If only the first 25 or so settings are needed, why parse the entire file? 
    
    

    fscanf (config, "%s %d %d %d %d %x %x %d %d %d %d %d %s %d %d %d %d %d %d %d %d %x %d %x %x",port,&baud,&footype,&ftype,&retrycount,&mask,&rfl,&xyzsize,xyztype,&passcount,&verbo,&portnum,station,&keyenab,&abc_delay,&footype2,&ftype2,&prdydelay,&ppagedelay,&rst_tim,&spdabc,&(unsigned short) lptport,&silence, &abc_read, &resetable);

    is considerably shorter, after all.
    
    @Dragnslcr <a href="/t/via-quote/54562/6">said</a>:<blockquote>I've always thought it was kind of annoying that there are so many similar-but-slightly-different formats for configuration files.</blockquote>
    
    Right. Everything should be comma-separated XMSON.


  • @jnz said:

    @NedFodder said:
    &(unsigned short) lptport

    I'm struggling to figure out how this compiles.

    Huh, beats me. It's defined as unsigned int. The engineer is using Visual C++ 6 (that was the latest version when this project started). I'm not a compiler expert, but I'm guessing he's in UB territory and just getting lucky.



  • @PWolff said:

    @jnz said:
    @NedFodder said:
    &(unsigned short) lptport

    I'm struggling to figure out how this compiles.

    Maybe it is just a clever way to extract the lower 16 bits of lptport?

    Still curious, so I did a quick search: he casts it to unsigned short everywhere it's used, so he only ever uses the lower 16 bits. So why is it declared unsigned int?



  • I think that violates the strict aliasing rule and is therefore undefined behavior (assuming it's compiled as C++ and not C, dunno what the C rule is).

    Why is it declared as unsigned int? Probably dumbs, maybe an alignment thing?



  • @jmp said:

    assuming it's compiled as C++ and not C

    I'll check when I get back to work tomorrow, but I think it's compiled as C++.

    @jmp said:

    Probably dumbs

    Yes.

    @jmp said:

    maybe an alignment thing?

    No.

    It's just used as the port number to _inp() and _outp(). No reason for any alignment shenanigans.



  • @cheong said:

    Btw again, why don't he write code that'll post a "config reload" application message to the child processes to signal the configuration change?

    He has difficulty with the concept of functions, and you want him to do inter-process communication? Also, they're not child processes:

    @NedFodder said:

    His main application actually runs in multiple processes simultaneously (basically he launches his exe multiple times).

    Imagine him double-clicking the exe as many times as required.



  • @dkf said:

    It's not even checking the result of the fscanf…

    Eh, this is C programming. What could go wrong?



  • @jnz said:

    I'm struggling to figure out how this compiles.

    When you're making soup, you want to add a few dashes of salt and herbs. Casting in C is much the same, add a few here and there to impart flavour, and it makes the code much more memorable for maintainers.



  • Reminds me of my first Perl project in the industry. That was for a company that built PowerPC accelerator cards for Amigas. To bring the system up and be able to load a decent 68k-yo-PPC JIT emulator, they needed something that worked ffrom 256k of flash, never mind the speed. One of the hardware engineers had started to write a program that was supposed to generate the PPC code, in C. He was a brilliant guy for hardware but his code was a wasteland of printfs
    printf("mov.l r28, r%d\n", src1);
    printf("addeo. r28, r%d, r%d\n", src2, src2);
    printf("dibfswcr[0]. 0,1,-1,%s\n", x);
    printf("mov.l r%d, r28\n", src2);

    I convinced him to let me replace the mess with a very small Perl script that soon grew into a monstrosity of itself as new requirements turned up, though not half as bad as the equivalent C would have been.

    [0] I suppose there was some instruction called "do inscrutable bit fiddling stuff with condition register". Or something like that.


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.