Configuration Snafu



  • So, we have this huge main configuration file of key/value pairs. Like ... 2K lines or something. Huge.

    Each process entry point is identical to another. The difference comes from the command line options.

    java -classpath (buncha stuff on path) main.ProcessManager (path-to-configuration-file) (module-name)
    

    As each process starts up, it reads the configuration file, finds its configuration bits from the file (entries which are prefixed with (module-name)) and initializes itself. One of the module's configuration bits is the fully qualified class name which actually runs under ProcessManager's supervision and does the actual work for that module. Kind of a plug-in architecture, except that all the code is contained within one JAR file (for non Java-ites, a JAR -- Java Archive -- is essentially a collection of compiled Java class files in one file, treated as a library or executable.)
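
    For anyone who wants to picture the lookup, here is a minimal sketch, assuming keys look like (module-name).something. The dot separator, the key names and the class ModuleConfig are all made up for illustration; the real file's layout may differ.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch: the "<module>." key prefix and every name below are assumptions.
final class ModuleConfig {

    /** Returns only the entries whose keys start with "<moduleName>.", with that prefix removed. */
    static Properties forModule(Properties all, String moduleName) {
        String prefix = moduleName + ".";
        Properties mine = new Properties();
        for (String key : all.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                mine.setProperty(key.substring(prefix.length()), all.getProperty(key));
            }
        }
        return mine;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the big shared file: two modules, one set of entries each.
        Properties all = new Properties();
        all.load(new StringReader(
                "ingest.class=com.example.IngestWorker\n"
              + "ingest.port=9001\n"
              + "export.class=com.example.ExportWorker\n"));

        Properties ingest = forModule(all, "ingest");
        System.out.println("worker class: " + ingest.getProperty("class"));
        System.out.println("control port: " + ingest.getProperty("port"));
    }
}
```

    The class name found this way would then be what ProcessManager instantiates and supervises.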

    Another of the configuration items is the host name on which the process is run. However, since we've only ever run those processes all together on the same machine, it makes no difference. In fact, internally, the code doesn't care about which host it is running on when it starts up. If my configuration says google.com, for example, it'll read the value, stash it and start up its processing.

    To stop the processes safely requires use of another program to tell the running process to shut down. It does this by connecting to a TCP port the running process is listening on (chosen by the appropriate entry in the config file) solely for command-and-control purposes (not a huge WTF.) However, the only implemented command is SHUTDOWN (again, only a slight WTF.)

    The tell-the-process-to-shutdown program is part of the same codebase. In fact, it is simply another entry point in our main JAR. So, the shutdown program is launched similarly to the process start up:

    java -classpath (buncha stuff on path) main.ProcessStopper (path-to-configuration-file) (module-name)

    As you can guess, it too reads the configuration file, finding all related bits to the named module. Using those bits of configuration, it opens a socket with a connection to (hostname):(port), and tells the process to shut down using the SHUTDOWN message. Even though it didn't care which machine it was launched on, the message handler in the running process now verifies that the hostname it is running on matches the one in the SHUTDOWN message (because, of course, you could have the same module running on different systems, and you don't want to shut down the wrong one). Which means, if you fat-finger the hostname on a given configuration, it won't stop. And the Stopper program happily reports that it told the process to SHUTDOWN and exits merrily.

    Also, if our main processing machine breaks down, you MUST change the configuration file to match the new hostname or you'll continue to have problems.

    We also had a long-existing problem where one of our modules would not shut down when told. Ever. Duplicate modules had no trouble shutting down, so it was obviously not code. Eventually, I was given the ticket to figure it out. What did I find? An extra space after the listening port number configuration item for that module. Apparently the process creating the ServerSocket (listener on the running process) allowed it to work fine, but when the SHUTDOWN message came in, it compared its String version with the numeric version that it had parsed, and they didn't match. And, so, that specific module refused to shut down.
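
    To make the failure concrete, here is a rough sketch of that kind of comparison. It is a guess at the shape of the bug, not the actual code; PortCheck and both method names are invented for illustration.

```java
// Hypothetical sketch of the string-vs-number mismatch; not the real handler.
final class PortCheck {

    /** Brittle: compares the raw configured text against the text of the parsed number. */
    static boolean brittleMatch(String rawConfigured, int parsedPort) {
        return rawConfigured.equals(Integer.toString(parsedPort));
    }

    /** Safer: trim and parse the configured text, then compare numbers. */
    static boolean normalisedMatch(String rawConfigured, int parsedPort) {
        return Integer.parseInt(rawConfigured.trim()) == parsedPort;
    }

    public static void main(String[] args) {
        String raw = "9001 "; // note the trailing space, as in the bad config entry
        int parsed = 9001;    // what the listener ended up using
        System.out.println("brittle:    " + brittleMatch(raw, parsed));
        System.out.println("normalised: " + normalisedMatch(raw, parsed));
    }
}
```

    With the trailing space, the brittle version rejects the message and the normalised one accepts it.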

    Finally, you can see the fun. If I were to use google.com as my hostname, the Stopper program would dutifully and cheerfully attempt to connect to google.com on the given port and tell it to SHUTDOWN. Ultimately, the connection times out ... and it announces that it connected to google.com on the given port and told it to shut down.
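
    For contrast, a Stopper that at least admits when it could not deliver the message might look something like this. It is only a sketch: the "SHUTDOWN\n" wire format, the timeout parameter and the HonestStopper name are assumptions, not what our code does.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: the message format and all names here are assumptions.
final class HonestStopper {

    /** Returns true only if the connection was made and the bytes were written. */
    static boolean sendShutdown(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // Fails with an IOException on timeout, refusal, or an unresolvable host.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            OutputStream out = socket.getOutputStream();
            out.write("SHUTDOWN\n".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            return true;
        } catch (IOException e) {
            System.err.println("Could not deliver SHUTDOWN to '" + host + "':" + port + ": " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: HonestStopper <host> <port>");
            System.exit(2);
        }
        boolean delivered = sendShutdown(args[0], Integer.parseInt(args[1].trim()), 5000);
        System.exit(delivered ? 0 : 1);
    }
}
```

    Even then, a true result only means the message was written to the socket. Without a reply from the running process, the Stopper still cannot know that the process accepted it.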



  • Finally, a real wtf, even if it is a minor one!

    I feel your pain :( 



  • That's pretty awesome.



  •  What the hell is it about configurations that so many teams find it so hard to do right?



  • @zelmak said:

    connect to google.com on the given port and tell it to SHUTDOWN

    OMG! Be careful! Don't shut down google!



  • So that's what happened to Google last week.



  • @snoofle said:

     What the hell is it about configurations that so many teams find it so hard to do right?

     

    Same as crap code: strangely short-sighted mental processes unable to project a situation further into the future than 3-4 hours, thus automatically making it impossible to conceive of the fact that in the future, details will be forgotten but things still need to be maintained.

    A good fix for that, I think, is making things for yourself that you'll use regularly, instead of one-off code experiments. Your Euler project that you do on a whim doesn't need to be commented, but that tool you wrote will need some extra features a month down the road.

     



  • @zelmak said:

    Another of the configuration items is the host name on which the process is run.
     

    Can't some library interrogate the OS and obtain this information naturally? Or are we talking about another process running on a remote machine to which this process needs to connect?

    @zelmak said:

    What did I find? An extra space after the listening port number configuration item for that item

    TRWTF here isn't the config file, it's the parsing of the values and preserving trailing whitespace after what should be a numeric value. Which you stated, anyway.

    As a matter of interest, are there any procedures to dump config info out of your code? I know with many systems I use, logfiles normally capture reported startup parameters so I can (a) determine I'm using the right config file and (b) prove my config changes are recognised. I'd have thought any server-side code would contain some module for config diagnosis.



  • @Cassidy said:

    TRWTF here isn't the config file, it's the parsing of the values and preserving trailing whitespace after what should be a numeric value.

    Yes.

    @Cassidy said:

    As a matter of interest, are there any procedures to dump config info out of your code? I know with many systems I use, logfiles normally capture reported startup parameters so I can (a) determine I'm using the right config file and (b) prove my config changes are recognised. I'd have thought any server-side code would contain some module for config diagnosis.

    Yeah, same thing. And when I log the values of variables, I always wrap in single quotes so whitespace is instantly visible.
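
    Something along these lines, as a tiny sketch (QuotedLog and quoted() are just illustrative names):

```java
// Hypothetical helper: wraps a value in single quotes so stray whitespace shows up in logs.
final class QuotedLog {

    static String quoted(String name, String value) {
        // Leave null unquoted so it cannot be confused with the literal text 'null'.
        return value == null ? name + "=null" : name + "='" + value + "'";
    }

    public static void main(String[] args) {
        System.out.println(quoted("port", "9001 "));
        System.out.println(quoted("host", null));
    }
}
```

    A trailing space then shows up as a gap before the closing quote instead of vanishing at the end of the log line.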



  • @Cassidy said:

    @zelmak said:

    Another of the configuration items is the host name on which the process is run.
     

    Can't some library interrogate the OS and obtain this information naturally? Or are we talking about another process running on a remote machine to which this process needs to connect?

    I'm pretty sure (based on vestigial code and configuration) that the original idea was that the various modules COULD run on different physical/logical machines. That a central point of control would know what processes were supposed to be running on what machines so they could interrogate them for status, tell them to shut down and the like. However, the status message, while implemented, doesn't actually return any value (indeed, I don't remember precisely, but I don't think the message service provided for a return value back to the caller :sigh:) There is vestigial code for some sort of central manager (watching the processes run) but the external infrastructure (JINI-based, JavaSpaces?) no longer runs/works/etc., and since all the modules run on one system, no one cares anymore.

    To answer your question, yes, you can do a 'getHostByName()' (and that's what the code does) but the only time it cares about that information appears to be when the shutdown message comes. You would think that the process, once it read its configuration, would check to see if it were running on the correct machine. If it was the right machine, carry on normally. Otherwise, it would pop an exception (don't EVEN get me started on exception handling) and exit however gracefully it could.

    But it doesn't.
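
    The startup check I have in mind would be roughly the following. This is a sketch under assumptions: HostCheck and its methods are hypothetical, and a plain case-insensitive comparison will not reconcile a short name with a fully qualified one.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch of a fail-fast host check at startup; names are illustrative.
final class HostCheck {

    /** Pure comparison, kept separate so it can be tested without asking the OS. */
    static boolean sameHost(String configuredHost, String actualHost) {
        if (configuredHost == null || actualHost == null) {
            return false;
        }
        return configuredHost.trim().equalsIgnoreCase(actualHost.trim());
    }

    /** Throws if the configured host is not the machine this JVM is running on. */
    static void verifyRunningOn(String configuredHost) throws UnknownHostException {
        String actual = InetAddress.getLocalHost().getHostName();
        if (!sameHost(configuredHost, actual)) {
            throw new IllegalStateException(
                    "configured host '" + configuredHost + "' but running on '" + actual + "'");
        }
    }
}
```

    Called once right after the configuration is read, a check like this would catch the google.com case before any processing starts, instead of at shutdown time.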

    Even more fun, if you forget to give it a port number to listen on, it chooses a default ... a constant. If that port happens to already be in-use, it throws an exception, the message service fails ... and the module continues running. Now I can't connect to it and shut it down if I wanted to since the message handler isn't running. 'kill -9' it is!
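
    A fail-fast alternative, sketched under the assumption that a module without a control port should not start at all (ControlPort and openOrFail are made-up names):

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical sketch: refuse to guess a default port and let bind failures stop startup.
final class ControlPort {

    /**
     * Opens the control listener on the configured port.
     * Throws IllegalArgumentException if the port is missing or not a number,
     * and IOException (typically a BindException) if the port cannot be bound.
     */
    static ServerSocket openOrFail(String configuredPort) throws IOException {
        if (configuredPort == null || configuredPort.trim().isEmpty()) {
            throw new IllegalArgumentException("no control port configured; refusing to guess a default");
        }
        int port = Integer.parseInt(configuredPort.trim());
        return new ServerSocket(port);
    }
}
```

    The caller would let either exception end the process, so a module never runs without a working way to be told to shut down.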

    @Cassidy said:

    @zelmak said:
    What did I find? An extra space after the listening port number configuration item for that item

    TRWTF here isn't the config file, it's the parsing of the values and preserving trailing whitespace after what should be a numeric value. Which you stated, anyway.

    As a matter of interest, are there any procedures to dump config info out of your code? I know with many systems I use, logfiles normally capture reported startup parameters so I can (a) determine I'm using the right config file and (b) prove my config changes are recognised. I'd have thought any server-side code would contain some module for config diagnosis.

    Re: (a) and (b) it could, but doesn't. Logging is random as hell: it is mostly done with System.out.println(), with stdout and stderr captured in logfiles chosen on the command line. Sadly, the configuration file uses fully-qualified, absolute paths to every referenced file, so if you make a change to the root of the logging directory, for example, every module's configuration would need to change. In fact, there is SOME java.util.logging going on, but it is sporadic and only logs specific events, such as when an input file has completed processing. These logs are in /prod/<app>/logs, while the stdout/stderr logs are in /prod/logs/<app>.

    Meanwhile, this configuration file references directories in which other configurations are opened by specific modules which, in-turn, can and do reference further configuration files and fully-qualified, absolute paths for input and output directories ...

    This app is sooooo not portable, I don't even want to discuss it ... well, right now anyhow.



  • @morbiuswilters said:

    @Cassidy said:
    TRWTF here isn't the config file, it's the parsing of the values and preserving trailing whitespace after what should be a numeric value.
    Yes.

    All configurations are parsed by java.util.Properties. Where every key is a String, and every value ... is a String. So, somehow, trailing spaces are kept.

    Then again, the original programmers wrapped java.util.Properties (to give it getInt, getLong, getFloat, etc... which are rarely used ...), maybe they screwed something up.
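
    For what it's worth, java.util.Properties drops whitespace before a value but keeps whatever trails it, so a wrapper has to trim for itself. A minimal sketch of a wrapper that does (TypedProperties is a stand-in name, not our actual class):

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical wrapper: trims values before handing them out or parsing them.
final class TypedProperties {
    private final Properties props;

    TypedProperties(Properties props) {
        this.props = props;
    }

    /** Returns the trimmed value, or null if the key is absent. */
    String getString(String key) {
        String value = props.getProperty(key);
        return value == null ? null : value.trim();
    }

    /** Parses the trimmed value as an int; throws if the key is absent or not numeric. */
    int getInt(String key) {
        String value = getString(key);
        if (value == null) {
            throw new IllegalArgumentException("missing key: " + key);
        }
        return Integer.parseInt(value);
    }

    public static void main(String[] args) throws IOException {
        Properties raw = new Properties();
        raw.load(new StringReader("port=9001 \n")); // trailing space after the number
        System.out.println("raw value:  '" + raw.getProperty("port") + "'");
        System.out.println("getInt():   " + new TypedProperties(raw).getInt("port"));
    }
}
```

    If every lookup goes through the wrapper, the String and numeric views of a value cannot drift apart over a stray space.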


     



  • @zelmak said:

    All configurations are parsed by java.util.Properties. Where every key is a String, and every value ... is a String. So, somehow, trailing spaces are kept.
     

    .. only because nobody thought to canonicalise the leading/trailing whitespace away.

    Hell, I've done this in PHP routines because I inherently don't trust external input[1]. To me, configuration files fall into that category.

    [1] <input type="text"> etc...



  • @Cassidy said:

    @zelmak said:

    All configurations are parsed by java.util.Properties. Where every key is a String, and every value ... is a String. So, somehow, trailing spaces are kept.
     

    .. only because nobody thought to canonicalise the leading/trailing whitespace away.

    Hell, I've done this in PHP routines because I inherently don't trust external input[1]. To me, configuration files fall into that category.

    [1] <input type="text"> etc...

    Don't shoot the messenger! :)

    You realize that (a) I'm a contractor (much like snoofle) working on (b) a government system that's (c) over 10 years old. So I have little to no say in How Things Have Become. I try to nudge and sway and cajole things into the Right* direction, but I don't always succeed. Honestly, the app needs a huge overhaul (refactoring) but since (a) it comprises over 700 java classes and (b) 144,000+ lines and (c) the Next Great Thing is just around the corner**, they don't want me working on that.

    * - In my ever so humble opinion.
    ** - Which they've been saying since I got here over three years ago. I'm told they've been saying for up to three years before that, as well.



  • @zelmak said:

    Don't shoot the messenger! :)

    Yeah, sorry - I wasn't having a go at you. I understand you were explaining how it got like that - I was more clarifying why it's stayed like that.

    @zelmak said:

    You realize that (a) I'm a contractor (much like snoofle)

    No, I didn't. But if you're in a similar position to snoofle - then you're falling behind in your WTF posts.

    Kindly rectify this at once! Given what you're working on at the moment, I'm willing to bet you've plenty of material to draw upon - I expect you to match his frequency pretty soon, no excuses. 



  • @Cassidy said:

    No, I didn't. But if you're in a similar position to snoofle - then you're falling behind in your WTF posts.

    I think Snoofie and zelmak account for 90% of the non-lolcat OPs around here.



  • Speaking of configuration, I just counted and:

    269 configuration files, comprising 13680 lines of text (could be comments, or empty lines, but still.)

    And these are the first- and second-level (nested) files that I know about.



  • @morbiuswilters said:

    I think Snoofie and zelmak account for 90% of the non-lolcat OPs around here
     

    They do? Perhaps I'm only registering snoof's over zelmak's solely based upon avatar.

    Same way that I've overlooked many of yours because you've ripped off someone else's Nick Cage avatar...



  • @Cassidy said:

    Perhaps I'm only registering snoof's over zelmak's solely based upon avatar.

    Probably. People with the plain avatar don't count as humans in my book. It's hard to empathize with them; I think in my soul I just assume they are some creation of CS.

    @Cassidy said:

    Same way that I've overlooked many of yours because you've ripped off someone else's Nick Cage avatar...

    gasp That's un-possible! I account for nearly 90% of the flamebait around here.



  • @morbiuswilters said:

    @Cassidy said:
    Perhaps I'm only registering snoof's over zelmak's solely based upon avatar.
    Probably. People with the plain avatar don't count as humans in my book. It's hard to empathize with them; I think in my soul I just assume they are some creation of CS.
    What about me then?  My avatar is an inanimate object.



  • @Anketam said:

    What about me then?  My avatar is an inanimate object.
     

    Magical objects are people too.



  • @morbiuswilters said:

    Probably. People with the plain avatar don't count as humans in my book. It's hard to empathize with them; I think in my soul I just assume they are some creation of CS.
     

    I just think it's more fun and more practical if people used proper avatars.



  • @dhromed said:

    @morbiuswilters said:

    Probably. People with the plain avatar don't count as humans in my book. It's hard to empathize with them; I think in my soul I just assume they are some creation of CS.
     

    I just think it's more fun and more practical if people used proper avatars.

    I tried to update my avatar with an animated gif ... apparently, CS either (1) converts it to another, non-animated format or (2) takes the first frame and strips the rest because (c) its no longer animated.

     



  • @dhromed said:

    proper avatars

    Explain.



  • @zelmak said:

    CS either (1) converts it to another, non-animated format
     

    Jpeg.

    @zelmak said:

    (c) its no longer animated.

    I see what you did there.

     



  • @Xyro said:

    @dhromed said:
    proper avatars
    Explain.
     

    Which part?



  • @zelmak said:

    I tried to update my avatar with an animated gif ... apparently, CS either (1) converts it to another, non-animated format or (2) takes the first frame and strips the rest because (c) its no longer animated.
    It converts everything to JPEG, even when that increases size by a lot.



  • @dhromed said:

    @Xyro said:
    @dhromed said:
    proper avatars

    Explain.
    Which part?

    @dhromed said:
    proper



  • @Xyro said:

    @dhromed said:
    @Xyro said:
    @dhromed said:
    proper avatars
    Explain.
    Which part?
    @dhromed said:
    proper
     

    No smegma.

