So as I sit here avoiding work (I have to reverse engineer a no-longer-used communication protocol and then extract a mangled binary image from it), I'm reminded of a great story from my last job.
We sold commercial off-the-shelf products to the military, with our newest and shiniest product being a nearly bulletproof unit that was internally electronic but appeared mechanical to the user. I was responsible for maintaining the device firmware, which had been written by the electrical engineer who designed the board. It was a bit weird in places, but overall it was small, simple and, most importantly, sane. In any case, said unit was having a problem where it would randomly lose power. The president of the company finally got reamed for it by the customer and called an emergency meeting with engineering to figure out what could possibly cause the issue.
Now, as a quick aside to help you understand how the president, an engineer, could make as stupid a decision as the one I'm about to tell you about: his only experience with software was FORTRAN in college and a customer-facing VB.NET application he had written back when the company had no formal software engineers. Morbid curiosity and a need to fix a bug once led me to look at the code, revealing over 10 THOUSAND lines of code in the GUI module, which had fewer than 20 controls. Needless to say, that project left him believing that all software was buggy and evil, and was the root cause of every problem our products had. Now back to our story...
There were quite a few theories put forward. The unit could be jarred hard enough for the batteries to separate from the contacts (why that was physically possible was never explained to me in a rational way when I asked). Also, the power-on sequence involved a mechanical switch providing VBat to the MCU just long enough for it to boot and drive a pin to a MOSFET high, after which current was supplied through the MOSFET instead; any noise in that maze of a circuit would cause it to black out. He thought they were "good ideas but probably impossible", and pressed me for details of something that could go wrong with the firmware. We had done a line-by-line review of the code a year earlier and found at worst a few unused variables, but he wanted a theory, so I threw out that there might be a null pointer being dereferenced, and since the MCU is responsible for keeping itself powered, a reset would cause a blackout.
Looking back, mentioning "pointer" to someone who thought that VB.NET was
incredibly error prone probably wasn't wise. "That's it! I just know the
code's got to be going off into the weeds and mucking everything up! That's
just got to be it." I tried to point out that that wasn't at all what I had
just said, but he had convinced himself that it was the problem. He demanded
to know how to fix a problem like that if it happened, and I said that, in
theory, the only way to recover would be to reset the processor, which leaves
us back at square one unless the user happened to be holding the mechanical
on switch at the time.
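For anyone who hasn't dealt with a self-latching supply, here is a rough C model of why a reset meant a blackout. It is only a sketch, not the actual firmware: the `power_path` struct and every function name are invented for illustration, and it assumes the latch pin falls back low whenever the MCU resets.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the power path; all names are illustrative only. */
typedef struct {
    bool switch_held;  /* user is pressing the mechanical on switch */
    bool latch_pin;    /* MCU output driving the MOSFET gate */
} power_path;

/* The MCU sees VBat if either the switch or the MOSFET is conducting. */
bool has_power(const power_path *p) {
    return p->switch_held || p->latch_pin;
}

/* Assumption: a reset returns the latch pin to its default low state. */
void mcu_reset(power_path *p) {
    p->latch_pin = false;
}

/* Boot only happens with power; the boot code then latches the supply on. */
bool mcu_boot(power_path *p) {
    if (!has_power(p)) {
        return false;
    }
    p->latch_pin = true;
    return true;
}

/* Returns true if the unit is still running after a reset mid-operation. */
bool survives_reset(bool switch_held_during_reset) {
    power_path p = { .switch_held = true, .latch_pin = false };
    bool booted = mcu_boot(&p);              /* normal power-on */
    assert(booted);
    p.switch_held = switch_held_during_reset; /* user has usually let go */
    mcu_reset(&p);                            /* the hypothetical fault */
    return mcu_boot(&p);
}
```

In this model `survives_reset(false)` comes back false: once the pin drops, nothing is left to hold the supply up, so the reset and the blackout are the same event.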
Strike two. "Then what if, when they press the switch, the first time it
powers on it sets a flag, resets the processor and then actually boots the
second time around?" I argued that the only time that code would execute
was when the unit was working sanely to begin with, not to mention that the
unit had only ever blacked out in the middle of operation. He seemed to accept
that argument, but came back with the be-all and end-all response: "Well, we
have to do something."
So to our servicemen out there, wondering why their unit takes a stupidly
long time before it responds to the on button, I can only say I'm sorry.
But at least we did something.