Just shoot me now
We have a db that contains information that is of interest to many other teams. Up until now, they have all coded SQL into their applications, which precludes us from changing the db without a painful simultaneous upgrade.
Since our schema needs a major rework, we decided it best that we provide an api to wrap the db, so that the other teams can access the db via the api, and we are free to change the implementation under the hood as needed.
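For anyone who hasn't done this before, here is a minimal sketch of the idea, assuming an in-memory SQLite database. The table name `cust_v1`, the column `nm`, and the `CustomerApi` class are all made up for illustration; the real schema and api look nothing like this. The point is only that callers see a method and never see the SQL.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database with a hypothetical legacy schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cust_v1 (id INTEGER PRIMARY KEY, nm TEXT)")
    conn.execute("INSERT INTO cust_v1 (id, nm) VALUES (1, 'Alice')")
    return conn


class CustomerApi:
    """Hypothetical wrapper: the only place that knows table and column names."""

    def __init__(self, conn):
        self._conn = conn

    def get_customer_name(self, customer_id):
        # If the schema is reworked, only this query changes; callers do not.
        row = self._conn.execute(
            "SELECT nm FROM cust_v1 WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


api = CustomerApi(make_db())
print(api.get_customer_name(1))
```

Other teams would call `get_customer_name` (or send the equivalent request message) instead of embedding the `SELECT` in their own applications.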
OK, so I'm building and testing the api. We only have one messaging server for our dev environment, so it doesn't matter whether I run the api-request-server on my pc or on our dev box; either way it reads from the same singleton message queue. I was playing with it on the dev server and noticed something that needed tweaking. I thought that I had shut the api-request-server down on the dev box, and began to run both the client and the api-request-server on my pc.
Every request from the client was being honored, but for a random subset of them there were no log entries on the api-server. WTF?
I figured that I had accidentally dropped a line of code when deleting debugs, and spent hours stepping through both the client and server. I finally put debugs immediately before sending a message and immediately after receiving a message at both ends. It turned out that some requests were never reaching the api-server on my pc, even though they were being executed somewhere.
It finally dawned on me that the api-server was still running on the dev box; it was randomly (time slicing) grabbing requests off the queue, and honoring them.
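If you want to see the effect for yourself, here is a rough simulation, assuming nothing about our actual messaging server: it uses Python's in-process `queue.Queue` and two threads as stand-ins for the two api-server instances. The names `run_consumers`, `my-pc`, and `dev-box` are made up. Each request is taken by exactly one consumer, so neither instance's log ever shows the full set.

```python
import queue
import threading
from collections import Counter


def run_consumers(names, num_requests):
    """Put num_requests on one shared queue and let every named consumer compete."""
    requests = queue.Queue()
    handled = []  # (request_id, consumer_name) pairs
    lock = threading.Lock()

    def serve(name):
        while True:
            req = requests.get()
            if req is None:  # sentinel: this consumer shuts down
                return
            with lock:
                # Tag each handled request with the consumer that took it.
                handled.append((req, name))

    workers = [threading.Thread(target=serve, args=(n,)) for n in names]
    for w in workers:
        w.start()
    for i in range(num_requests):
        requests.put(i)
    for _ in workers:
        requests.put(None)  # one sentinel per consumer
    for w in workers:
        w.join()
    return handled


handled = run_consumers(["my-pc", "dev-box"], 1000)
# How the 1000 requests were split varies from run to run.
print(Counter(name for _, name in handled))
```

Tagging every reply (or log line) with the host that handled it, as the `handled` list does here, would have pointed at the forgotten instance in minutes instead of hours.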
We've all probably done that at least once. I did it a few weeks ago. We have 6 (or 9, depending on how you count) production servers and 1 dev server. We deploy code packages from dev to one of the production servers, where they are executed by a runtime application. My project was a simple integration to transform and move data from a database owned by one business entity to a db owned by another. Before deploying to my target server, I failed to check the other 5 production servers for an older version of this package. Once it went into production, we noticed that very occasionally some data wouldn't get processed correctly. The new package was scheduled to run every 5 minutes, and the older package (under a different name) was scheduled to run every half hour. Data processed by the older package usually wouldn't reach its destination.
It was a real headache to track down. The different (but vaguely similar) package name is what threw me off.