A few months ago I took over maintenance of a custom in-house content
management system. This was a new J2EE web application designed to
replace a previous PHP-based CMS that was getting long in the tooth
and difficult to use with the amount of content we have to manage. The
original could have been fixed up, but management decided to start from
scratch so it would be shiny and new.
No, that's not the real WTF.
The new application was, naturally, a disaster, full of WTFs, most notably the db update process:
- changes to the db were managed by deleting all rows in a table and reinserting them from the user's session cache
- the tables were, of course, locked for reads and writes during this process
- for some tables, with thousands of records to reinsert, the update took a considerable amount of time
- during busy periods, several people would be working on the same data
- the db table locks would queue up
- it occasionally took 20-30 minutes for all the locks to clear
- people would ask why their change "didn't take"
- many, if not all, of the tables were accessed by several other in-house apps, which would appear to hang when they couldn't access the affected tables
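To make the anti-pattern concrete: the app was J2EE, but the same delete-everything-and-reinsert "save" can be sketched in a few lines of Python against sqlite3 (the table name, columns, and row counts here are hypothetical, not from the original system):

```python
import sqlite3

def save_via_delete_and_reinsert(conn, rows):
    """The CMS's original 'update' strategy: wipe the whole table and
    reinsert every row from the user's session cache, inside one
    transaction that holds the write lock for the entire duration."""
    cur = conn.cursor()
    cur.execute("BEGIN IMMEDIATE")       # take the write lock up front
    cur.execute("DELETE FROM articles")  # every row goes, changed or not
    cur.executemany(
        "INSERT INTO articles (id, title, body) VALUES (?, ?, ?)", rows)
    cur.execute("COMMIT")                # lock held until here

# Hypothetical demo data standing in for the user's session cache.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
cache = [(i, f"title {i}", f"body {i}") for i in range(1, 5001)]
save_via_delete_and_reinsert(conn, cache)
print(conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0])  # 5000
```

Note that the lock is held for the full delete-plus-reinsert of every row, even if the user changed only one of them; with several users saving at once, those locks queue exactly as described above.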
No, that's not the real WTF.
Anyway, I was parachuted in to try to save this app (like the previous
two applications I've worked on at this company ... I sense a pattern
here), and the original developers "moved on" to other things. We
couldn't afford to move any more staff onto it, so I was a team of one.
At about the same time, we switched to a new issue tracking
system. We all had a three-hour training session set up, but I played
around with the system for a few minutes and decided it was close enough
to ClearQuest or Bugzilla that I didn't need to waste any time on
training, especially considering the thirteen blocking issues I had to
fix.
Fast forward three months. I had made a series of minor dot releases to
fix the most egregious server errors, then followed up with a major
release featuring a fully rewritten data access layer and a revamped UI,
fixing all 150 Blocking, Critical, and Major bugs. A common caching
mechanism meant everyone was working on the same data set, and the RAM
footprint dropped to a quarter of what it had been (no more
OutOfMemoryErrors). Updates to the db were now handled by actually
updating individual records. Removing drop-downs of 20,000 items meant
that pages that once took minutes to load now took seconds. Operations
that once took five hours could now be completed in 30 minutes. The app
dropped from being the top DB load to something we could deal with.
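The fix for the db update process is the unglamorous, obvious one: write only the rows the user actually changed. Continuing the earlier hypothetical sqlite3 sketch (same made-up `articles` table; the real system was J2EE):

```python
import sqlite3

def save_changed_rows(conn, changes):
    """The rewritten approach (sketch): update only the records the
    user actually edited, so the write lock is held for a handful of
    row updates instead of a full table delete-and-reinsert."""
    with conn:  # one short transaction per save
        conn.executemany(
            "UPDATE articles SET title = ?, body = ? WHERE id = ?", changes)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany("INSERT INTO articles VALUES (?, ?, ?)",
                 [(i, f"title {i}", f"body {i}") for i in range(1, 5001)])

# The user edited two records; only those two rows are written.
save_changed_rows(conn, [("new title", "new body", 7), ("other", "text", 42)])
print(conn.execute("SELECT title FROM articles WHERE id = 7").fetchone()[0])
```

Two row updates instead of 5,000 deletes and 5,000 inserts per save is the difference between locks measured in milliseconds and locks measured in minutes.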
The content team and db admins were extremely pleased - I got praise
from them and was recognized at the most recent engineering all-hands.
Last week, my manager called me into his office. "It looks like you
only worked on the CMS for five hours over the last quarter." "Huh?"
"We know how much you worked, but we've set up a three-hour issue
tracking system training session for you."
You see, one thing people learned during training for the new issue
tracking system was that the engineering budget was partly (thankfully
not totally) based on how much time was spent on each issue. The time
was counted from when an issue went into "assigned" to when it was
"resolved" (plus "reopened" -> "resolved" if needed). I of course
didn't know this, and since I was working on the app on my own, I
didn't have to worry about stepping on anyone else's toes. So I got
into the habit of looking at an issue, fixing it in code, and checking
in the change. Then I would set the issue to "assigned" and then
straight to "resolved" (you can't go from "unassigned" straight to
"resolved"). So the 150 or so issues I'd fixed spent a total of about
five hours in the "assigned" state.
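The billing rule above is easy to state as code. A minimal sketch, assuming the tracker logs timestamped state transitions (the function name, log format, and timestamps here are all hypothetical):

```python
from datetime import datetime

def time_in_assigned(transitions):
    """Sum the time an issue spends between entering 'assigned' (or
    'reopened') and reaching 'resolved' -- the interval this
    hypothetical tracker bills against the engineering budget."""
    total = 0.0
    started = None
    for when, state in transitions:
        if state in ("assigned", "reopened") and started is None:
            started = when
        elif state == "resolved" and started is not None:
            total += (when - started).total_seconds()
            started = None
    return total

# An issue handled my way: "assigned" and "resolved" two minutes apart.
log = [(datetime(2008, 5, 1, 9, 0), "assigned"),
       (datetime(2008, 5, 1, 9, 2), "resolved")]
print(time_in_assigned(log) / 60)  # 2.0
```

Flip 150 issues through that two-state dance and, as far as the budget is concerned, a quarter's work took about five hours.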
So, I have had 17 issues in the "assigned" state for the past week.
No, that's not the real WTF.
For saving the company thousands, if not tens of thousands, of dollars, I got a $100 bonus.
WTF.