Why we get deadlocks
-
In the bowels of our main class (pseudo):
while (true) {
    // read message from queue
    try {
        // create db lock for records to be processed
    } catch (Exception e) {
    }
    ...
    if (thisIsAProductionRun()) {
        // process the data
        // 1000 lines of intervening code
        if (success) {
            // release the db locks
        }
    }
}
And then they wonder why every day or so they need to bounce the cluster.
-
That's a resource leak. It's not a memory leak, but it is a resource leak.
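For what it's worth, the usual cure for this kind of leak is to release in a `finally` block, so the lock goes away whether processing succeeds, fails, or is skipped. Here is a minimal sketch, not the actual fix: `LockReleaseSketch`, `handleMessage` and `process` are made-up names, and a `ReentrantLock` stands in for the database lock, which we never get to see in the original.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockReleaseSketch {
    // Hypothetical stand-in for the db lock on the records being processed.
    static final ReentrantLock recordLock = new ReentrantLock();

    // Acquire the lock, do the work, and always release in finally.
    static boolean handleMessage(String message, boolean productionRun) {
        recordLock.lock();
        try {
            if (!productionRun) {
                return false; // skipped run: the lock is still released below
            }
            return process(message);
        } finally {
            // Runs on success, on failure, and on early return.
            recordLock.unlock();
        }
    }

    // Hypothetical processing step; an empty message plays the part of bad data.
    static boolean process(String message) {
        if (message.isEmpty()) {
            throw new IllegalArgumentException("bad data");
        }
        return true;
    }
}
```

With a real database lock the release call would be whatever the db layer provides, but the shape is the same: nothing between acquire and release gets to decide whether the release happens.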
-
@snoofle said:
while (true) {
    // read message from queue
    try {
        try {
            // create db lock for records to be processed
        } catch (Exception e) {
            // retry to create the db lock
        }
    } catch (Exception e) {
    }
    ...
There, fix'd.
-
Like.
Are they at least consequent enough to refuse letting you fix it?
-
@Ilya Ehrenburg said:
Like.
Are they at least consequent enough to refuse letting you fix it?
They originally didn't want to risk rocking the boat, but after a flurry of db reboots during busy-time, they relented. I fixed and tested it in about an hour and they're stress testing it now. I suggested that, in addition to running the good cases, they also attempt to create a deadlock with a few forced failures by planting bad data (they're going to have a meeting to discuss IF it's appropriate). They would like to deploy by Mar 31 next year, if we can get it done with confidence by then. Mmmm-K.
-
@snoofle said:
... They would like to deploy by Mar 31 next year, if we can get it done with confidence by then. Mmmm-K.
No rush I see.
-
Blah, rebooting a server takes what? 10, 20 minutes? Not 5% of office hours. If they can't work without a working database, where is the world going?
-
@gobes said:
Blah, rebooting a server takes what? 10, 20 minutes? Not 5% of office hours. If they can't work without a working database, where is the world going?
It's not so much the reboot; that happens in under 2 minutes. It's the reloading of the master caches that takes 2 hours (and yes, that too is a WTF).
-
@gobes said:
Blah, rebooting a server takes what? 10, 20 minutes? Not 5% of office hours. If they can't work without a working database, where is the world going?
Thank you for self identifying as TRWTF.