What's your biggest screw up?



  • I once forgot the "." when editing the reverse DNS file.  Took down the entire company (an ISP, by the way) for about an hour until everything got straightened out again.  As soon as I realized it, it was a true "ohnosecond".

    I bought a lot of beers for everyone that night as my "punishment".
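
    For anyone who hasn't hit this one: in a BIND-style zone file, a name without a trailing "." is treated as relative and gets the zone origin appended. The post doesn't say which "." went missing, so the Java sketch below only illustrates that rule. The qualify helper, the class name, and the example names are all made up for illustration.

```java
public class TrailingDotDemo {
    // Hypothetical helper mimicking how a zone file parser expands names:
    // absolute names (ending in ".") are kept as-is, relative names get
    // the zone origin appended.
    static String qualify(String name, String origin) {
        if (name.endsWith(".")) {
            return name;
        }
        return name + "." + origin;
    }

    public static void main(String[] args) {
        // Example reverse zone; an assumption, not taken from the post.
        String origin = "2.0.192.in-addr.arpa.";
        System.out.println(qualify("host.example.com.", origin));
        System.out.println(qualify("host.example.com", origin));
    }
}
```

    With the dot, the name stays as written; without it, the reverse zone's origin is glued onto the end and the record points at a host that does not exist.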



  • @smbell said:

    Years ago I was 'the tech' for a small computer store.  As you might guess, my job was to build and repair computers.  A customer came in for a new processor.  They bought the processor and had me install it.  The install went fine and, being the good tech I was, I noticed their firmware was out of date, so I grabbed the latest firmware for their motherboard.  I made the boot floppy and started the firmware update.  A message popped up that basically said 'Warning this firmware does not match this model'.  Normally this would have set off screaming red flags in my head, but I was tired and it was closing time, so I hit 'continue anyway'.

    The customer's motherboard (which they did not purchase from us) was now dead.  Worse, the BIOS was not the removable type, so there was no way to resurrect the board.  It's now after closing time, it's just me, the customer, and the store manager as I tell them the firmware update failed and the board was toast.  I spent the next two hours getting their computer going with a new motherboard, that we of course didn't charge them for, and getting them back to a working computer.

    Two weeks later I called in to let the manager know I was going to be 15 minutes late getting to work.  I was met at the door with a 3 day suspension.  The day I got back from my suspension I was fired.  :(

    The big WTF here is that they fired you for that. O_o

    Back when I was still in high school I was working in a similar position (a tech at a mom & pop computer shop, a three-man operation including me).  One day (this was when 1GHz processors were the brand new thing on the block), we got our first order for a 1GHz Athlon machine for a customer build, so we started work on it one afternoon.  After getting the CPU and motherboard seated, it was closing time so we left it there and went home.  The next morning, the boss was in to do some paperwork (he usually just came in for that; me and the other guy pretty much ran the place), and he says "Oh, cool!  A gigahertz processor?  I gotta see how fast this goes!" and fires up the machine, despite there being no CPU fan or heatsink on the processor.  Exactly six seconds later, a small popping noise can be heard (like a piece of bubble wrap being stepped on) and a little spiral of smoke wafts up from the (then quite expensive) processor.

     And that was the owner of the company.  I can't believe your manager didn't just shrug it off and tell you "don't worry about it, that happens to everyone once.  But do it again, and it's coming out of your pocket" :P



  • The year was 2000, sometime during the late summer.

    The system was a Slackware Linux server.  It had a 13GB and a 20GB hard drive in it.

    What was being attempted was a re-arrangement of the partitions.  Yes, this is kind of a delicate operation.  However, there were less than 13GB of data on the whole system at the moment, so the "obvious" thing to do was to move all of the data from the 13GB drive to the 20, re-partition the 13, then store it on the 13 while repartitioning the 20, etc. 

    Well, I thought, since I wasn't going to use the data until after I was done, it would be okay to store them in an archive file.  With /home being on the 20GB drive, and /space being one huge partition on the 13, I did this:

    tar -cvzf /space/home.tar /home

    I walked away from it for a while.  When I came back, it was done, and I proceeded to repartition the 20.  I then went to unpack the tar.gz file from the 13, and had a rude reminder about how the maximum file size then supported was 2.1GB.  The file had compressed about 75% on average, so I was able to get back about 8GB of the content, but not all of it.
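
    The 2.1GB ceiling is what you would expect if file offsets were signed 32-bit values, which is assumed to be the cause here. A rough Java sketch of the arithmetic, taking the 75% compression figure at face value (class and method names are made up for illustration):

```java
public class TarLimitDemo {
    // Largest value a signed 32-bit integer can hold, treated as a byte count.
    static final long MAX_FILE_BYTES = Integer.MAX_VALUE;

    // Estimates how much original data fits in an archive capped at
    // MAX_FILE_BYTES, given the percentage by which compression shrinks it.
    static long recoverableBytes(int compressionPercent) {
        return MAX_FILE_BYTES * 100 / (100 - compressionPercent);
    }

    public static void main(String[] args) {
        System.out.println(MAX_FILE_BYTES);       // 2147483647, roughly 2.1GB
        System.out.println(recoverableBytes(75)); // 8589934588, roughly 8GB
    }
}
```

    So a 2.1GB archive holding data that shrank to a quarter of its size lines up with the "about 8GB" that came back.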


     



  • My biggest screw up is probably the day I spent about 3 hours searching for the bug in my code and finally gave up. The next day I caught it almost instantly (java):

    for(int i = 0; i>whatever; i++);
    {
    	//do something (array work I think and file reading)
    }

     

    Goes to show, step back.
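
    For anyone who wants to see it run, here is a small self-contained sketch of the same two mistakes next to a corrected loop. The class and method names are made up for illustration, and a counter stands in for the original array and file work.

```java
public class LoopBugDemo {
    // Same shape as the snippet above: ">" instead of "<", plus a stray
    // semicolon that turns the loop body into an empty statement.
    static int buggyBodyRuns(int whatever) {
        int bodyRuns = 0;
        for (int i = 0; i > whatever; i++);
        {
            // Plain block, not attached to the loop: runs exactly once.
            bodyRuns++;
        }
        return bodyRuns;
    }

    // Corrected version: the block is the loop body and the test is "<".
    static int fixedBodyRuns(int whatever) {
        int bodyRuns = 0;
        for (int i = 0; i < whatever; i++) {
            bodyRuns++;
        }
        return bodyRuns;
    }

    public static void main(String[] args) {
        System.out.println(buggyBodyRuns(5)); // 1
        System.out.println(fixedBodyRuns(5)); // 5
    }
}
```

    With a positive bound the buggy loop never iterates at all, and the "body" still runs once because it is just a free-standing block.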



  • @malfist said:

    My biggest screw up is probably the day I spent about 3 hours searching for the bug in my code and finally gave up. The next day I caught it almost instantly (java):

    for(int i = 0; i>whatever; i++);
    {
    //do something (array work I think and file reading)
    }

    Goes to show, step back.


    One thing I really like about GCC is that it has warnings above and beyond what other compilers do (type checking on printf, semicolons after "for" statements, assignment within "if" statements).  One thing I really hate about GCC is that you often can't selectively turn them on or off.



  • My mother was doing a typing course; this was back in the 80s. Usually they let her take one of the IBM PCs (no hard disk) home for the weekend, for practising on. Also, the course only had one DOS floppy at the time (WTF?). Having an inquisitive nature, I noticed all of these commands on the DOS disk and decided I had to try them all out, in alphabetical order. I never reached 'g'... Also, the look on my mother's face when I asked her, "What does 'formatting' mean?" was priceless.

    Also, one time my aunty got a brand spanking new Compaq Presario with the latest and greatest Windows 3.11. I decided to give them some games for free. However, soon after playing the game in a window for the first time, the system crashed, and upon rebooting, the entire hard drive was full of hoseness (about half of the files were corrupt and unreadable). They had to take it back to the shop and get a fresh install. I still blame the tools (it turned out the system came with the entire HDD being "DBLSPACE"'d, and I presume the MS driver for that does not like it when you run a DOS4GW game).


     



  • I've done something similar with chown. Wanting to include all hidden files I did

     chown -R user.group .*

    which includes '..', of course... that's the day they took the root password away from me.
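
    The trap is that the pattern '.*' also matches the name '..'. A hedged sketch using Java's built-in glob matcher to show the match; it only demonstrates the pattern itself, since whether a given shell actually expands '.*' to include '..' depends on the shell and its settings. The class name and the '.[!.]*' alternative are illustrative, not from the post.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class DotGlobDemo {
    // Returns whether a single file name matches the given glob pattern.
    static boolean matches(String glob, String name) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(name));
    }

    public static void main(String[] args) {
        // ".*" means "a dot followed by anything", which covers ".." too.
        System.out.println(matches(".*", ".bashrc"));
        System.out.println(matches(".*", ".."));
        // ".[!.]*" requires the second character to be something other than a dot.
        System.out.println(matches(".[!.]*", ".bashrc"));
        System.out.println(matches(".[!.]*", ".."));
    }
}
```

    The stricter pattern leaves '..' alone, at the cost of also skipping names that start with two dots.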
     



  • @malfist said:

    My biggest screw up is probably the day I spent about 3 hours searching for the bug in my code and finally gave up. The next day I caught it almost instantly (java):

    for(int i = 0; i>whatever; i++);
    {
    //do something (array work I think and file reading)
    }

    Goes to show, step back.

    Mm, tasty, double the bug.

    Edit: WTF, in the editor the bugs are highlighted in yellow, but not when I post. At the risk of being cliché, the Real WTF is the forum software. The two bugs can be left as an exercise to the reader in that case.



  • @Bob Janova said:

    @malfist said:

    My biggest screw up is probably the day I spent about 3 hours searching for the bug in my code and finally gave up. The next day I caught it almost instantly (java):

    for(int i = 0; i>whatever; i++);
    {
    //do something (array work I think and file reading)
    }

    Goes to show, step back.

    Mm, tasty, double the bug.

    Edit: WTF, in the editor the bugs are highlighted in yellow, but not when I post. At the risk of being cliché, the Real WTF is the forum software. The two bugs can be left as an exercise to the reader in that case.


    The second bug is only a bug if "whatever" is not changed inside the loop.


  • My first job as an actual programmer. 

    I worked at a place where there was exactly one computer that was set up to build all of our code. This infinitely precious machine was not, as some would assume, locked away in a server room with a RAID and UPS. It was just a tower PC sitting in someone's cube. One night, the person who normally operated this computer had to leave early, and I was staying late, so I was asked to shut it down before I left. It had been announced that the power was going to be shut off that night for maintenance, so it was important that this PC be shut down gracefully lest a hard shutdown from the power outage corrupt its filesystem.

    It turned out that the magic build machine was on a KVM switch with another machine. I began the shutdown sequence of the person's other PC, since it was the one that happened to be selected on the KVM. When it got to the "it is now safe to shut down the computer" screen (yes this was way before Windows computers could turn themselves off), I reached down and hit the power switch of what I thought was the computer I had just shut down.

    The screen didn't change.

    I had just turned off the build machine, while it was in the middle of doing God Knows What.

     I switched the KVM over and turned it back on. BSOD.

    No one was ever able to revive the machine. The company had to figure out how to build its code all over again.
     



  • @RayS said:

    @smbell said:

    Two weeks later I called in to let the manager know I was going to be 15 minutes late getting to work.  I was met at the door with a 3 day suspension.  The day I got back from my suspension I was fired.  :(

    Umm wow that's a bit harsh for a single inexpensive mistake while trying to help out a customer (unless you made more than a few other WTFs that you didn't mention)!

     

    That is fucking harsh.  A motherboard costs all of $150. Unless you were a bad worker or made quite a few other mistakes he spent a LOT more than that finding and training a replacement.  What an idiot. 



  • @Licky Lindsay said:

    My first job as an actual programmer. 

    I worked at a place where there was exactly one computer that was set up to build all of our code. This infinitely precious machine was not, as some would assume, locked away in a server room with a RAID and UPS. It was just a tower PC sitting in someone's cube. One night, the person who normally operated this computer had to leave early, and I was staying late, so I was asked to shut it down before I left. It had been announced that the power was going to be shut off that night for maintenance, so it was important that this PC be shut down gracefully lest a hard shutdown from the power outage corrupt its filesystem.

    It turned out that the magic build machine was on a KVM switch with another machine. I began the shutdown sequence of the person's other PC, since it was the one that happened to be selected on the KVM. When it got to the "it is now safe to shut down the computer" screen (yes this was way before Windows computers could turn themselves off), I reached down and hit the power switch of what I thought was the computer I had just shut down.

    The screen didn't change.

    I had just turned off the build machine, while it was in the middle of doing God Knows What.

     I switched the KVM over and turned it back on. BSOD.

    No one was ever able to revive the machine. The company had to figure out how to build its code all over again.
     

     

    This isn't your WTF.  They should have stored the build scripts on backed-up network storage.

     

    Rule #1: Anything that is stored on only one disk will be lost eventually.



  • @tster said:

    @RayS said:

    @smbell said:

    Two weeks later I called in to let the manager know I was going to be 15 minutes late getting to work.  I was met at the door with a 3 day suspension.  The day I got back from my suspension I was fired.  :(

    Umm wow that's a bit harsh for a single inexpensive mistake while trying to help out a customer (unless you made more than a few other WTFs that you didn't mention)!

     

    That is fucking harsh.  A motherboard costs all of $150. Unless you were a bad worker or made quite a few other mistakes he spent a LOT more than that finding and training a replacement.  What an idiot. 

    The existing manager was the girlfriend of the previous manager (who had moved up the company, it was a chain of small computer shops).  We had other 'personality conflicts' (meaning I thought she was a flipping idiot, and I didn't kiss her a$$).



  • Bearing in mind that I'm a developer, not a sysadmin...

    I work for quite a small company, and we don't have a dedicated system administrator. The work is shared out between a few of us, as there isn't much day-to-day work to be done. I tend to look after the *nix systems, but sometimes I have to do work on our Windows boxes as well.

    This particular time, I was doing some work on our Exchange server. Somehow I managed to configure it to allow open relay... ooops. I didn't notice anything until one Monday I got into work, and the Exchange server was down. This isn't anything unusual (we have disk space issues on that machine, but that is another WTF in itself). I logged in to the system manager and saw that it was working through a backlog of about 10,000 spam emails! Doh!

    Suffice it to say, I am never making that mistake again.


     



  • How about the opposite -- a great save?

    I worked for a very large bank a long time ago (number 1 or 2 in the US).  They wanted to test a new version of TPF in their production environment -- you know, Transaction Processing Facility, the mainframe software that can handle 8000 transactions per second on a big enough mainframe.  TPF ran their ATMs and routed Visa transactions from merchants to the right banks.  TPF grew out of ACP, the Airline Control Program, which sells airline seats.  This TPF release had worked well in their test environment but they wanted to test some large simulated loads with real data.

    I suggested that the test be done on a VM system (Virtual Machine, like the Virtual Server for PCs) that we in the mainframe test department would configure to match the disk setup of the production system. 

    So I built such a VM system configured to run TPF, and to read the existing data from the production system's disks (about 500 disks) and write all changes to a bunch of spare disks.   My boss and I carried it over to the bank's production data center on a tape.  We restored that tape to a spare disk on the production system, shut down the ATMs and the Visa routers, and brought this VM system up on the production mainframe. 

    After testing for a while, all of a sudden the system "wrote" all over the "production" system's disks and completely screwed everything up.  Luckily, the changes were actually written to the spares.

    The computer operators and managers looked at me and said "How do we recover from this?".  I calmly said "Just shut down the VM system, bring up the production system instead of the test system, and all the production data will be untouched". 

    They were amazed, and from then on, all new releases on the production system were first tested under VM.

    They could have recovered the production data from backup tapes, given about 16 hours, with all of their ATMs and the Visa switching system down for that period of time.



  • Mine was yesterday. (And a "thank the saints" moment, to boot).

    Was testing out the new software upgrade, and working through a feature that we haven't used in the past - the ability to divide your world up into regions (so you can assign resources and customers and such, and work with smaller problems). The "previous administration" put us in one region called "TEST" (Seven years later, and we're still in TEST). So, I add two new regions, remove the original, click past the standard "you're deleting something" prompt.

     And then find out that the region is the primary key for all the customers. And all the delivery information. And... pretty much *everything*. I suddenly have a very, very empty database.

    The good news is that I had convinced the management that we really, *really* should have a test environment up for this sort of thing. But a year ago, that would have been the 24/7 "must be online" production system, and if I was really lucky, the backups happened the week before...

     



  • @Saladin said:

    Back when I was still in high school I was working in a similar position (a tech at a mom & pop computer shop, a three-man operation including me).  One day (this was when 1GHz processors were the brand new thing on the block), we got our first order for a 1GHz Athlon machine for a customer build, so we started work on it one afternoon.  After getting the CPU and motherboard seated, it was closing time so we left it there and went home.  The next morning, the boss was in to do some paperwork (he usually just came in for that; me and the other guy pretty much ran the place), and he says "Oh, cool!  A gigahertz processor?  I gotta see how fast this goes!" and fires up the machine, despite there being no CPU fan or heatsink on the processor.  Exactly six seconds later, a small popping noise can be heard (like a piece of bubble wrap being stepped on) and a little spiral of smoke wafts up from the (then quite expensive) processor.

     

    How's this: I was working on the first computer I ever owned (~1999ish). In order to get access to some drives I took the heatsink off. While rotating the case I managed to jiggle a USB cable which was plugged in. At this point I learned three things.

    1) Never work on a computer with the power cable plugged in

    2) A USB device can boot a system

    3) Never do either of the above with your thumb on the processor core

     

     


    Yeah, it hurt. 



  • Let's see...

    As a senior software developer, I'm asked to build a system to support the "grading" of all 50,000 employees for massive lay-offs...  I'm sorry - Redeployments to non-existent jobs.   30+ different custom forms.  All the HR data on everyone.  50,000 people nervously awaiting my work.  2 weeks to complete everything.  By myself, to minimize the collusion.

    Day 3 after the launch, running a 103 degree fever, but still in the office to keep an eye on things...   Between fits of vomiting, someone suggests it would be nice to normalize the collected data based on who submitted it. SQL 7 DB.  I'll just right click and update the table to add a column.   No problem.

    Day 4.  My manager is in my office when I arrive.  His manager is also in my office.  Her manager is on my speaker phone.

    Seems the Enterprise Manager table update didn't move the data first.  It just drops and re-adds the table with the additional column.  DOH!

    No problem.  I have daily backups and hourly transaction logs.  How bad could it be...  Just restore 2 backups, run some transaction logs....

    Guess what, the db is in Simple recovery mode.  Nothing in the transaction logs.

    Still running a 102 degree fever.  3 levels of management watching.  Restore backup from Day 3.  Restore backup from today.  Merge 2 databases back together.  Admit complete ignorance as to any changes that occurred in the missing 12 hour window.  Write groveling note to all 50,000 employees who now must go back into the system to verify their employment options.  Instill great confidence in same 50k employees that I'm not #$@ them over with other bugs...

      The only upside of the whole adventure is that business recovered before we finished printing up pink slips.  Of the 4k folks slotted to go, we only ended up dropping 14 true misfits.



  • When I was younger I worked at Burger King, usually at the drive-through counter. A customer made me extremely furious through the intercom and I repeatedly slammed my fists on the cash drawer, which happened to reset the Windows-based thin client that we took orders through and produced a BSOD, effectively shutting down the drive-through at peak lunch rush.

     

    ...I got to take orders by hand...

    ...With a very POed Store manager bagging my orders...

     
    fun.
     

