ReDUMBdant UPS



  • Ok, we have been having power outages here lately in my building that supposedly our IT staff has prepared for.  Here is what usually happens.... 

    1. Power goes out
    2. Everyone without a UPS curses
    3. The few with a UPS laugh... (Including me).
    4. Servers stay running because they are on a UPS too.
    5. Those with a UPS keep working; everyone else goes out for lunch/coffee, etc. 

    Here is where the fun starts... The UPS in the server room starts freaking out within minutes of the blackout.  Our one IT staff member at this location is, of course, out sick, etc.  Typically I go in and see that, 'oh crap', the UPS is getting drained fast.  I frantically start typing on the console to start shutting the servers down.  I rarely get even one shut down before everything just dies.

     DAMN IT!!  3 hours later power comes back on and usually at least one server has lost its drive controller...  Typically it is our source control box or our dev box that loses some hardware.  2 days later everything is back to "normal".  Next week... wash, rinse, repeat.

    After the first time it happened, I told our local IT 'guru' Bob that we had to properly manage what was on the UPS because it just gets sucked dry too fast.  I told him there were a couple of servers that we used for legacy code that really didn't need to be on the UPS.  Bob of course blows me off, saying that there is no way the UPS only lasted a few minutes.  Second time it happens... same thing.  On the third time, our local guru's boss Jim is in town, and he is actually a pretty bright guy.  This time our dev box went down, so the developers were screwed for a while until they got the hardware fixed. 

     After talking with Jim about what was happening and our needs, he states 2 critical things:

    First WTF:  We have three $2500 UPSes sitting at our other location unused. 

    Second WTF:  Our servers have redundant power supplies in case one dies.  Each one can power the server by itself.  Our local genius of course plugged both power supplies on every server into the same UPS. 

     In summary, we have probably lost a week's worth of development for nine developers in the last couple of months alone that we could have avoided by using equipment we already had and properly setting up the existing equipment. 



  • Number 2 is not that big a WTF.

    When only one PSU is powered, it will take about twice the power that each takes when both are powered. Any differences are down to inefficiencies in the PSU, which change with load. This could actually make it worse to run on a single PSU!

    If you assume 92% efficient at 50% load and 86% efficient at 100% (80 Plus 'Gold' standard), and the server is drawing enough power to 100% load a single PSU, and assuming that this is 400W:

    * Single PSU running, efficiency = 86%, power out = 400W, so power in = 465.1W

    * Both PSUs running, efficiency = 92%, power out = 400W, so power in = 434.8W
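
    For anyone who wants to plug in their own numbers, here is a minimal sketch of the same arithmetic. The 400W load and the 86%/92% efficiencies are just the assumptions from this post, and `input_power` is a made-up helper name, not anything from a real PSU or UPS tool.

```python
def input_power(output_watts: float, efficiency: float) -> float:
    """Wall-side draw for a PSU delivering output_watts at a given efficiency (0-1]."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return output_watts / efficiency


# Assumed figures from the post above, not measurements.
LOAD_W = 400.0        # power the server draws from its PSU(s)
EFF_FULL_LOAD = 0.86  # assumed efficiency of one PSU at 100% load
EFF_HALF_LOAD = 0.92  # assumed efficiency of each PSU at 50% load

# One PSU carries the whole load at its full-load efficiency.
single_psu_in = input_power(LOAD_W, EFF_FULL_LOAD)

# Two PSUs each carry half the load at the half-load efficiency.
dual_psu_in = 2 * input_power(LOAD_W / 2, EFF_HALF_LOAD)

print(f"single PSU: {single_psu_in:.1f} W in")
print(f"both PSUs:  {dual_psu_in:.1f} W in")
print(f"extra draw on a single PSU: {single_psu_in - dual_psu_in:.1f} W")
```

    Under those assumptions the single-PSU case pulls roughly 30W more from the UPS, which is the point: the total drain barely changes whichever way the supplies are plugged in.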



  • I thought about that. I should have clarified it and said that each PSU on a server should go to a separate UPS.



  • @NCBloodhound said:

    Ok, we have been having power outages here lately in my building that supposedly our IT staff has prepared for.  Here is what usually happens.... 

    1. Power goes out
    2. Everyone without a UPS curses
    3. The few with a UPS laugh... (Including me).
    4. Servers stay running because they are on a UPS too.
    5. Those with a UPS keep working; everyone else goes out for lunch/coffee, etc. 

    You missed the last step. Those without a UPS enjoy their coffee and laugh at those with a UPS.



  • @NCBloodhound said:

    I thought about that. I should have clarified it and said that each PSU on a server should go to a separate UPS.

    Ah, yes - now it makes sense as a WTF


  • @SuperousOxide said:

    @NCBloodhound said:

    Ok, we have been having power outages here lately in my building that supposedly our IT staff has prepared for.  Here is what usually happens.... 

    1. Power goes out
    2. Everyone without a UPS curses
    3. The few with a UPS laugh... (Including me).
    4. Servers stay running because they are on a UPS too.
    5. Those with a UPS keep working; everyone else goes out for lunch/coffee, etc. 

    You missed the last step. Those without a UPS enjoy their coffee and laugh at those with a UPS.

     

    That's what I was thinking.  As one of my coworkers says, "if you're going to shit, shit on the clock."   



  • @SuperousOxide said:

    That's what I was thinking.  As one of my coworkers says, "if you're going to shit, shit on the clock."   

    Deadlines are deadlines though.  They still had to get their stuff done.  :)



  • @Digitalbath said:

    As one of my coworkers says, "if you're going to shit, shit on the clock."

    I am intrigued by your co-worker and his bizarre scatological behavior.  Does he prefer shitting on analog or digital clocks?  Or is there really no difference?  I would think you'd avoid shitting on a clock that was plugged into AC power, as a particularly wet defecation could result in electrocution.  Does the clock have to be keeping time or can the hands remain motionless?  Does it have to be set to the correct time?  What does he do when he is in public and there are no available clocks? Does he just hold it or will a wristwatch suffice?  Can he clean the clock/wristwatch after shitting on it?



  • @SuperousOxide said:

    You missed the last step. Those without a UPS enjoy their coffee and laugh at those with a UPS.
    Where I used to work, it was more like "Guy without UPS that had not saved his 5-hour work of his life jumps out the window" or something like that. Oh, and those who did have UPS sometimes had like 1 minute battery life, just enough to save work, and then go for the aforementioned coffee. ;)



  • @danixdefcon5 said:

    Where I used to work, it was more like "Guy without UPS that had not saved his 5-hour work of his life jumps out the window"
    How I sometimes wish this was actually true.  That way, by Natural Selection, the people who don't learn to regularly save their work (be it a Word document or some code) can get culled out.  I was in middle school back in the days of Windows 3.11 and learned to save after every paragraph.  Maybe the Auto-Save feature needs to be eliminated so that people will learn to regularly save anything they are working on?



  • @WeatherGod said:

    I was in middle school back in the days of Windows 3.11 and learned to save after every paragraph.
     

    Yeah, I think we've all been there back in the early Win95 days in school, losing hours of work cause we were too lazy to save. Since then, my fingers have learned to use idle time (thinking, not typing) to save files. :w<CR>



  • I do exactly the same thing - whenever I finish a thought, I automatically follow it up with Ctrl+S.  This, however, is always lots of fun in environments where Ctrl+S means something other than "save" :)

     

    Like... this edit box, f'rinstance.  Firefox just offered to download a copy of the "Reply to an Existing Message" page for me.  Thanks ><



  • @Albatross said:

    I do exactly the same thing - whenever I finish a thought, I automatically follow it up with Ctrl+S.  This, however, is always lots of fun in environments where Ctrl+S means something other than "save" :)

    Oooh yes. I used to have the instinct of pressing command+S (remember, these combinations came from Mac!) until I was forced to use the Spanish OS versions. These insist on changing the standard combinations to a local equivalent, so Ctrl+S does nothing, as well as Ctrl+F. At least the cut/paste sequences are still the same...

    These days, most of my work involves coding, so I save every time I *finish* a change. Even if I lose power during some critical change, any boo-boo caused by incomplete code can be fixed with a simple diff against the repository. Oh, and fortunately the IDE is in English!



  • I had a client plug a lamp and a laser printer into the same UPS as their server, and then wonder why the UPS died nearly instantly.

    I fought constantly, over the year that I was on that project, to get someone technically savvy in the building when it came time to install said server + UPS, but we were well over 1000 miles away from the client's locations, and the client was too cheap to fly someone out, or hire someone.



  • @danixdefcon5 said:

    @Albatross said:

    I do exactly the same thing - whenever I finish a thought, I automatically follow it up with Ctrl+S.  This, however, is always lots of fun in environments where Ctrl+S means something other than "save" :)

    Oooh yes. I used to have the instinct of pressing command+S (remember, these combinations came from Mac!) until I was forced to use the Spanish OS versions. These insist on changing the standard combinations to a local equivalent, so Ctrl+S does nothing, as well as Ctrl+F. At least the cut/paste sequences are still the same...

    I wind up accidentally deleting a line when I use emacs for a while, then switch to a windowsy editor where ^X means cut, while unconsciously saving every few seconds.  It surprises me that most people don't learn the 'nervous save' habit early.  Maybe it's because they saved their work on floppies in school.



  •  Reminds me of a place I was in for a short while. It was a smallish company, retailer with 20 or 30 shops around the country and a smallish head office with about 10 "I.T. Guys", of which I was one. 

     We'd been having an issue with an air con unit in the primary server room (which housed pretty much everything that the company needed to work - Exchange, SAP, web site, etc). So a guy comes out to fix it, a member of the I.T. "operations" team brings him into the server room and about 30 seconds later, everything goes down. Every machine in the building loses network connectivity.

    Cue people running in and out of the server room, scratching their heads and panicking, while the rest of the office went to get coffee and bitch about "a whole morning's work, gone".

    Turns out it was just the network switches which went down, so when they came back up, very little was lost, and only Exchange needed a reboot.

     The WTF is that the member of our I.T. team went into the room, couldn't locate the Air Con unit, so he went over to one of the four UPSes (with a big "UPS" written on it), assumed that it must be the air con ('cause it was huge), lifted up the plastic casing protecting the "Emergency Shut Off" big red button, and pressed it.

    Officially, the air con repair guy was blamed.

    Of course, things like this happened all the time there. One day a developer was working on an older server that wasn't being used for much except his stuff, so he walks into the server room, presses a power switch on a server, before the wailing and gnashing of teeth started outside. Apparently that was his method for rebooting a machine. And of course, he chose the wrong server, despite them being very clearly labelled.

    At least they changed the code on the server room door after that.

     

