Backups shall not DoS the frakking server!


  • Discourse touched me in a no-no place

    So I was scraping ice off my car when the early shift op texts me. "Shit's going wrong. Everything is returning an insufficient storage error! There are no monitoring tickets for low disk space!"

    Back into the house, laptop out, VPN on.

    Yep. It's everything that writes to the primary Windows fileshare on the primary fileserver, which is a massive 3 TB VM with state-of-the-art 32-bit Windows, 2 vCPUs and 4 GB of RAM.

    I show 1 TB free, so that can't be it. This server is a POS. Last time it was down, it was Symantec Endpoint Protection intercepting DFS traffic as suspect. Time before that, it was because the backups pegged the CPUs and wouldn't yield to actual real work.

    This time, it's not responding to RDP but is responding to some (not all) file accesses.

    Just got an automated ticket for the backup agent failing. Opened one to shout at our glorious sysadmin corps to fix it. Instead of contacting me (the only name on the ticket) he somehow divines that he should talk to our much kinder op instead.

    So I'm getting status updates on my own ticket secondhand. 36 minutes after I submitted my ticket, he finally gives up trying to log in via the VM console (it's just hanging before the desktop) and hits the reboot button.

    To be continued.


  • sockdevs

    this promises to be good.

    :popcorn.gif:

    :-D


  • sockdevs

    @accalia said:

    this promises to be good.

    :popcorn.gif:

    :-D


    Enough to share?


  • Discourse touched me in a no-no place

    Incidentally, I cannot figure out any reason for us to use host-based backups. Storage is all SAN, and it's all VMs. So snapshotting to low-tier storage and offloading those snapshots to tape makes way more sense.

    Also, every host backup agent sucks horribly.


  • sockdevs

    sure is!

    /me hands @RaceProUK some :popcorn.mp4:


  • sockdevs

    @accalia said:

    :popcorn.mp4:

    You must like me a lot if you're giving me your better popcorn :kissing_heart:


  • sockdevs

    what can i say? i'm an old fashioned fox.


  • Discourse touched me in a no-no place

    20 minutes into reboot. Still waiting.


  • BINNED

    @Weng said:

    20 minutes into reboot. Still waiting.

    Ok, I was interested before, now I'm convinced that it's a popcorn situation as well.


  • Discourse touched me in a no-no place

    Yesterday we rebooted another fileserver. By this point in that reboot they were starting the process of restoring it from tape. No news of that kind yet today, though.


  • sockdevs

    @Onyx said:

    a popcorn situation

    /me hands @onyx :popcorn.png:

    sorry i'm fresh out of the animated popcorn. :-D


  • sockdevs

    @accalia said:

    :popcorn.png:

    At least his popcorn has full alpha transparency :laughing:


  • sockdevs

    true, very true. ;-)


  • area_deu


  • Discourse touched me in a no-no place

    Chkdsk....


  • BINNED

    @Weng said:

    Chkdsk....

    It was running chkdsk after reboot, or did you force it into one?



  • What do you mean? pngs can be animated.

    (note: won't work on browsers for babies)



  • @anonymous234 said:

    (note: won't work on browsers for babies)

    Yeah, it doesn't work in Chrome, but it works in Wishes It Was Chrome.


  • sockdevs

    CNR on Chrome for windows.
    CNR on IE11
    CNR on Chrome for Linux Mint
    CR on Firefox for windows (and that pegged one of my processor cores at 100%)
    CR on Firefox for Linux Mint (also pegged processor at 100%)

    @boomzilla, might want to avoid this part of the thread...


  • sockdevs

    @accalia said:

    CR on Firefox for windows (and that pegged one of my processor cores at 100%)

    Didn't peg a processor on my work laptop (Firefox on Windows)


  • sockdevs

    hmm. maybe that's not FF's fault. something else could be interfering (at work and goddess knows Symantec likes to "help")


  • sockdevs

    @accalia said:

    Symantec


    (been a while since I used that image)
    @accalia said:

    itnerfering

    :rolleyes:


  • sockdevs

    @RaceProUK said:

    (been a while since I used that image)

    not my choice. it's a work computer.

    for personal i just let MSE or windows defender (depending on which version of windows i'm running) take care of it for me. combined with staying away from Warez sites and other safe browsing habits they tend to keep me safe enough.


  • Notification Spam Recipient

    @accalia said:

    safe browsing habits

    What is that? Never heard of it.


  • sockdevs

    basically never click on an ad, and stay away from the shady parts of the internet.

    :laughing:


  • Notification Spam Recipient

    Does wearing a condom while watching porn count?



  • @accalia said:

    @boomzilla, might want to avoid this part of the thread...

    Eh...chrome just shows a static picture here. Nothing weird going on. Bounces for me on FF, but CPU usage seemed pretty low.



  • @accalia said:

    CR on Firefox for windows (and that pegged one of my processor cores at 100%)
    CR on Firefox for Linux Mint (also pegged processor at 100%)

    Gifs crash your browsers, videos crash your browsers, now 64KB animated images are crashing your browsers. Maybe we should start a fund to buy a Nokia 215 for everyone here who can't afford a better internet device.



  • @anonymous234 said:

    Gifs crash your browsers

    None of those ever crashed my browser. But some multi-dozen MB ones made that tab (and only that tab) really crawl.


  • sockdevs


  • sockdevs

    @boomzilla said:

    Bounces for me on FF, but CPU usage seemed pretty low.

    well then, that's a local phenomenon



  • www.caniuse.com/apng



  • @Gaska said:

    www.caniuse.com/apng

    As a "how did I tolerate everything else before I tried this" level Opera 11/12 fan, I am highly entertained by the Opera column. Despite the fact that switching browser engines isn't really why I'm still on old Opera (where I can get away with it); I'd rather have Dragonfly than Chrome dev tools (mostly because I know my way around it better) but I admit the engine switch was probably a good move.



  • I would still use Opera 12 if the internet worked with it.


  • BINNED

    @Bort said:

    Wishes It Was Chrome

    Opera Chromiclone? No dice. Worked in 12.

    @accalia said:

    CR on Firefox for windows (and that pegged one of my processor cores at 100%)
    CR on Firefox for Linux Mint (also pegged processor at 100%)

    And I'm also pretty sure it didn't do that.

    @kilroo said:

    Dragonfly

    OPERA ASA! SELL THIS SHIT AS STANDALONE IF YOU MUST! GIVE ME MY DRAGONFLY BACK! :sob:


  • Discourse touched me in a no-no place

    Automatic chkdsk on reboot. Still going...



  • @Weng said:

    every host backup agent sucks horribly

    That's also been my experience. Which is why every night I shut all five of our VMs down, snapshot all the LVM volumes on the host, start the VMs up again, then rsync the snapshot to two backup targets: one is the old VM host computer that our current server replaced, which sits on the bench next to the new one powered down until the backup script etherwakes it; the other is a tiny ARM server connected to a pair of full-disk-encrypted drives at my house, which I ping-pong with a second pair using a USB dock.

    The copy that goes over gigE to the old VM host takes about half an hour to complete, and the script includes code to make the copied system bootable on the old server. So if the new one dies (which it has done, once, despite having fully redundant everyfuckingthing) all I have to do to get us running again is switch over some Ethernet and USB UPS monitoring cables, remove the USB flash drive that the old server would normally boot from to run these backup jobs, power it up and let it boot the backed-up server image instead.

    After repairing the new server, migrating the old one back to it again is just a matter of running the exact same backup process; the new server gets booted from the same USB flash drive as the old one for use as a backup target.

    The copy that goes over the Internet to my house takes about 9 hours. If the school burns down, bare-metal recovery from that set onto new hardware is easy after booting that new hardware off that same USB flash drive again (of which there is, of course, an image included in the backup set). In fact I used this very procedure when I first set up our new server, so I know it works.

    I'm only running a primary school, not a huge industrial production site, so I can get away with using a single host computer and disk images in ordinary files on top of LVM. But I can see no reason why the same fundamental strategy should be difficult to implement on top of any storage architecture that has some kind of snapshot feature.

    I shut down our VMs before snapshotting the host because I don't have a 24/7 availability requirement; nobody notices if our servers are all down for three minutes in the middle of every night, and four out of the five VMs run Windows, which appreciates a regular nanna-nap. But given any journalled filesystem in the VMs, you could probably get away with not doing that.
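    For anyone who wants to try the same thing, here is a minimal sketch of that nightly sequence, not the actual script. It only builds the ordered list of commands so the ordering can be inspected before anything is run. Assumptions that are not in the post: the VMs are managed with libvirt's `virsh`, the disk images live on a mountable LVM volume, and the names `vg0`, `images`, `oldhost` and `home` are purely hypothetical. A real script would also have to wait for each guest to finish shutting down before snapshotting, and wake the backup target first.

```python
def build_backup_plan(vms, volume_group, volumes, targets, snap_size="5G"):
    """Return the nightly backup steps as a list of argv lists, in order.

    Nothing is executed here. Assumes libvirt (virsh) and LVM; all names
    passed in are the caller's, none come from the original post.
    """
    plan = []

    # 1. Stop every guest so the disk images are quiescent.
    #    (virsh shutdown is asynchronous; a real script must poll until
    #    the guests are actually off before moving on.)
    for vm in vms:
        plan.append(["virsh", "shutdown", vm])

    # 2. Snapshot each logical volume while nothing is writing to it.
    for lv in volumes:
        plan.append([
            "lvcreate", "--snapshot", "--size", snap_size,
            "--name", f"{lv}-snap", f"/dev/{volume_group}/{lv}",
        ])

    # 3. Bring the guests back up; downtime ends here.
    for vm in vms:
        plan.append(["virsh", "start", vm])

    # 4. Copy each snapshot to every target, then throw the snapshot away.
    for lv in volumes:
        snap_dev = f"/dev/{volume_group}/{lv}-snap"
        mount_point = f"/mnt/{lv}-snap"
        plan.append(["mount", "-o", "ro", snap_dev, mount_point])
        for target in targets:
            plan.append([
                "rsync", "-a", "--sparse",
                f"{mount_point}/", f"{target}/{lv}/",
            ])
        plan.append(["umount", mount_point])
        plan.append(["lvremove", "--force", snap_dev])

    return plan


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    steps = build_backup_plan(
        vms=["dc", "files"],
        volume_group="vg0",
        volumes=["images"],
        targets=["oldhost:/backup", "home:/backup"],
    )
    for step in steps:
        print(" ".join(step))
```

    The point of the ordering is that the guests are only down for steps 1 to 3; the slow copies in step 4 read from the snapshot while everything is already running again.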



  • FYI, if you have a Sony, Sharp or LG SmartTV, you're using Opera.


  • Discourse touched me in a no-no place

    And done. Now I get to root cause the fucking thing.



  • 4 hours to do a chkdsk!
    That's why you should never run your fileserver on a shitty, oops, sorry, Windows OS.

    Get a real fileserver: http://www.freenas.org/



  • @TimeBandit said:

    Get a real fileserver

    Or, even better, stop using commodity servers for serving files. Get storage hardware that presents the storage to the LAN directly, like one of the many NetApp products.



  • @Eldelshell said:

    FYI, if you have a Sony, Sharp or LG SmartTV, you're using Opera.

    I have a Daewoo TV in its teen years.



  • @Jaime said:

    Get storage hardware that presents the storage to the LAN directly, like one of the many NetApp products.

    What do you mean? Like expose it through iSCSI?
    Like FreeNAS does. You can share with Samba, NFS, iSCSI, FTP, etc.
    NetApp does not have the monopoly on file server functionality.

    But my comment was more about the actual filesystem. Your NAS should use a journaled filesystem like EXT4, etc.

    At least it won't take hours to scan the filesystem when you bring it back to life


  • Discourse touched me in a no-no place

    I'm pushing for it. Problem is that the OS servers have shitty encryption bullshit on them, and we only just bought a NAS head for the SAN that does encryption at rest, and our security people are not convinced it's enough to use that.

    As it is, we use Windows because I can't convince idiot managers that Linux can do CIFS reliably enough to work.



  • @Weng said:

    As it is, we use Windows because I can't convince idiot managers that Linux can do CIFS reliably enough to work.

    Just make them a demo.
    Set up a Linux fileserver with Samba, on the same network, and join it to the domain. Then upload a huge file to the Windows fileserver, then the same file to the Linux one. Time both operations.

    When they see that it takes a lot less time to push it to the Linux server, maybe they will see the light (tm)
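    A rough sketch of how that timing could be scripted, assuming both shares are already mounted or mapped as ordinary paths on the test client. The function names and the share labels are made up for illustration. It flushes and fsyncs each copy so the number is not just the local write cache, but caching and network noise still make a single run unreliable, so repeat it a few times before showing anyone the result.

```python
import os
import shutil
import time


def time_copy(source, dest_dir):
    """Copy source into dest_dir and return the elapsed time in seconds."""
    dest = os.path.join(dest_dir, os.path.basename(source))
    start = time.perf_counter()
    with open(source, "rb") as src, open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst, length=1024 * 1024)
        # Push the data out of local buffers before stopping the clock.
        dst.flush()
        os.fsync(dst.fileno())
    return time.perf_counter() - start


def compare(source, shares):
    """Copy source to each share and return {label: (seconds, MB per second)}.

    `shares` maps a label of your choosing to a directory path.
    """
    size_mb = os.path.getsize(source) / (1024 * 1024)
    results = {}
    for label, path in shares.items():
        elapsed = time_copy(source, path)
        # Guard against a zero reading on very small files.
        results[label] = (elapsed, size_mb / max(elapsed, 1e-9))
    return results
```

    Usage would be something like `compare("huge.iso", {"windows": "/mnt/winshare", "linux": "/mnt/sambashare"})`, where both paths are placeholders for wherever the shares are mounted.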



  • @TimeBandit said:

    What do you mean? Like expose it through iSCSI?
    Like FreeNAS does. You can share with Samba, NFS, iSCSI, FTP, etc.
    NetApp does not have the monopoly on file server functionality.

    Sure FreeNAS does it. But any server, whether it's Linux or Windows, is another moving part that just isn't necessary.

    NetApp is the corporate-friendly front runner of a huge class of NAS devices that all look just like file servers to clients on the network. They can simultaneously speak NFS, SMB, CIFS, HTTP, FTP, or a host of other protocols (I'm not including iSCSI because that would make it just another SAN product). I know @Weng works at a very conservative and backwards company, so the big name would make the product more palatable to them. There are a ton of other choices out there in the market, I'm sure there are some that won't send you into BlakeyRage. Sorry for mentioning NetApp - BTW, did they kick your dog or something?


  • Discourse touched me in a no-no place

    You say that like I can just spin up a VM. We have teams for that. They require payment up front for capex and 3 years of "administration" services and backup tapes. Storage, in particular, is around a buck a meg.

    That in turn requires budget. Anything over 5k requires VP approval. Our VP is on probation due to rampant incompetence, so everything he does requires CIO approval. Anything of strategic note goes to the board. I already have 2 "I need fucking servers" requests in front of the God damned board. Of a Fortune 500.

    We have an EMC SAN and own a NetApp-alike NAS. Security just hasn't blessed it for anything sensitive. And I need board-level approval to move to one of those once I convince them.



  • @Jaime said:

    But any server, whether it's Linux or Windows, is another moving part that just isn't necessary.

    And NetApp doesn't represent another moving part how?

    @Jaime said:

    There are a ton of other choices out there in the market, I'm sure there are some that won't send you into BlakeyRage.

    I am not into BlakeyRage :smile:

    @Jaime said:

    Sorry for mentioning NetApp - BTW, did they kick your dog or something?

    Nothing against NetApp, except the price.



  • @Weng said:

    We have an EMC SAN and own a NetApp-alike NAS. Security just hasn't blessed it for anything sensitive. And I need board-level approval to move to one of those once I convince them.

    Security hasn't approved a NetApp but they approved a Windows server?
    You should change your security team :wink:


  • Discourse touched me in a no-no place

    Windows with some shitty third party junk that breaks the file system APIs. I've complained about that, too. It's end of life with no replacement, vendor was bought by a patent troll, and support goes away next month.

    They're still trying to certify encryption-at-rest solutions to replace it. They have looked at everything on the market and don't like any of them. I have a come-to-Jesus meeting with them in like an hour where I'm proposing just using the SAN.

