Disaster recovery drill
-
Preparation phase includes steps like "allocate sufficient SAN to allow for replication." and "Ship tapes to DR site".
-
Uh. Ok?
Google's best guess for me on "DR" is: Washington State Department of Retirement Systems
-
Disaster recovery. As in failover. It's even in the title.
Do I also need to define SAN?
-
No, I knew SAN. It might help me to know what the hell this thread is about, though. What's the WTF?
-
In order for a disaster recovery site to work, it needs a copy of whatever data was just disastered away into the aether. Allocating enough space to replicate that data and getting physical copies shipped for things that cannot replicate are things you do when setting up a DR site, not when testing it.
Why? Because when a real disaster occurs you don't have the ability to do those things anymore because the originals are gone.
-
That's why it's under the heading "preparation phase".
Am I crazy here? WTF.
-
Yeah. Presumably in real DR, you'd use whatever your last backup / replication was. In a test, you probably want to keep going and not lose any actual data. But I guess it depends on whether you plan to use your DR stuff as production for the test or if you're just testing other stuff.
-
Do I also need to define SAN?
Of course not! SAN is the measure of your current sanity in the game of Call of Cthulhu you're playing!
-
Preparation for the test exercise. Not the multi zillion dollar DR system itself.
-
Just played Arkham Horror a few days ago. Fantastic stuff.
-
Preparation for the test exercise. Not the multi zillion dollar DR system itself.
Protip: I am not telepathic.
-
AKA the best dump stat.
-
That's a familiar approach to disaster planning.
"So if the Russians are to invade, we'd prefer them to do it between Mondays and Fridays?"
-
Worked for us with the Jap...oh.
-
Btw, it's kind of interesting to see that most DR plans simply stop when the DR site is up and working, but almost never mention how to stand down the DR site and merge data back once the production site is up again.
Once upon a time, a client of one of my ex-companies had their server go down. The DR plan executed successfully. But when the server came back, half of their staff (the staff located in the main building) found they were entering data on the production server, while the staff located elsewhere found they were still entering (and querying) data on the DR site. The situation was a mess...
-
Once upon a time, a client of one of my ex-companies had their server go down. The DR plan executed successfully. But when the server came back, half of their staff (the staff located in the main building) found they were entering data on the production server, while the staff located elsewhere found they were still entering (and querying) data on the DR site. The situation was a mess...
The Discourse Recovery thread is
-
Protip: I am not telepathic.
It's funny, when you utter something ambiguous it's our fault for getting it wrong, but when somebody else does it, it's not your fault.
-
IIRC? WTF is it w/ u people & all ur abbrevs?!?
-
Wait, that's not a question.
-
WTF DOES WTF MEAN HELP
-
Btw, it's kind of interesting to see that most DR plans simply stop when the DR site is up and working, but almost never mention how to stand down the DR site and merge data back once the production site is up again.
Oh, that's easy. You just declare DR to be the new Prod, and Prod to be the new DR.
Then the next time a disaster happens, you just execute the DR plan in reverse.
-
That only works if your DR site equipment is identical to the production equipment.
In the case I mentioned, the production site has 3 web servers load-balanced with 2 database servers, while the DR site has 1 web server plus 1 database server only.
Worse, the production servers use SAN and the DR servers use NAS for file storage. The site response time difference is huge.
Filed under: You get what you paid for
-
I was mostly joking, although (considering there are plenty of "experts" of a certain caliber out there that might actually consider this sort of thing a good idea - as this site keeps providing evidence for on a daily basis) I admit I may have been too subtle about it.
-
That only works if your DR site equipment is identical to the production server.
Isn't TRWTF not having the same equipment at your DR site as at your production site? Virtualisation has made this easier, as long as you're maintaining the DR images properly.
-
Many years ago, my coworkers were assigned to a DR drill. A virgin machine - just the OS - was provided off site and they had to get the application running in a day. It became a very long day.
When they came back, I asked how it went.
"Excellent: It was a complete disaster."
The people who knew how much this drill had cost were not amused.
-
The server has some periodic tasks that take 8+ hours to run on those 4 quad-core Xeon servers. You wouldn't have wanted to run them on virtualized hardware at the time.
Actually, TRWTF in that hardware setup is using tapes to do the backup.
-
In that case, this
the production site has 3 web servers load-balanced with 2 database servers, while the DR site has 1 web server plus 1 database server only.
seems like a really bad idea.
-
Actually, TRWTF in that hardware setup is using tapes to do the backup.
Isn't tape storage, at scale, still the most efficient in a bytes-per-cubic-meter and bytes-per-dollar sense?
-
Virtualisation has made this easier, as long as you're maintaining the DR images properly.
On the contrary, virtualization has made it easier to buy a single underpowered VMware host and call it the DR solution for your entire datacenter. In the "bad old days", you needed identical hardware just to get your bare metal backups to restore properly.
-
Yeah.
The DR for my application drops all the redundant cluster members because, apparently, application VMs that basically never change and store no data are impossibly expensive to replicate.
-
It should be, but they had somehow chosen to go with only 20 GB per cassette, which meant that each week multiple stacks of tapes had to be sent to the offsite backup area.
-
Seems the contract also said "four nines" availability; maybe that somehow convinced management it wasn't a problem.
The DR server did act as warm standby, though.
-
On the contrary, virtualization has made it easier to buy a single underpowered VMware host and call it the DR solution for your entire datacenter.
Hmm. OK, I'll amend my statement: Virtualisation has made this easier for my company, and for some reason which is now inexplicable to me, I naïvely thought that most companies would probably be doing things the right way. (Absent the single word "underpowered", I don't see a problem with your statement, if your production environment is running the same way.)
In the "bad old days", you needed identical hardware just to get your bare metal backups to restore properly.
That was actually my point. Not having to maintain two identical collections of obsolete boxes is precisely the thing that virtualisation has helped us with. Having the weird stuff as VM images is easier.
-
See if you can convince them to switch to "nine fours".
-
No longer work there, so it's none of my business now.
-
Isn't tape storage, at scale, still the most efficient in a bytes-per-cubic-meter... sense?
In these days of 128GB Micro SD cards?
-
apparently SONY makes 185TB tapes...
185 × 1024 / 128 = 1480
so one tape is equivalent to 1.48k microsd cards
the tape appears to be about .5"x3"x4"
which with some fancy math gives us 6 cubic inches
according to wikipedia and a little math a microsd card is ~=0.01 cubic inch.
1.48k of those gives us ~14.77 cubic inches.
looks like tape still wins!
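The back-of-the-envelope math above is easy to sanity-check in a few lines of Python. All the dimensions are this thread's rough estimates, not official specs:

```python
# Volume comparison: one 185 TB tape cartridge vs. the equivalent
# capacity in 128 GB microSD cards. Dimensions are rough estimates
# from this thread, not official specifications.

TAPE_CAPACITY_GB = 185 * 1024        # 185 TB expressed in GB
CARD_CAPACITY_GB = 128

cards_needed = TAPE_CAPACITY_GB / CARD_CAPACITY_GB   # 1480 cards

tape_volume_in3 = 0.5 * 3 * 4                 # estimated cartridge: 0.5" x 3" x 4"
card_volume_in3 = (15 * 11 * 1) / 25.4 ** 3   # microSD is 15 x 11 x 1 mm

cards_volume_in3 = cards_needed * card_volume_in3

print(f"cards needed: {cards_needed:.0f}")           # 1480
print(f"tape volume:  {tape_volume_in3:.1f} in^3")   # 6.0 in^3
print(f"cards volume: {cards_volume_in3:.1f} in^3")  # ~14.9 in^3 -- tape wins
```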
-
I think one of the HDD manufacturers managed better than that.
the tape appears to be about .5"x3"x4"
That's not a very long tape.
-
the tape cartridge looks to be about that big.... i'm trying to find specs on that now, but SONY's press release is rather bare on that detail so i'm estimating based off of the image in the gizmag article i found by googling
-
BTW - the Sony tape:
The tape holds 148 gigabits (Gb) per square inch - beating a record set in 2010 more than five times over.
Toshiba's high density HDD:
Toshiba has managed to hit one terabit per square inch (1Tbit/in2)
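For what it's worth, the two quoted areal densities can be compared directly. This is a sketch that assumes "one terabit" is the decimal terabit (1000 Gb), since press releases rarely say:

```python
# Areal density comparison from the two press releases quoted above.
# Assumes the decimal terabit (1000 Gb) for the Toshiba figure.

sony_tape_gb_in2 = 148      # Sony tape: 148 Gb per square inch
toshiba_hdd_gb_in2 = 1000   # Toshiba HDD: 1 Tbit per square inch

ratio = toshiba_hdd_gb_in2 / sony_tape_gb_in2
print(f"HDD areal density is ~{ratio:.1f}x the tape's")  # ~6.8x
```

Areal density isn't the whole story for volume efficiency, though: a cartridge winds hundreds of metres of tape into one box, while a drive only holds a handful of platters.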
-
I'm not sure they've actually made them yet, just announced that they could.
I think you might be right with the dimensions of a normal tape cartridge. I thought they were larger. TIL.
-
Toshiba's high density HDD:
great. now i have to do the math on those too!
I'm not sure they've actually made them yet, just announced that they could.
fair enough.... but if they can and i'm right on dimensions....
I think you might be right with the dimensions of a normal tape cartridge.
i am guessing based on the image... and assuming that the hand holding the pictured cartridge is roughly standard sized.
I thought they were larger. TIL.
Honestly? so did i.
-
What's the stability of the information? Tape's not much good if you can't get the data out again properly when you want it.
-
What's the stability of the information?
-shrug- Not sure, it's only an abstract. But that being said, there would be three orders of magnitude fewer tapes than microSD cards, and the tapes are WAY easier to manipulate than a shoebox full of microSD cards....