Hard disk tango



  • This is more of an oddity, but a WTF nonetheless in my opinion. Our development server crashed in a blaze of glory today: both SCSI hard disks failed. What we found out in the postmortem was quite astounding. If anyone can find some logic in this, please let me know. The system was apparently using RAID 0, with both disks acting as one drive. This one drive was then partitioned into two separate drives. Please tell me there is a reason for this!!!

    The good news is we managed to get 2 new servers out of this :) My company likes to say they hemorrhage money. (We have a very unique contract with the state that makes this attitude possible.)



  • The same reason you would partition any drive. Raid 0 will simply make the whole deal faster, if you're only reading/writing to one "drive" at a time.
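
    To make the layout concrete, here is a minimal sketch of how blocks on a 2-disk RAID 0 volume could map to physical disks. The stripe unit of one block, the 16-block volume and the partition boundaries are assumptions for illustration, not details from the failed server.

```python
def raid0_locate(logical_block: int, num_disks: int = 2) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk).

    Assumes a stripe unit of exactly one block, which is a simplification.
    """
    return logical_block % num_disks, logical_block // num_disks


# Hypothetical partition table laid over the single striped volume.
partitions = {"part1": range(0, 8), "part2": range(8, 16)}


def disks_used(blocks, num_disks: int = 2) -> set[int]:
    """Return the set of physical disks that hold any of the given blocks."""
    return {raid0_locate(b, num_disks)[0] for b in blocks}


for name, blocks in partitions.items():
    print(name, "lives on disks", sorted(disks_used(blocks)))
```

    Both partitions end up spread across both disks, so each one gets the striping speed-up, and each one is lost if either disk dies.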



  • The Real WTF is RAID 0. If you want speed that bad, why not go for RAID 5 instead? You won't lose data if only one disc dies in RAID 5, plus you get something of a speed boost (albeit not as good as RAID 0). To me, the peace of mind that you're probably not going to lose your data outweighs the loss in performance.

    Partitioning a RAID 0 isn't a WTF, since RAID 0 is faster than one hard drive alone.

     /running RAID 1 at home
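
    For anyone wondering why one dead disc doesn't lose data in RAID 5, here is a rough sketch of the XOR parity idea. The block contents and the helper name `xor_blocks` are made up for illustration; real arrays also rotate the parity block across the disks, which this ignores.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks in one stripe (illustrative contents) plus their parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the disk that held data[1]: XOR of the survivors and the
# parity block gives the missing block back.
survivors = [data[0], data[2], parity]
recovered = xor_blocks(survivors)
print(recovered == data[1])
```

    RAID 0 has no such parity block, so nothing can be rebuilt once a disk is gone.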



  • How is RAID 5 a speed boost? It's a lot slower than RAID 0, depending on how many disks we're talking about here. How could writing half the data to each drive be slower than spreading the data and parity across all drives on the bus?

     

    @Welbog said:

    The Real WTF is RAID 0. If you want speed that bad, why not go for RAID 5 instead? You won't lose data if only one disc dies in RAID 5, plus you get something of a speed boost (albeit not as good as RAID 0). To me, the peace of mind that you're probably not going to lose your data outweighs the loss in performance.

    Partitioning a RAID 0 isn't a WTF, since RAID 0 is faster than one hard drive alone.

     /running RAID 1 at home



  • RAID 5 is faster than a single disc.

     "The read performance of RAID 5 is almost as good as RAID 0 for the same number of disks. Except for the parity blocks, the distribution of data over the drives follows the same pattern as RAID 0. The reason RAID 5 is slightly slower is that the disks must skip over the parity blocks." - Wikipedia



  • @Welbog said:

    RAID 5 is faster than a single disc.

     "The read performance of RAID 5 is almost as good as RAID 0 for the same number of disks. Except for the parity blocks, the distribution of data over the drives follows the same pattern as RAID 0. The reason RAID 5 is slightly slower is that the disks must skip over the parity blocks." - Wikipedia

    That's only part of the story. RAID 5 has considerably higher CPU demands for writing than a single disk, if you don't have a true hardware raid controller (the thing on your motherboard is not one). If you don't have CPU power to spare, it's considerably slower at writing, although it may be slightly faster at reading.



  • I run (4 disk, sata) raid 5 on a machine at home, and it never gets above about 5% cpu usage writing at full speed (about 200MHz).

    I'm using a pseudo-hardware raid controller, a Silicon Image 3114 add-in card that I flashed to the raid-5 capable bios myself. The card can't handle doing the parity calculations itself, so it triggers an IRQ and gets the cpu to do it. I don't think the driver gets involved, because it's possible to set up a bootable raid-5 array (though I'm not using that ability).

    Raid-5 is nearly as fast as Raid-0 when reading, generally about the speed of a one disk smaller raid-0 array (which works out as the same usable capacity). A raid-0 array could read from 3 disks simultaneously (3x the speed of one disk), a raid-5 would have 4 disks, in total reading at 3x the speed of one disk. In a 4-disk raid-5 array every 4th block would be redundancy information that it had to skip when reading, so each disk would be reading at 3/4 normal speed, making the entire array read at 3x the speed of one disk. (Skipping a sector generally takes the same amount of time as reading it, the disk still has to spin the same distance).

    Writes are much slower for small (1 sector) writes, as it has to read all the sectors in the same stripe, recalculate the parity and then write the new sector and parity. For large writes (all the sectors in a stripe) it doesn't need to read the old data, so it will write at the speed of a 1-disk smaller raid-0 set (so a 4-disk raid 5 writes at 3x the speed of a single disk). This is assuming that the raid-5 implementation is good enough to calculate the parity at the same time as doing the writes, and caches small writes for long enough to find out whether it's part of a larger write. Some raid-5 implementations are really bad, and end up thrashing the hard-disk on writes and going incredibly slowly.
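
    The read arithmetic above can be sketched like this. It is an idealised model under the same assumptions as the post (skipping a parity block costs as much time as reading it, and nothing else is the bottleneck); the function names are made up for illustration.

```python
def raid0_read_speed(disks: int, disk_speed: float) -> float:
    """Ideal sequential read speed: every disk delivers useful data."""
    return disks * disk_speed


def raid5_read_speed(disks: int, disk_speed: float) -> float:
    """Ideal sequential read speed when 1 block in `disks` is parity.

    Each disk spends 1/disks of its time passing over parity, so it
    delivers useful data at (disks - 1) / disks of its normal speed.
    """
    per_disk = disk_speed * (disks - 1) / disks
    return disks * per_disk


# Speeds in multiples of one disk's speed.
print(raid5_read_speed(4, 1.0))  # 4-disk raid-5
print(raid0_read_speed(3, 1.0))  # 3-disk raid-0, same usable capacity
```

    Both lines come out at 3x a single disk, which is the "one disk smaller raid-0" rule of thumb.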



  • I was adding to the last post (via edit) when it timed out and wouldn't let me submit my changes :-(

    Here's the extra info:

    The Sii3114 is the same chip as most sata raid-5 motherboards.

    Even degraded (missing a disk), a raid-5 array should come pretty close to its normal speed on a good controller.

    A 4-disk raid-5 set can reach up to 200MB/s on large reads, which is more than enough to saturate a 1Gbps network connection. In fact even a single disk can manage up to 66MB/s, easily saturating a normal household 100Mbps network connection (and making a fair dent in a 1Gbps line). This shows that for a file server you wouldn't really worry about the speed of an array, just its capacity and reliability, and from that point of view raid-5 is really hard to beat. On the other hand, pc ram is on average 3200MB/s (DDR400), so for a machine doing a lot of data work itself (eg a stupid-huge database) that is going to need to continuously swap data in and out of ram, you would want all the speed you can get. If destroying your data in the case of a failure isn't a problem then go for raid-0, but raid-5 isn't far behind in speed and it can survive the death of a disk.
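
    A quick way to check the disk-versus-network numbers, as a sketch. It assumes 8 bits per byte and ignores all protocol overhead, so real links carry somewhat less than this; the names are made up for illustration.

```python
def link_capacity_MBps(link_mbps: float) -> float:
    """Convert a link speed in megabits/s to megabytes/s (no overhead)."""
    return link_mbps / 8


def saturates(source_MBps: float, link_mbps: float) -> bool:
    """True if the storage can supply data at least as fast as the link."""
    return source_MBps >= link_capacity_MBps(link_mbps)


# Figures quoted in the post, in MB/s and Mbps respectively.
sources = {"single disk": 66, "4-disk raid-5": 200}
links = {"100Mbps": 100, "1Gbps": 1000}

for source, speed in sources.items():
    for link, mbps in links.items():
        share = speed / link_capacity_MBps(mbps)
        print(f"{source} on {link}: {share:.0%} of link capacity")
```

    Anything at or above 100% means the network, not the array, is the limit.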

     



  • @Thief^ said:

    I run (4 disk, sata) raid 5 on a machine at home, and it never gets above about 5% cpu usage writing at full speed (about 200MHz).

    5% of the total CPU capacity is about right on modern hardware, yes. If you don't have that 5% to spare, the system will grind very slowly.



  • @asuffield said:

    @Thief^ said:

    I run (4 disk, sata) raid 5 on a machine at home, and it never gets above about 5% cpu usage writing at full speed (about 200MHz).

    5% of the total CPU capacity is about right on modern hardware, yes. If you don't have that 5% to spare, the system will grind very slowly.

    It's a hardware interrupt, it WILL get its 5%.



  • @Thief^ said:

    @asuffield said:
    @Thief^ said:

    I run (4 disk, sata) raid 5 on a machine at home, and it never gets above about 5% cpu usage writing at full speed (about 200MHz).

    5% of the total CPU capacity is about right on modern hardware, yes. If you don't have that 5% to spare, the system will grind very slowly.

    It's a hardware interrupt, it WILL get its 5%.

    Every scheduling operation in most modern operating systems is a hardware interrupt, usually the system clock. All of them "will" get their 5%, just very slowly if there isn't enough to go around. Prioritisation is up to your kernel. It will screw up if it has to triple-schedule disk transfers (once out for the data, once back for the data to receive the parity transform, once out again for the computed parity) and doesn't have enough CPU time to go around.



  • True, but any code it runs will be in system-space, not program-space, and will get priority. Only another bit of hardware or a driver should be able to slow down your raid controller; ordinary software and even Windows services shouldn't be able to.



  • Apparently, few of you place any stock in write performance.   RAID-5 performance is much lower than a single disk during write operations larger than its stripe size.  This is why RAID-5 works great with databases (most records fit in a 32-128KB stripe) and not so well with file servers.  Adding more spindles to the array mitigates this as long as the RAID controller's processor can handle it.



  • @Thief^ said:

    True, but any code it runs will be in system-space, not program-space, and will get priority. Only another bit of hardware or a driver should be able to slow down your raid controller; ordinary software and even Windows services shouldn't be able to.

    That's entirely dependent on your kernel to implement. Windows doesn't do that (its scheduler is notoriously poor), and Linux takes a different approach (that still doesn't work with raid controllers that have to queue the data multiple times).

    Doing things the way you suggest would just end up stalling high-priority userland processes while low-priority data is pushed through the controller. This wouldn't really be an improvement.



  • @operagost said:

    Apparently, few of you place any stock in write performance.   RAID-5 performance is much lower than a single disk during write operations larger than its stripe size.  This is why RAID-5 works great with databases (most records fit in a 32-128KB stripe) and not so well with file servers.  Adding more spindles to the array mitigates this as long as the RAID controller's processor can handle it.

    You have that the wrong way round: raid 5 writes are poor if the write is SMALLER than the stripe size, because it has to read the rest of the stripe before it can calculate the parity. With larger writes it knows the whole stripe is being replaced, so it doesn't need to read the old stripe to calculate the parity correctly.
     

    Basically:

    Small write: Read whole stripe, calculate parity, write new data and parity.

    Large write: calculate parity, write new data and parity.

    The first stalls because it can't calculate the parity until it's read the old data, and then has to re-seek back to the same place to write the new data. In the second it can calculate the parity while the data is in the write queue, with no slow-down. At least, as long as your raid controller isn't so crap it performs large writes using the small write method.
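
    Here is a toy model of the two cases, counting disk operations. It is only a sketch: the `Stripe` class, the block contents and the counters are invented for illustration, parity is plain XOR, and real controllers add caching and other shortcuts on top.

```python
from functools import reduce


def parity_of(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class Stripe:
    """One stripe of a toy raid-5 set: N data blocks plus one parity block."""

    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = parity_of(self.data)
        self.reads = 0   # disk reads issued while writing
        self.writes = 0  # disk writes issued

    def small_write(self, index, new_block):
        # The rest of the stripe must be read before parity can be computed.
        others = [b for i, b in enumerate(self.data) if i != index]
        self.reads += len(others)
        self.data[index] = new_block
        self.parity = parity_of(others + [new_block])
        self.writes += 2  # the new data block and the new parity block

    def large_write(self, new_blocks):
        # The whole stripe is replaced, so nothing old needs to be read.
        self.data = list(new_blocks)
        self.parity = parity_of(self.data)
        self.writes += len(self.data) + 1  # all data blocks plus parity


small = Stripe([b"aaaa", b"bbbb", b"cccc"])
small.small_write(1, b"ZZZZ")
print("small write:", small.reads, "reads,", small.writes, "writes")

large = Stripe([b"aaaa", b"bbbb", b"cccc"])
large.large_write([b"XXXX", b"YYYY", b"ZZZZ"])
print("large write:", large.reads, "reads,", large.writes, "writes")
```

    The small write has to wait on its reads before it can write anything, which is where the stall and the extra seek come from; the large write goes straight to writing.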
     

