[PLUG] Pause booting?

chris (fool) mccraw gently at gmail.com
Thu Nov 20 18:15:42 UTC 2008


On Thu, Nov 20, 2008 at 09:54, Rogan Creswick <creswick at gmail.com> wrote:

> I hope one of the things learned is that software-raid is a royal pain
> in the ass with relatively little benefit.  If you *need* raid, do it
> in hardware so the OS doesn't get this sort of option to muck things
> up.

wow, i totally disagree.  i've used software raid to great success in
many situations, and have had no trouble evicting dead disks and
inserting new ones.  i guess i don't "need" raid anywhere i use
software raid--i could just be without a computer for a while.  but in
my two sub-$400 home computers, adding a raid card (not even possible,
since neither has a spare slot, both being small-form-factor) would
add significant expense.  and at work, well, the company isn't buying
a raid card and i'm not paying for hardware to use at the office.
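for the curious, swapping a dead member out of a linux md array really
is just a few mdadm commands.  here's a sketch of the usual sequence
(the device names /dev/md0, /dev/sdb1, /dev/sdc1 are made-up examples;
the helper only *prints* the commands so you can eyeball them before
running anything as root):

```shell
#!/bin/sh
# print (not run) the mdadm commands to replace a failed array member.
# all device names here are hypothetical examples.
replace_disk() {
  md=$1 failed=$2 new=$3
  printf 'mdadm --manage %s --fail %s\n'   "$md" "$failed"
  printf 'mdadm --manage %s --remove %s\n' "$md" "$failed"
  printf 'mdadm --manage %s --add %s\n'    "$md" "$new"
}

# e.g. kick sdb1 out of md0 and resync onto a fresh sdc1:
replace_disk /dev/md0 /dev/sdb1 /dev/sdc1
```

run the printed commands by hand (as root) once they look right; the
rebuild then shows up as a progress bar in /proc/mdstat.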

furthermore, i've had some nightmares with hardware raid--take for
example the PERC cards in dell servers.  the only reasonable way to
monitor them remotely, if you aren't running one of the paid
redhat/suse distros, is to hackishly install tools not intended for
debian etc. and write a script to poll that software (which is poorly
documented at best and was quite a pain to track down--go ahead, see
if you can find the latest version of MegaCLI on an official download
site).

and the best part?  i came onboard at my current company post-install
on one of these servers, where the installer and the installed kernel
(ubuntu 7.something, i think--since upgraded and fixed) saw the
hardware raid as 3 separate disks: sda, sdb, and sdc.  sdc happened to
be the raid-1 mirror of sda and sdb.  whoever did the install didn't
notice, and assumed the install and the OS were using the mirror, when
actually the OS was constantly *corrupting* the mirror by writing to
an individual device rather than the virtual mirror.  imagine the
hilarity when i installed a new kernel, rebooted, and got the old
kernel.  lather, rinse, repeat, and eventually i managed to corrupt
both disks by finally writing to /dev/sdc with grub.  yay!  i'll grant
you there was user error combined with software error there (newer
kernels no longer expose the two individual disks), but it cost me a
sleepless night at the data center and 8 months' worth of webserver
statistics (which are now being backed up..)
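if you do end up stuck writing that kind of polling script, it doesn't
have to be fancy.  here's a sketch of the idea--note the "Degraded"
pattern and the MegaCLI invocation in the comment are assumptions on
my part, since megacli's flags and output vary by version, so check
what your own tool actually prints first:

```shell
#!/bin/sh
# sketch: cron-able filter that yells when a raid volume isn't healthy.
# feed it the captured output of your raid tool, e.g. (hypothetical
# flags):  MegaCli -LDInfo -Lall -aALL | check_raid_output
# the "Degraded" pattern is an assumption; adjust it to your tool.
check_raid_output() {
  while IFS= read -r line; do
    case $line in
      *Degraded*) echo "RAID DEGRADED: $line" ;;
    esac
  done
}
```

pipe the result into mail(1) from cron and you've reproduced most of
what the vendor monitoring tools would give you.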


> Hard disks just don't fail that dramatically, that often.  (If you
> have corruption, that corruption will just be mirrored to the other
> drive unless you happen to notice *and* prevent the mirroring in
> time.)

hmm, i've had 2 disks fail across the 8 or so s/w raids i've set up.
these are consumer-level disks, not server-level, but i'm going with
my personal observed statistics.


> Do you know what to do to make use of your raid? (How will you know
> when you need to do something? What are you *gaining* by having that
> raid?)

i gain some peace of mind and instant, automatic recovery without
resorting to backup tapes.

