[PLUG] Troubleshooting hardware

John Jason Jordan johnxj at comcast.net
Wed May 2 22:23:45 UTC 2007


On Tue, 1 May 2007 20:14:58 -0700
"Quentin Hartman" <qhartman at gmail.com> wrote:

> Try doing some other I/O-intensive thing, like "cat /dev/zero > zeroes.dat"
> or something similar, to work out the HDD. Perhaps it is a heat issue, not
> in the CPU but rather in the drive controller on the Southbridge...
> 
> Download the Ultimate Boot CD and run MHDD on the drives (hard disk tools,
> 4th or 5th page, item F5). Once in MHDD, select your drive and hit F4 to
> bring up the scan option window, then hit F4 again to start it. That
> will certainly work them out in a way completely isolated from your
> potentially questionable system image.

First I downloaded the UBCD and used it to run the Western Digital
Data Lifeguard Diagnostic tool on each of the hard disks. Each run took
75 minutes with the hard drive light on solid the whole time. No
errors on either disk. Then I ran memtest86 for an hour and it found no
problems.

I discovered something while booting the UBCD -- at the start it
presents a simple little "boot" prompt on the screen, and if you don't hit
Enter fast enough it continues and boots from the hard disk instead
of the CD. One time I was not fast enough and it booted from the hard
disk. The interesting thing is that if I just boot without the CD I get
to the point where it says "GRUB" and then it hangs. But if the UBCD
starts first and then hands off to the hard disk, Feisty actually starts
to boot. It crashes to a command line, but the kernel loads.

I figured that meant there might be something in menu.lst that I
could fix. However, I have a dozen or more live CDs of various flavors
and only two will show the md devices when I run cat /proc/dev. One is
GRML Rescue CD and the other is Aaron's rescue CD. Aaron's won't
mount /dev/md1, but the GRML Rescue CD will. However, it does something
weird that I have not figured out. I mounted /dev/md1 at /mnt/md1. When
I do "nano /mnt/md1/boot/grub/menu.lst" I get a menu.lst file, but it
is GRML's menu.lst -- it's definitely not mine. Yet navigating around
in /mnt/md1/ all the rest is mine, that is, it finds /home/jjj and all
the files in it are mine. Weird.
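
One way to narrow down the menu.lst oddity is to ask which filesystem
actually holds the file being edited. The following is only a sketch:
which_fs is a hypothetical helper name, and it assumes a POSIX df and awk
are present on the rescue CD. If the device it prints is not /dev/md1,
the file being opened does not live on the array.

```shell
# which_fs is a hypothetical helper: print the device and mount point
# of the filesystem that actually holds the given path.
which_fs() {
    df -P "$1" | awk 'NR==2 { print $1 " mounted on " $6 }'
}

# Path taken from the message above; only checked if it exists.
f=/mnt/md1/boot/grub/menu.lst
if [ -e "$f" ]; then
    which_fs "$f"
fi
```

Running which_fs on /mnt/md1/home/jjj as well and comparing the two
results would show whether both paths sit on the same device.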

So I started downloading a bunch of other rescue CDs. So far none of
them can see /dev/md*. While doing the downloading I tried
"cat /dev/zero > zeroes.dat" and it powered off the computer after a
couple of minutes. I read what little there is in man cat, but couldn't
figure out what exactly this command does. What does it mean that it
caused a shutdown?
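
For reference, /dev/zero is a device that returns zero bytes for as long
as it is read, so "cat /dev/zero > zeroes.dat" keeps writing zeros into
zeroes.dat until it is interrupted or the disk fills up. It is nothing
more than sustained disk write load. A bounded variant can be sketched as
below. The 16 MiB size and the temporary file location are arbitrary
choices for illustration, not part of the original suggestion.

```shell
# Write a fixed amount of zeros rather than filling the disk.
# The output path and the 16 MiB size are illustrative assumptions.
out="${TMPDIR:-/tmp}/zeroes.dat"

# 16 blocks of 1 MiB each, read from /dev/zero.
dd if=/dev/zero of="$out" bs=1048576 count=16 2>/dev/null
sync    # ask the kernel to flush buffered writes to the disk

# Confirm how much was written, then clean up.
size=$(( $(wc -c < "$out") ))
echo "wrote $size bytes"
rm -f "$out"
```

Raising count makes the run longer and heavier while still leaving it
with a defined end.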


