[PLUG] hardware errors

John Jason Jordan johnxj at comcast.net
Fri May 4 04:33:12 UTC 2007


On Thu, 03 May 2007 18:49:34 -0700
"Christine Navarro" <christine.navarro at verizon.net> dijo:

> Perhaps you've already removed this possibility by setting your BIOS to no 
> errors, but I had a computer a while back that would shut down.  There was a 
> temperature setting in the BIOS that would shut down the machine when the 
> CPU reached a certain temp...

Not to pick on Christine, but I must make something clear. It cannot be
heat. It happens when the machine is stone cold. Plus the BIOS settings
for temperature warning and shutdown are disabled.

It happened *once* when formatting a large partition. However it
happens repeatedly and reliably when creating a large RAID 1 partition.
I got it to the point where Feisty will load and run normally. No
sudden shutdowns -- over an hour of using it for all kinds of things.
In order to accomplish this I deleted the /dev/md3 RAID 1 array that I
had attempted to create previously. 

Once I was sure Feisty was finally stable I used Gparted to create two
new unformatted partitions, a matching one on each of the two hard
drives, sda and sdb. Each partition is 238 GB, sda4 and sdb4. Then,
still using Gparted, I set the flags on each partition to RAID. Then I
shut down Gparted because that is as far as Gparted can go in creating
a RAID. For the rest you need mdadm. So here is the code for what I did
next with mdadm:

jjj@Devil6:~$ sudo mdadm --create /dev/md3 --level=1 \
	--raid-devices=2 /dev/sd[ab]4
<password>
<messages about finding sda4 and sdb4>
Create array? (Y/N) y
Created array
jjj@Devil6:~$ 

Now, there is a little documentation problem with the above code. Mdadm
doesn't say so, but after it says "Created array" it is not finished.
It will now proceed to do mkfs on /dev/md3, and that will take about
1.75 hours. The hard drive light will stay on constantly until it is
finished. 
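
For what it is worth, the disk activity that follows "mdadm --create" on
a RAID 1 array is normally the kernel's initial resync of the two mirror
halves rather than an mkfs, and its progress is reported in
/proc/mdstat. A minimal sketch for checking it follows; the function
name md_progress is made up for illustration, and the optional file
argument exists only so it can be tried on a saved copy of mdstat:

```shell
# Print the resync/recovery progress lines from an mdstat-format file.
# Defaults to the live /proc/mdstat; pass a path to inspect a saved copy.
# (md_progress is a hypothetical helper name, not part of mdadm.)
md_progress() {
    mdstat_file="${1:-/proc/mdstat}"
    # Progress lines look roughly like:
    #   [>....]  resync =  4.2% (...) finish=105.0min speed=37900K/sec
    grep -E 'resync|recovery' "$mdstat_file" || echo "no resync in progress"
}
```

Running md_progress every few minutes (or simply "cat /proc/mdstat")
shows how far the sync got before a shutdown.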

Four or five minutes later I get a shutdown.

If I reboot any version of any distro, the minute it finishes booting,
the mkfs command automatically starts up again. Of course, because of
the shutdown the previous efforts are wiped out and it must start
formatting /dev/md3 over again from the start. The hard disk light is
on constantly.

Four or five minutes later I get a shutdown.

If I reboot, I can get Feisty to run perfectly normally by opening a
terminal and typing "sudo mdadm -S /dev/md3". That stops it from
continuing to format and sync /dev/md3. The hard disk light goes out.
Feisty then continues to run perfectly normally for as long as I want
to keep it running.

So something is now becoming clear to me. I think I don't have a real
problem with any of the hardware. Where I have a problem is using Linux
software RAID on these drives. A wild guess is that it has to do with
the fact that it is essentially formatting both disks at the same time
and making sure the blocks are synced. The drives are not liking this. 
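
One way to probe that guess, sketched under the assumption that the
background work is the md resync: the kernel exposes a resync speed
ceiling in /proc/sys/dev/raid/speed_limit_max (KB/s, per device).
Lowering it makes the sync much gentler on the drives; if the shutdowns
stop, sustained load would be a more likely suspect than mdadm itself.
The helper name set_resync_cap and its optional second argument are
hypothetical, added only so the function can be tried on a scratch file:

```shell
# Cap the md resync rate by writing a KB/s value to the kernel tunable.
# The second argument is an illustration-only override of the target file.
set_resync_cap() {
    kb_per_sec="$1"
    target="${2:-/proc/sys/dev/raid/speed_limit_max}"
    # Accept only a non-empty string of digits.
    case "$kb_per_sec" in
        ''|*[!0-9]*)
            echo "usage: set_resync_cap KB_PER_SEC [FILE]" >&2
            return 1
            ;;
    esac
    echo "$kb_per_sec" > "$target"
}
```

In a root shell, something like "set_resync_cap 5000" before the array
starts syncing would hold each drive to roughly 5 MB/s. The sync then
takes far longer, so this is a diagnostic, not a fix.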

Consider also that at one time before I discovered that you cannot set
a RAID flag on a partition once it has been formatted, I created the
238 GB partition and had Gparted format it (a single drive at a time).
The formatting proceeded and ended normally.
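
Since a single drive at a time behaved, a read-only test that keeps both
drives busy at once, with md out of the picture, might help separate
"two disks working together" from "mdadm". A rough sketch, assuming GNU
dd; read_both is a made-up name, and the amount to read (in MiB) is an
arbitrary default:

```shell
# Read from two devices (or files) concurrently, discarding the data.
# Nothing is written to either source. read_both is a hypothetical helper.
read_both() {
    dev_a="$1"
    dev_b="$2"
    mib="${3:-1024}"   # how many MiB to read from each source
    dd if="$dev_a" of=/dev/null bs=1M count="$mib" 2>/dev/null &
    pid_a=$!
    dd if="$dev_b" of=/dev/null bs=1M count="$mib" 2>/dev/null &
    pid_b=$!
    # Collect both exit statuses; succeed only if both reads succeeded.
    wait "$pid_a"; rc_a=$?
    wait "$pid_b"; rc_b=$?
    [ "$rc_a" -eq 0 ] && [ "$rc_b" -eq 0 ]
}
```

As root, "read_both /dev/sda /dev/sdb 20000" would read about 20 GB from
each drive in parallel. A shutdown during that run would point away from
mdadm and toward the hardware under simultaneous load.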

I would suggest that it's a bug in mdadm, but "incompatibility" might
be more accurate. Perhaps there's something about these drives that
mdadm cannot handle. Yet, I did create a 2 GB RAID 1 swap, a 20 and a
40 GB regular RAID 1 partition, and they proceeded normally. But when I
created those I used the GUI in Etch and Feisty Alternate,
respectively. I'm assuming the GUI is a front end for mdadm, so perhaps
the reason those proceeded was because they were smaller. I don't know.
I just know that I have finally isolated exactly when it is happening.
I think for my next exercise I need to google on bugs for mdadm and
perhaps communicate with its developers.

I hope the meeting went well tonight. I have an early morning class and
needed to spend some time on a paper so I could not go. Thanks to all
who offered suggestions. And if anyone has any further thoughts about
mdadm, please let me know.


