[PLUG] Raving mad RAID

Ben Koenig techkoenig at gmail.com
Tue Feb 2 06:15:12 UTC 2021


On 2/1/21 9:16 PM, TomasK wrote:
> On Mon, 2021-02-01 at 16:19 -0800, John Jason Jordan wrote:
>> About a week ago I finally was successful in creating a RAID0 array on
>> my four NVMe drives that are installed in a Thunderbolt 3 enclosure.
>> After creating the array it appeared in /dev as md0. After rebooting it
>> became md127. I copied the UUID from Gparted and used it in a line
>> that I added to /etc/fstab.
>>
>> The array has been working fine ever since I created it, including
>> copying files to it late last night. This morning I tried to add a
>> torrent for a distro ISO to Ktorrent, and got an error message that
>> Ktorrent couldn't add the torrent because the location to copy it to
>> did not exist. WTH?
>>
>> I looked at my GUI file manager and all the files in the array were
>> listed. I right-clicked on one of them and immediately noticed that
>> Rename and Delete were no longer listed in the options. After a bit
>> more poking around I determined that the array had become read-only
>> overnight.
>>
>> I decided to umount it and then re-mount it. The umount command gave me
>> 'can't read superblock on /dev/md127p1,' which is what /dev/md0 became
>> after rebooting a week ago. However, apparently the umount command
>> succeeded, because it was no longer mounted. Then I tried to re-mount
>> it and got the same superblock error message.
>>
>> Looking at /dev I see that most everything has changed. NVMe1-3 now
>> have namespace 2 instead of the 1 that they were when I created the
>> array. And now nvme5-8 are listed, which don't exist. And only nvme4n1
>> had a partition after I created the array, and now it has two
>> partitions.
>>
>> It looks like I'm going to have to nuke the array, re-make it, and wait
>> 24 hours to copy the 10TB of data back to the new array from the NAS
>> backup. But before I do that I need to find out what went wrong. Might
>> there be a defect in one of the NVMe drives? Or might there be a bug in
>> mdadm when it tries to create an array out of NVMe media? Or when the
>> ext4 filesystem was created? I assume that there exists a utility to
>> check a drive, but I've never done that before. Suggestions?
>>
>> I'm considering throwing my computers into the river and doing
>> something useful with my life.
>>
> Perhaps now would be the time to dig out those old emails and consider
> some of the native alternatives rejected in favor of RAID0.
>
> Just saying, -T


Unfortunately, it looks like RAID might not be the culprit if his NVMe
/dev nodes are moving around. RAID0 isn't the cause, but it will make
things more complicated when something fails further down in the stack.


If his system is dynamically renaming the devices under /dev/nvme*, then
that needs to be dealt with before even thinking about RAID. I'm not
really sure where to start looking off the top of my head, since I was
under the assumption that this wasn't supposed to happen with NVMe.
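
One possible starting point, as a minimal sketch: udev normally keeps
symlinks under /dev/disk/by-id that are built from each drive's model and
serial number, so they should keep pointing at the same physical drive
even if the kernel's nvmeXnY names shuffle. The Python below lists those
links and the node each one currently resolves to. The function name
persistent_names and its parameters are made up for illustration, and it
assumes a udev-based distro where /dev/disk/by-id is populated.

```python
import os


def persistent_names(by_id_dir="/dev/disk/by-id", prefix="nvme-"):
    """Map each persistent udev symlink name to the node it resolves to now.

    by_id_dir and prefix are illustrative parameters (assumptions), so the
    function can also be pointed at a scratch directory.
    """
    mapping = {}
    if not os.path.isdir(by_id_dir):
        # No by-id directory here (not a udev system, or nothing attached).
        return mapping
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        # Only consider symlinks whose name starts with the chosen prefix.
        if name.startswith(prefix) and os.path.islink(link):
            mapping[name] = os.path.realpath(link)
    return mapping


if __name__ == "__main__":
    for name, node in persistent_names().items():
        print(f"{name} -> {node}")
```

Running it before and after a reboot and comparing the two outputs would
show whether a given serial number really is landing on a different
/dev/nvme* node each time.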

-Ben



