Tuesday, May 31, 2016

Recovery of RAID and LVM2 Volumes


Raid and Logical Volume Managers are great, until you lose data.
Restoring Access to the RAID Array Members
To recover, the first thing to do is to move the drive to another machine. You can do this pretty easily by putting the drive in a USB2 hard drive enclosure. It then will show up as a SCSI hard disk device, for example, /dev/sda, when you plug it in to your recovery computer. This reduces the risk of damaging the recovery machine while attempting to install the hardware from the original computer.
The challenge then is to get the RAID setup recognized and to gain access to the logical volumes within. You can use sfdisk -l /dev/sda to check that the partitions on the old drive are still there.
To get the RAID setup recognized, use mdadm to scan the devices for their raid volume UUID signatures, as shown in Listing 3.
This format is very close to the format of the /etc/mdadm.conf file that the mdadm tool uses. You need to redirect the output of mdadm to a file, join the device lines onto the ARRAY lines and put in a nonexistent second device to get a RAID1 configuration. Viewing the the md array in degraded mode will allow data recovery:
[root@recoverybox ~]# mdadm --examine --scan  /dev/sda1
 ↪/dev/sda2 /dev/sda3 >> /etc/mdadm.conf
[root@recoverybox ~]# vi /etc/mdadm.conf

Edit /etc/mdadm.conf so that the devices statements are on the same lines as the ARRAY statements, as they are in Listing 4. Add the “missing” device to the devices entry for each array member to fill out the raid1 complement of two devices per array. Don't forget to renumber the md entries if the recovery computer already has md devices and ARRAY statements in /etc/mdadm.conf.
Then, activate the new md devices with mdadm -A -s, and check /proc/mdstat to verify that the RAID array is active. Listing 5 shows how the raid array should look.
If md devices show up in /proc/mdstat, all is well, and you can move on to getting the LVM volumes mounted again.
Recovering and Renaming the LVM2 Volume
The next hurdle is that the system now will have two sets of lvm2 disks with VolGroup00 in them. Typically, the vgchange -a -y command would allow LVM2 to recognize a new volume group. That won't work if devices containing identical volume group names are present, though. Issuing vgchange -a y will report that VolGroup00 is inconsistent, and the VolGroup00 on the RAID device will be invisible. To fix this, you need to rename the volume group that you are about to mount on the system by hand-editing its lvm configuration file.
If you made a backup of the files in /etc on raidbox, you can edit a copy of the file /etc/lvm/backup/VolGroup00, so that it reads VolGroup01 or RestoreVG or whatever you want it to be named on the system you are going to restore under, making sure to edit the file itself to rename the volume group in the file.
If you don't have a backup, you can re-create the equivalent of an LVM2 backup file by examining the LVM2 header on the disk and editing out the binary stuff. LVM2 typically keeps copies of the metadata configuration at the beginning of the disk, in the first 255 sectors following the partition table in sector 1 of the disk. See /etc/lvm/lvm.conf and man lvm.conf for more details. Because each disk sector is typically 512 bytes, reading this area will yield a 128KB file. LVM2 may have stored several different text representations of the LVM2 configuration stored on the partition itself in the first 128KB. Extract these to an ordinary file as follows, then edit the file:
dd if=/dev/md2 bs=512 count=255 skip=1 of=/tmp/md2-raw-start
vi /tmp/md2-raw-start

You will see some binary gibberish, but look for the bits of plain text. LVM treats this metadata area as a ring buffer, so there may be multiple configuration entries on the disk. On my disk, the first entry had only the details for the physical volume and volume group, and the next entry had the logical volume information. Look for the block of text with the most recent timestamp, and edit out everything except the block of plain text that contains LVM declarations. This has the volume group declarations that include logical volumes information. Fix up physical device declarations if needed. If in doubt, look at the existing /etc/lvm/backup/VolGroup00 file to see what is there. On disk, the text entries are not as nicely formatted and are in a different order than in the normal backup file, but they will do. Save the trimmed configuration as VolGroup01. This file should then look like Listing 6.

No comments: