This post is mostly for my own reference. I am posting it here to possibly aid someone else in the future.
Backing up computers is a complex topic; entire books have been written on the subject, and the Wikipedia article on backup offers a good introduction. One factor to weigh heavily when choosing a backup method is the storage medium. Hard drives are fairly fragile mechanical devices with a limited expected lifetime, and they sometimes fail prematurely. Magnetic tape is more durable, but doesn't allow quick random seeks into the data set. Optical discs are the most durable, but their capacity hasn't kept pace with that of magnetic disks. All things considered, the magnetic disk is a relatively attractive choice. Another benefit is that magnetic disks are available in external form factors (FireWire, USB, eSATA) that allow quick, ad-hoc connections to a computer. If magnetic disks are selected as the storage medium, then addressing their fragile nature is a top concern. One way to do that is redundancy: keep multiple copies of the data on multiple physical devices. To that end, this discussion addresses methods for duplicating a data set to multiple physical devices.
Let's jump in with a concrete example. Assume I need the contents of disk partition sdx1 duplicated to one or more other disk partitions, and that I have a suitable target device, sdy1. To simplify the problem further, I will assume there are no complications from open files: the data is completely static. To meet that assumption, I will specify that both sdx1 and sdy1 are unmounted. The obvious tool for the copy is the dd command:
# dd if=/dev/sdx1 of=/dev/sdy1 bs=4M
Upon successful completion of that operation, I will have two copies of the data. I have the original copy on sdx1 and a new copy on sdy1. For data safety, I can remove either device and move it to a safe location. I can place it in a drawer; I can move it physically offsite*; I can contract with a data storage firm to store the device for me; etc.
The above command is certainly simple, but simplicity has a downside, too. In the case of the dd command, the downside is that it simply rewrites the entire contents of sdx1 to sdy1 in bulk. The time required to execute the command is proportional to the device's capacity and inversely proportional to its sustained write speed; the larger the capacity, the longer the write. Today's terabyte-sized disks take several hours to copy completely. The obvious question to ask is whether it makes sense to rewrite the entire copy if only a few bytes have changed. Is another method available which can optimize the copy operation, possibly reducing the time required from hours to minutes? The following discusses one possibility using a RAID 1 array.
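To put rough numbers on "several hours", here is a quick back-of-envelope calculation. The 1 TB capacity and ~100 MB/s sustained write speed are assumed figures; adjust for your own hardware.

```shell
# Whole hours needed to rewrite 1 TB at ~100 MB/s (assumed figures).
bytes=1000000000000       # 1 TB partition
speed=100000000           # ~100 MB/s sustained write speed
echo "$(( bytes / speed / 3600 )) hours (plus change)"
```

The integer division rounds down; the real figure is closer to three hours, and slower drives or buses stretch it further.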
The key piece to solving this problem is the fact that RAID 1 can be set up with a write-intent bitmap. Let's proceed with a concrete example. I will work with three identical devices: sdx1, sdy1, sdz1. We'll proceed assuming they are blank, fresh from the store.
Because the devices are assumed to be identical, it is a good idea to label them in some way. Linux includes tools to read the device serial number and other identifying details. A simple approach is to connect each new device one at a time: identify the disk, create a partition table, and record the unique identifiers. Exactly how you do this is up to you; possible tools include smartctl and hdparm. Here is some sample output.
# hdparm -i /dev/sdw
/dev/sdw:
 Model=Hitachi HTS541060G9SA00, FwRev=MB3OC60R, SerialNo=MPBCP0XGJW6PJM
 :
 :
Partition the new drives (not shown). Use partition type 0xFD (Linux RAID autodetect).
Because we will be using encryption, you may want to prefill the disk with random data. The general idea is not to leak information about used/unused portions of the disk. This step is not shown.
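A sketch of that prefill step follows. To keep it safe to run as-is, a small scratch file stands in for the real partition; substitute the actual target device (sdy1 in this example) and drop the count limit when doing this for real.

```shell
# Fill the target with data from /dev/urandom so that used and unused
# regions of the encrypted disk are indistinguishable.  A 4 MiB scratch
# file stands in for the real partition here; for a real device you
# would write until dd hits the end of the device.
dd if=/dev/urandom of=/tmp/scratch.img bs=1M count=4 2>/dev/null
```

Expect this to take roughly as long as a full dd copy; /dev/urandom throughput, not the disk, is often the bottleneck on older kernels.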
With all drives connected, previously partitioned (not shown), proceed to create a RAID 1 array with an internal write-intent bitmap.
# mdadm --create /dev/mdx --level=1 --raid-devices=3 /dev/sdx1 missing missing
# mdadm --grow /dev/mdx --bitmap=internal
The other mirror devices are not joined to the RAID array initially, but can be joined at your convenience using these commands:
# mdadm --manage /dev/mdx --add /dev/sdy1
# mdadm --manage /dev/mdx --add /dev/sdz1
Check the progress of the sync using the proc interface:
# cat /proc/mdstat
If your disk controller is not state of the art, it may be better to delay adding the extra mirror devices until after the initial data copy; the sync competes with other I/O for bus bandwidth.
Linux includes block level encryption via its device mapper interface. A complete discussion of encryption is beyond the scope of this discussion, but it is shown here to note where the layer exists in the device stack. The RAID layer is created from raw devices. The encrypted layer is created on top of the RAID layer. This command creates an encrypted container with a specified key and other default encryption parameters.
# cat key | cryptsetup create emdx /dev/mdx
I have been using the XFS filesystem for quite a while. Use whatever filesystem that you are comfortable with. Here is a typical format command:
# mkfs.xfs /dev/mapper/emdx
Here is a typical mount command:
# mount /dev/mapper/emdx /mnt/emdx
The device "stack" at this point, from bottom to top:

  /dev/sdx1 (+ mirrors)    raw disk partitions
  /dev/mdx                 RAID 1 array with internal write-intent bitmap
  /dev/mapper/emdx         dm-crypt encrypted layer
  /mnt/emdx                mounted XFS filesystem
The method used to actually back up the target data varies widely depending on many factors. For this example, let's assume the initial point-in-time backup can be obtained with a simple solution. Here is a sequence using two locally mounted filesystems: one holding the source data and one on the backup device.
# SRC=/mnt/source_data
# DEST=/mnt/emdx/snapshot.2011-01-21
# mkdir $DEST
# (cd $SRC && tar -cpf - .) | (cd $DEST && tar -xvf -)
At this point, add the "missing" elements of the RAID array (that is, if there are any "missing" elements.) Wait for the operation to complete. Check with the proc interface as before.
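Waiting can be automated. `mdadm --wait /dev/mdx` blocks until the sync finishes; the sketch below does the same thing by polling /proc/mdstat. The status file is a parameter so the loop logic can be exercised without a real array.

```shell
# Block until no resync/recovery is in progress on the array.
# $1: mdstat file (defaults to /proc/mdstat).
wait_for_sync() {
    while grep -Eq 'resync|recovery' "${1:-/proc/mdstat}"; do
        sleep 60
    done
}
```

Either form is handy in an unattended backup script, so the snapshot step only runs against a fully redundant array.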
Again, the actual method for making the incremental backups varies a lot. I show a simple method that uses hardlinks (cp -l) combined with the rsync command. Note the --dry-run flag: it previews the transfer without changing anything; drop it to perform the copy for real.
# SRC=/mnt/source_data
# PRV=/mnt/emdx/snapshot.2011-01-21
# DEST=/mnt/emdx/snapshot.2011-01-28
# mkdir $DEST
# cp -anl ${PRV}/. $DEST
# rsync -lptrv --delete --dry-run ${SRC}/ ${DEST}/
We need to move the data that has been backed up offsite. RAID 1 elements can be removed from active arrays.
# mdadm --manage /dev/mdx --fail /dev/sdz1
# mdadm --manage /dev/mdx --remove /dev/sdz1
Once removed, the device associated with sdz1 can be taken offline and offsite for extra data safety. The RAID array remains active with one missing device.
# SRC=/mnt/source_data
# PRV=/mnt/emdx/snapshot.2011-01-28
# DEST=/mnt/emdx/snapshot.2011-02-04
# mkdir $DEST
# cp -anl ${PRV}/. $DEST
# rsync -lptrv --delete --dry-run ${SRC}/ ${DEST}/
These jobs can be scheduled in advance using cron or at.
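For example, a crontab entry along these lines would run a weekly job early Saturday morning. The script path is hypothetical; it would wrap the mkdir/cp/rsync sequence shown above.

```
# m  h  dom mon dow  command        (edit with: crontab -e)
 30  2   *   *   6   /usr/local/bin/weekly-snapshot.sh
```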
Return the disk drive from offsite storage and reconnect it to the computer. Rejoin it to the array and wait for the automatic resync to complete.
# mdadm --manage /dev/mdx --re-add /dev/sdz1
# cat /proc/mdstat
The device sdz1 can be removed again after the RAID resync operation completes. Another approach would swap in a series of devices; RAID 1 adapts well to a simple alternating scheme. That is, the first week sdz1 is brought back from offsite, the next week sdy1, and so on.
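The alternation can be scripted. This sketch (bash; the device names follow the example above and are otherwise assumptions) picks the element to rotate offsite based on the parity of the ISO week number:

```shell
# Pick which mirror element to rotate offsite this week.
# $1: ISO week number (defaults to the current week, per date +%V).
# The 10# prefix forces base-10 so "08"/"09" aren't read as octal.
rotate_target() {
    week=${1:-$(date +%V)}
    if (( 10#$week % 2 == 0 )); then
        echo /dev/sdy1
    else
        echo /dev/sdz1
    fi
}
```

A weekly cron job could call this to decide which device to --fail/--remove before the courier arrives.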
to be determined
Page Last Modified: 2011-01-25