When it comes to RAID-5 data recovery, you're assuming that you need two drives out of a three-drive set in order to restore all your files. But the key word here is "all." If files are below a certain size, useful data can be recovered from just one disk. Let me explain by examining how RAID-5 stores your data.
Fundamental to RAID-5 is data striping. When your computer saves data to a RAID-5 array of disks, the data is divided up into segments, and the segments are written across the drive array in sequence. So, for example, the first 32 KB would be written to disk one, the next 32 KB would be written to disk two, and so on. Similarly, when a computer reads a file, the multiple pieces of data from each disk drive are extracted and reassembled to create the file.
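The round-robin placement described above can be sketched in a few lines of Python. This is a simplified illustration rather than a real controller's layout: it assumes the 32 KB segment size from the example, ignores where parity is placed, and the function name `locate` is hypothetical.

```python
STRIPE_UNIT = 32 * 1024  # bytes per segment, matching the 32 KB example


def locate(offset, num_disks, stripe_unit=STRIPE_UNIT):
    """Map a logical byte offset to (disk_index, offset_on_that_disk).

    Simplification: parity placement is ignored, so segments are simply
    handed out to the disks in round-robin order.
    """
    segment = offset // stripe_unit   # which segment the byte falls in
    disk = segment % num_disks        # segments rotate across the disks
    row = segment // num_disks        # how many full rotations came before
    return disk, row * stripe_unit + offset % stripe_unit


# With three disks, the first 32 KB lands on disk 0, the next on disk 1, etc.
first = locate(0, 3)
second = locate(32 * 1024, 3)
wrapped = locate(3 * 32 * 1024 + 5, 3)
```

Reading a file back is the same calculation in reverse: the controller fetches each segment from the disk this mapping points to and concatenates them in order.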
Stripe size refers to the size of the single data unit that is written to each disk. The performance of a RAID-5 array can be tuned by finding a stripe size that is well-matched to the type of application being used. For example, on-demand video services or data-intensive applications that access large records should use small stripes so that each file or record will span all the drives in the array. If the data transfer occurs across multiple drives in parallel, large amounts of data can be accessed at a greater speed.
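RAID-5 also uses distributed parity. Parity is a fault-tolerance feature: it is redundant information computed from the data, which in RAID-5 is a bitwise XOR of the data segments in each stripe. Parity data is distributed among the drives, and when one drive fails, the parity information together with the data on the surviving drives can be used to rebuild the contents of the failed disk.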
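To make the rebuild step concrete, here is a minimal sketch of XOR parity for a single stripe in a three-disk set. The sample byte strings and the helper name `xor_blocks` are invented for illustration; a real array works on fixed-size blocks and rotates the parity position from stripe to stripe.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR any number of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: two data segments plus the parity computed from them.
d0 = b"user=alice;"
d1 = b"pass=secret"
parity = xor_blocks(d0, d1)

# Suppose the drive holding d1 fails. XOR-ing the surviving data segment
# with the parity segment reproduces the missing segment.
recovered = xor_blocks(d0, parity)
```

Note that the parity segment alone reveals nothing readable; it only becomes useful in combination with the other surviving segments of the same stripe.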
Larger files will be saved across the disks in your RAID-5 array, and in a three-disk array, you would need two disks to recover those files. But what about smaller files, or pieces of data within larger files? A malicious hacker, for example, may only want the username and password from an email, not the rest of the message. The figure below shows how files of different sizes could be distributed across drives in a four-disk RAID-5 array. If drive 2 were to fail, you can see that certain data would still be readable from it if the stripe size used is 16 KB. File 1 is 4 KB and therefore fits entirely onto drive 2, while the contents of File 2, which is 20 KB, almost fit onto one drive as well. A low-level disk reader would be able to read all of File 1 and segments of the other files! Therefore, it's necessary to treat a failed drive with the same care that you would any other data drive.
Figure 1: File distribution in a four-disk RAID-5 array
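The exposure described above can be estimated with a short calculation. The sketch below splits a file's byte range into the 16 KB stripe units it touches. Because each stripe unit lives on exactly one drive, any piece listed here is readable in full from whichever single drive holds it. It assumes the file is stored contiguously and that its starting offset is known, which a real file system does not guarantee; the function name `segments` is hypothetical.

```python
def segments(start, size, stripe_unit=16 * 1024):
    """Split a byte range into (stripe_unit_index, bytes_in_that_unit) pairs."""
    pieces = []
    offset, end = start, start + size
    while offset < end:
        unit = offset // stripe_unit
        # A piece ends at the file's end or the unit boundary, whichever is first.
        chunk_end = min(end, (unit + 1) * stripe_unit)
        pieces.append((unit, chunk_end - offset))
        offset = chunk_end
    return pieces


file1 = segments(0, 4 * 1024)    # a 4 KB file, like File 1
file2 = segments(0, 20 * 1024)   # a 20 KB file, like File 2
```

Under these assumptions the 4 KB file occupies a single stripe unit, so it sits whole on one drive, while the 20 KB file leaves a complete 16 KB piece on one drive and the remaining 4 KB on another.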
This was first published in May 2007