Chapter Six
Storage Systems
Advanced Topics in Disk Arrays
An innovation that improves both dependability and performance of storage
systems is disk arrays. One argument for arrays is that potential throughput can
be increased by having many disk drives and, hence, many disk arms, rather than
fewer large drives. Simply spreading data over multiple disks, called striping,
automatically forces accesses to several disks if the data files are large.
(Although arrays improve throughput, latency is not necessarily improved.) As we
saw in Chapter 1, the drawback is that with more devices, dependability
decreases: N devices generally have 1/N the reliability of a single device.
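To make striping concrete, the following minimal sketch maps a logical block number to a disk and an offset on that disk. It assumes simple round-robin placement with a stripe unit of one block; the function names locate_block and disks_touched are hypothetical and chosen only for illustration, and real arrays typically use larger stripe units.

```python
from typing import Set, Tuple


def locate_block(logical_block: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk).

    Assumes round-robin striping with a one-block stripe unit
    (an illustrative assumption, not a description of any product).
    """
    if num_disks <= 0:
        raise ValueError("num_disks must be positive")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    return logical_block % num_disks, logical_block // num_disks


def disks_touched(first_block: int, block_count: int, num_disks: int) -> Set[int]:
    """Return the set of disks hit by a sequential access of block_count blocks."""
    return {
        locate_block(block, num_disks)[0]
        for block in range(first_block, first_block + block_count)
    }


if __name__ == "__main__":
    # A large sequential read spreads over every disk in a 4-disk array,
    # while a small one touches only a couple of them.
    print(disks_touched(0, 8, 4))
    print(disks_touched(5, 2, 4))
```

Under these assumptions, an access of at least num_disks consecutive blocks keeps every disk arm busy, which is the source of the throughput gain; a single-block access still waits on one disk, which is why latency is not necessarily improved.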
Although a disk array would have more faults than a smaller number of larger
disks when each disk has the same reliability, dependability is improved by
adding redundant disks to the array to tolerate faults. That is, if a single
disk fails, the lost information is reconstructed from redundant information.
The only danger is in having another disk fail during the mean time to repair
(MTTR). Since the mean time to failure (MTTF) of disks is tens of years, and the
MTTR is measured in hours, redundancy can make the measured reliability of many
disks much higher than that of a single disk.
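A rough calculation shows the size of this effect. The sketch below assumes independent disk failures at a constant rate and an array that tolerates exactly one failure at a time; under those assumptions, a common first-order approximation is MTTF_array = MTTF_disk^2 / (N x (N - 1) x MTTR). The function names and the sample numbers (a 30-year disk MTTF, nine disks, a 24-hour MTTR) are hypothetical and serve only as illustration.

```python
HOURS_PER_YEAR = 24 * 365


def mttf_without_redundancy(disk_mttf_hours: float, num_disks: int) -> float:
    """N devices have roughly 1/N the MTTF of a single device."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_mttf_hours / num_disks


def mttf_with_single_redundancy(
    disk_mttf_hours: float, num_disks: int, mttr_hours: float
) -> float:
    """First-order estimate for an array that survives one disk failure.

    Data is lost only if a second disk fails while the first is being
    repaired. Assumes independent failures at a constant rate.
    """
    if num_disks < 2:
        raise ValueError("redundancy needs at least 2 disks")
    if mttr_hours <= 0:
        raise ValueError("mttr_hours must be positive")
    return disk_mttf_hours ** 2 / (num_disks * (num_disks - 1) * mttr_hours)


if __name__ == "__main__":
    disk_mttf = 30 * HOURS_PER_YEAR  # hypothetical: "tens of years"
    plain = mttf_without_redundancy(disk_mttf, 9)
    redundant = mttf_with_single_redundancy(disk_mttf, 9, 24)
    print(f"no redundancy:     {plain / HOURS_PER_YEAR:.1f} years")
    print(f"single redundancy: {redundant / HOURS_PER_YEAR:.1f} years")
```

With these illustrative inputs, the unprotected array has an MTTF of a few years, while the protected array's estimate rises to thousands of years. The model ignores correlated failures and repair errors, so it is best read as an upper bound on what redundancy can deliver.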
Such redundant disk arrays have become known by the acronym RAID, standing
originally for redundant array of inexpensive disks, although some prefer the
word independent for I in the acronym. The ability to recover from failures
plus the higher throughput, either measured as megabytes per second or as I/Os
per second, makes RAID attractive. When combined with the advantages of
smaller size and lower power of small-diameter drives, RAIDs now dominate
large-scale storage systems.
Figure 6.4 summarizes the five standard RAID levels, showing how eight
disks of user data must be supplemented by redundant or check disks at each
RAID level, and lists the pros and cons of each level. The standard RAID levels
are well documented, so we will just do a quick review here and discuss
advanced levels in more depth.
■ RAID 0: It has no redundancy and is sometimes nicknamed JBOD, for “just a
bunch of disks,” although the data may be striped across the disks in the array.
This level is generally included to act as a measuring stick for the other RAID
levels in terms of cost, performance, and dependability.
■ RAID 1: Also called mirroring or shadowing, this level keeps two copies of
every piece of data. It is the simplest and oldest disk redundancy scheme, but
it also has the highest cost. Some array controllers will optimize read
performance by allowing the mirrored disks to act independently for reads, but
this optimization means it may take longer for the mirrored writes to complete.
■ RAID 2: This organization was inspired by applying memory-style
error-correcting codes to disks. It was included because there was such a disk
array product at the time of the original RAID paper, but there have been none
since, as other RAID organizations are more attractive.