Basic knowledge of RAID

Source: Internet
Author: User

RAID stands for Redundant Array of Independent Disks and is often simply called a disk array. The RAID levels are described below.

First, why use RAID?

1, High-speed disk access (speed-up): Ordinary hard disks are combined into a disk array. When the host writes data, the RAID controller splits the data into multiple blocks and writes them to the disks of the array in parallel. When the host reads data, the RAID controller reads the blocks from each disk in parallel and reassembles them before handing them to the host. These parallel reads and writes improve the access speed of the storage system.
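The split-and-reassemble step can be sketched in a few lines of Python. This is only an illustration: the function names `stripe` and `reassemble` are made up for this example, the "disks" are plain lists, and the blocks are handled one after another, whereas a real controller issues the I/O in parallel.

```python
def stripe(data: bytes, n_disks: int, block_size: int) -> list[list[bytes]]:
    """Split data into blocks and deal them round-robin onto n_disks."""
    disks: list[list[bytes]] = [[] for _ in range(n_disks)]
    for start in range(0, len(data), block_size):
        block_no = start // block_size
        disks[block_no % n_disks].append(data[start:start + block_size])
    return disks


def reassemble(disks: list[list[bytes]]) -> bytes:
    """Read the blocks back in round-robin order and join them."""
    out = []
    longest = max(len(d) for d in disks)
    for row in range(longest):
        for d in disks:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJ", n_disks=3, block_size=2)
# disks[0] holds AB and GH, disks[1] holds CD and IJ, disks[2] holds EF
restored = reassemble(disks)
```

Because consecutive blocks land on different disks, a large request keeps all disks busy at once, which is where the speed-up comes from.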

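2, Capacity expansion: several disks are presented to the host as one larger volume.

3, Data redundancy: mirrored or parity data lets the array survive a disk failure (at every level except RAID0).

Second, categories

RAID is classified into levels 0 to 6, usually written as RAID0, RAID1, RAID2, RAID3, RAID4, RAID5 and RAID6.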

RAID0: Strictly speaking, RAID0 is not a true RAID structure, because it has no data redundancy. RAID0 splits the data into stripes and reads/writes them on multiple disks in parallel. The data transfer rate is therefore very high, but RAID0 only improves performance and provides no reliability: if one disk fails, the data of the whole array is affected. RAID0 therefore cannot be used for critical applications that require high data availability.

RAID0 advantages: the fastest read and write performance; performance is even better if each drive has its own controller.

RAID0 disadvantages: the failure of any single disk loses all data. Many RAID0 setups are implemented in software, which can limit performance.

-----------------------------------------------------------------------

RAID1: RAID1 achieves data redundancy through mirroring, keeping identical copies of the data on two separate disks. RAID1 can improve read performance, because data can be read from the mirror while the original disk is busy. RAID1 has the highest cost per unit of capacity of the RAID levels, but provides the best data availability. When a disk fails, the system switches automatically to the mirror disk without having to reconstruct the lost data.

RAID1 advantages: high data reliability, simple design, easy to implement.

RAID1 disadvantages: slower than RAID0, especially for writes, and only half of the total disk capacity is usable.

-----------------------------------------------------------------------

RAID0+1

This mode combines RAID0 and RAID1 and requires at least 4 hard drives. The disks are grouped into two RAID0 arrays, each of which can be regarded as one larger, faster drive, and the two RAID0 arrays are then mirrored as a RAID1 array. Such a system offers both high disk performance and high data security. The disadvantages are equally clear: high cost and a more complex structure. RAID0+1 tolerates any single disk failure, and a second failure only if it hits the same RAID0 half; it is generally used for file servers.

-----------------------------------------------------------------------

RAID2: Conceptually, RAID2 is similar to RAID3: both stripe data across the disks in units of bits or bytes. However, RAID2 uses a Hamming error-correcting code to provide error checking and recovery. This code needs several disks to store the check and recovery information, which makes RAID2 more complex to implement. It is therefore seldom used in commercial environments.

RAID2 advantages: high data security; data can be recovered as long as the disks holding the check codes have not failed.

RAID2 disadvantages: expensive, needs dedicated disks to store the check codes, inefficient, and has no commercial product support.

-----------------------------------------------------------------------

RAID3: Unlike RAID2, RAID3 stores its parity information on a single disk. If a data disk fails, the parity disk and the remaining data disks can regenerate the data. If the parity disk fails, data access is not affected. RAID3 provides a good transfer rate for large amounts of sequential data, but for random access the parity disk can become a bottleneck for write operations.

RAID3 advantages: well suited to workloads that move large amounts of data, such as video editing.

RAID3 disadvantages: the rotation of all drives must be synchronized, which is difficult (most hard drives do not support it), and a complex controller is required.

-----------------------------------------------------------------------

RAID4: Like RAID5, RAID4 stripes the data across the disks, but the stripe unit is a block or record rather than a bit or byte. RAID4 uses one disk as a dedicated parity disk, and every write operation has to access the parity disk, which becomes the bottleneck for writes. RAID4 is rarely used in commercial applications.

RAID4 advantages: in addition to the advantages of RAID3, it does not require synchronized drives.

RAID4 disadvantages: poor write performance and high demands on the controller.

-----------------------------------------------------------------------

RAID5: RAID5 has no dedicated parity disk; instead, data and parity information are interleaved across all disks. In RAID5, reads and writes can proceed on different disks of the array at the same time, which provides higher data throughput. RAID5 is better suited to small data blocks and random reads and writes. An important difference between RAID3 and RAID5 is that in RAID3 every data transfer involves all disks of the array, whereas in RAID5 most transfers touch only one disk and can run in parallel. RAID5 has a "write penalty": every small write produces four actual disk operations, two reads (old data and old parity) and two writes (new data and new parity).

RAID5 advantages: no dedicated parity disk is required, reads are fast, and the parity-disk write bottleneck of RAID3 and RAID4 is eased.

RAID5 disadvantage: write performance is still not satisfactory.
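To make the interleaving of data and parity concrete, here is a small sketch that prints which block sits on which disk. The rotation rule used here (parity moves one disk to the left on each stripe) is only one of several possible layouts and is an assumption of this example, as is the function name `raid5_layout`.

```python
def raid5_layout(n_disks: int, n_stripes: int) -> list[list[str]]:
    """Return one row per stripe; 'P<k>' marks the parity block of stripe k."""
    rows = []
    block = 0
    for stripe in range(n_stripes):
        # Assumed rotation: parity starts on the last disk and moves left.
        parity_disk = (n_disks - 1 - stripe) % n_disks
        row = []
        for disk in range(n_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


layout = raid5_layout(n_disks=4, n_stripes=4)
for row in layout:
    print("  ".join(f"{cell:>3}" for cell in row))
```

Over four stripes each of the four disks holds exactly one parity block, so no single disk has to absorb every parity write.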

-----------------------------------------------------------------------

RAID6: Compared with RAID5, RAID6 adds a second, independent parity block. The two parity schemes use different algorithms, so data reliability is very high: even if two disks fail at the same time, the data remains usable. However, more disk space has to be given to parity information, and the "write penalty" is greater than in RAID5. RAID6 write performance is poor, and together with the complex implementation this means RAID6 is rarely used.

RAID6 advantages: fast read performance, higher fault tolerance.

RAID6 disadvantages: slow writes; the RAID controller is more complex and more costly to design.
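The capacity cost of each level can be summarized in a small helper. This is a rough sketch under stated assumptions: all disks have the same size, RAID1 is treated as mirrored pairs, and the minimum disk counts in `MIN_DISKS` are the commonly quoted ones rather than values taken from this article. The names `MIN_DISKS` and `usable_capacity` are invented for the example.

```python
# Assumed minimum disk counts per level (not stated in the text above).
MIN_DISKS = {"RAID0": 2, "RAID1": 2, "RAID0+1": 4,
             "RAID3": 3, "RAID4": 3, "RAID5": 3, "RAID6": 4}


def usable_capacity(level: str, n_disks: int, disk_size: float) -> float:
    """Usable capacity of an array of n_disks equal disks of disk_size each."""
    if level not in MIN_DISKS:
        raise ValueError(f"unknown level: {level}")
    if n_disks < MIN_DISKS[level]:
        raise ValueError(f"{level} needs at least {MIN_DISKS[level]} disks")
    if level == "RAID0":
        data_disks = n_disks            # no redundancy at all
    elif level in ("RAID1", "RAID0+1"):
        if n_disks % 2:
            raise ValueError("mirroring needs an even number of disks")
        data_disks = n_disks // 2       # every block is stored twice
    elif level == "RAID6":
        data_disks = n_disks - 2        # two parity blocks per stripe
    else:                               # RAID3, RAID4, RAID5
        data_disks = n_disks - 1        # one parity block per stripe
    return data_disks * disk_size


for level in ("RAID0", "RAID0+1", "RAID5", "RAID6"):
    print(level, usable_capacity(level, n_disks=4, disk_size=2.0))
```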

Detailed Introduction

RAID0

Striped storage. In theory, a RAID0 array of n disks has n times the read and write speed of a single disk. RAID 0 splits the data into consecutive stripe units and reads/writes them on multiple disks in parallel, so it has a high data transfer rate, but it has no data redundancy and is therefore not considered a true RAID structure.
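The reason n disks can work at the same time is the way logical blocks are mapped to disks. A minimal sketch of that mapping, assuming a stripe unit of exactly one block (the function name `locate` is made up for this example):

```python
def locate(logical_block: int, n_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    return logical_block % n_disks, logical_block // n_disks


# Eight consecutive logical blocks on a 4-disk array.
layout = [locate(block, n_disks=4) for block in range(8)]
for block, (disk, offset) in enumerate(layout):
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Blocks 0 to 3 fall on four different disks, so a request for them can be served by all four disks in one step.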

RAID1

Mirrored storage. RAID 1 achieves data redundancy through disk mirroring, keeping a backup copy of the data on paired independent disks. When the original disk is busy, data can be read directly from the mirrored copy, so RAID 1 can improve read performance. RAID 1 has the highest unit cost of the RAID levels, but provides high data security and availability. When a disk fails, the system switches automatically to reading and writing on the mirror disk without having to reconstruct the lost data.
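The mirroring behaviour can be illustrated with a toy two-disk model. The class `Mirror` and its attributes are hypothetical and exist only for this sketch; a real implementation also has to resynchronize a replaced disk, which is left out here.

```python
class Mirror:
    """Toy RAID 1 pair: every write goes to both disks, reads use a survivor."""

    def __init__(self, n_blocks: int) -> None:
        self.disks = [[None] * n_blocks, [None] * n_blocks]
        self.failed = [False, False]

    def write(self, block_no: int, data: bytes) -> None:
        # The same block is written to every disk that is still working.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block_no] = data

    def read(self, block_no: int) -> bytes:
        # Any working disk can answer; no reconstruction is needed.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block_no]
        raise OSError("both disks of the mirror have failed")


m = Mirror(n_blocks=4)
m.write(0, b"payload")
m.failed[0] = True          # simulate the loss of the first disk
survivor_copy = m.read(0)   # served from the mirror disk
```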

RAID2

Hamming-code checked striped storage. The data is striped across the disks in units of bits or bytes, and a Hamming code provides error checking and recovery. This code needs several disks to store the check and recovery information, which makes RAID 2 more complex to implement, so it is rarely used in commercial environments.
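As an illustration of the kind of code involved, the following sketch implements the classic Hamming(7,4) code: 4 data bits plus 3 check bits, able to correct any single flipped bit. Mapping each of the 7 bits to its own disk (4 data disks, 3 check disks) is an assumption made for this example; the text does not say which Hamming variant or disk count a particular RAID 2 product used.

```python
def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword: list[int]) -> list[int]:
    """Fix at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # 0 means no error was detected
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = [1, 0, 1, 1]
stored = hamming74_encode(word)
damaged = list(stored)
damaged[4] ^= 1                       # one "disk" returns a wrong bit
repaired = hamming74_correct(damaged)
```

The three check bits for every four data bits show why the scheme is expensive in disks compared with the single parity disk of RAID 3.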

RAID3

Parity (XOR) striped storage with a dedicated parity disk; the stripe unit is a byte. RAID 3 is very similar to RAID 2 in that data is striped across the disks, except that RAID 3 uses simple parity and keeps the parity information on a single disk. If a data disk fails, the parity disk and the remaining data disks can regenerate the data; if the parity disk fails, data access is not affected. RAID 3 provides a good transfer rate for large amounts of sequential data, but for random access the parity disk can become a bottleneck for write operations.

RAID4

Parity (XOR) striped storage with a dedicated parity disk; the stripe unit is a block. RAID 4 also stripes the data across the disks, but in units of blocks or records. RAID 4 uses a single disk as the parity disk, and every write operation has to access it, so the parity disk becomes the bottleneck for writes and RAID 4 is rarely used in commercial environments.

RAID5

Parity (XOR) striped storage with distributed parity; the stripe unit is a block. RAID 5 has no dedicated parity disk; instead, data and parity information are interleaved across all disks. In RAID 5, reads and writes can proceed on different disks of the array at the same time, which provides higher data throughput. RAID 5 is better suited to small data blocks and random reads and writes. The main difference between RAID 3 and RAID 5 is that RAID 3 involves all disks of the array in every data transfer, whereas in RAID 5 most transfers touch only one disk and can be performed in parallel. RAID 5 has a "write penalty": each small write produces four actual disk operations, two reads (old data and old parity) and two writes (new data and new parity).
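The four operations of a small write can be traced in a short sketch. It relies on the XOR identity new parity = old parity XOR old data XOR new data; the function names and the list-of-blocks model of a stripe are assumptions made for illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally long blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(stripe: list[bytes], data_disk: int, parity_disk: int,
                new_data: bytes) -> int:
    """Update one data block and the parity block; return the I/O count."""
    ios = 0
    old_data = stripe[data_disk]        # read 1: old data
    ios += 1
    old_parity = stripe[parity_disk]    # read 2: old parity
    ios += 1
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    stripe[data_disk] = new_data        # write 1: new data
    ios += 1
    stripe[parity_disk] = new_parity    # write 2: new parity
    ios += 1
    return ios


# Three data blocks plus their parity block (0x01 ^ 0x02 ^ 0x04 == 0x07).
stripe = [b"\x01", b"\x02", b"\x04", b"\x07"]
io_count = small_write(stripe, data_disk=1, parity_disk=3, new_data=b"\x08")
```

The other data blocks are never touched, which is why small writes on different stripes can run in parallel, but each of them still costs four disk operations.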

During recovery, for example when block A0 has to be rebuilt, the blocks B0, C0 and D0 and the parity block of stripe 0 are needed; combining them with XOR yields A0. For the same reason, if two disks fail, the data of the entire array is lost.
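The recovery of A0 can be reproduced directly with XOR. The block contents below are arbitrary example values, and `xor_blocks` is a helper written for this sketch.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equally long blocks together, byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Stripe 0: four data blocks and the parity block computed from them.
a0, b0, c0, d0 = b"\x11\x22", b"\x33\x44", b"\x55\x66", b"\x77\x88"
parity0 = xor_blocks(a0, b0, c0, d0)

# The disk holding A0 fails: rebuild it from the survivors and the parity.
recovered_a0 = xor_blocks(b0, c0, d0, parity0)
```

With two blocks of the same stripe missing, the single parity equation has two unknowns and cannot be solved, which matches the statement that a second disk failure destroys the array.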

RAID6

Parity (XOR) striped storage with two sets of distributed parity; the stripe unit is a block. Compared with RAID 5, RAID 6 adds a second, independent parity block. The two parity schemes use different algorithms, so the data is very reliable: even if two disks fail at the same time, the data remains usable. However, RAID 6 needs more disk space for parity information and has a greater "write penalty" than RAID 5, so write performance is very poor. Poor performance and a complex implementation mean that RAID 6 is rarely used in practice.
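The text does not say which two algorithms are used. One widely used choice, shown here purely as an assumption, is a plain XOR parity P plus a second parity Q computed in the finite field GF(2^8) with generator 2 and reduction polynomial 0x11d. The sketch works on one byte per disk to stay short; all function names are invented for this example.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) with reduction polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    return gf_pow(a, 254)   # a**255 == 1 for every non-zero a


def pq_parity(data: list[int]) -> tuple[int, int]:
    """P is the XOR of all bytes; Q weights byte i with 2**i in GF(2^8)."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q


def recover_two(data: list, x: int, y: int, p: int, q: int) -> tuple[int, int]:
    """Rebuild the lost bytes data[x] and data[y] from survivors, P and Q."""
    p_rest = q_rest = 0
    for i, d in enumerate(data):
        if i not in (x, y):
            p_rest ^= d
            q_rest ^= gf_mul(gf_pow(2, i), d)
    p_delta = p ^ p_rest    # equals Dx ^ Dy
    q_delta = q ^ q_rest    # equals 2**x * Dx ^ 2**y * Dy
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(q_delta ^ gf_mul(gy, p_delta), gf_inv(gx ^ gy))
    return dx, p_delta ^ dx


data = [0x12, 0x34, 0x56, 0x78]
p, q = pq_parity(data)
damaged = [0x12, None, 0x56, None]          # disks 1 and 3 are lost
rebuilt = recover_two(damaged, 1, 3, p, q)
```

Two independent equations (P and Q) allow two unknowns to be solved, which is why two simultaneous disk failures are survivable; the extra field arithmetic on every write is one source of the larger write penalty.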

RAID7

RAID 7 is a newer RAID level that has its own intelligent real-time operating system and software tools for storage management. It can run completely independently of the host and does not consume host CPU resources. RAID 7 can be regarded as a storage computer, which sets it clearly apart from the other RAID levels.

RAID 7 is, in theory, the best-performing RAID mode so far, because its structure differs significantly from that of the earlier levels. In the earlier levels, each hard disk forms one "column" of the array. In RAID 7, several hard disks together form one "column", and each of them has its own channel to the main channel, so the array is subdivided one level further than before. The advantage is that data in a given area can be located quickly when it is read or written, instead of only part of the data area being accessible at any one time because of the limits of a single hard disk. In effect, what used to be a single hard disk is split into several separate hard disks, each with its own read/write channel.

Comparison of RAID10 and RAID01

-RAID10 mirrors first and then stripes.

-RAID01 stripes first and then mirrors.

With 6 disks, for example, RAID10 first groups the disks into 3 mirrored pairs and then stripes across the 3 RAID1 pairs. RAID01 first builds a RAID0 from 3 disks and then uses the other 3 disks as a mirror of that RAID0. The following 4-disk example describes the difference in safety:

1, The RAID10 case

Assume that DISK0 has failed. Of the remaining 3 disks, only the failure of DISK1, the mirror of DISK0, brings down the entire array, so the chance that a second disk failure is fatal is simply 1/3.

2, The RAID01 case

Again assume that DISK0 has failed, so the left stripe set can no longer be read. Of the remaining 3 disks, the failure of either DISK2 or DISK3 brings down the entire array, so the chance that a second disk failure is fatal is simply 2/3.

So RAID10 is safer than RAID01.
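The two figures can be checked by enumerating every possible second failure. The disk numbering follows the 4-disk example above (DISK0 and DISK1 form one group, DISK2 and DISK3 the other); the function names are made up for this sketch.

```python
from fractions import Fraction

GROUPS = [(0, 1), (2, 3)]


def raid10_alive(failed: set[int]) -> bool:
    """Groups are mirror pairs: each pair needs at least one working disk."""
    return all(not set(pair) <= failed for pair in GROUPS)


def raid01_alive(failed: set[int]) -> bool:
    """Groups are stripe sets: at least one set must be completely intact."""
    return any(not (set(stripe) & failed) for stripe in GROUPS)


def fatal_fraction(alive, first: int = 0, n_disks: int = 4) -> Fraction:
    """Fraction of second failures that kill the array after disk `first`."""
    others = [d for d in range(n_disks) if d != first]
    fatal = [d for d in others if not alive({first, d})]
    return Fraction(len(fatal), len(others))


raid10_risk = fatal_fraction(raid10_alive)
raid01_risk = fatal_fraction(raid01_alive)
print("RAID10:", raid10_risk, " RAID01:", raid01_risk)
```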

In terms of the logical placement of the data, RAID01 and RAID10 are identical under normal conditions, and every read or write generates the same number of I/Os, so there is no difference in read and write performance. After a disk failure, such as the DISK0 failure assumed above, read performance does differ: RAID10 reads better than RAID01, because RAID10 can still read from three disks while RAID01 is left with only the two disks of the surviving stripe set.
