Introduction to RAID Technical Specifications: RAID 0, RAID 1, RAID 2, RAID 3, RAID 4, RAID 5, RAID 6, RAID 7

Source: Internet
Author: User
(1) Introduction to the RAID technical specification
Redundant disk array technology was first developed to combine small, inexpensive disks in place of large, expensive ones, reducing the cost of mass data storage. Its designers also wanted to use redundant information so that the failure of a disk would not mean losing access to the data. The result was a family of data protection techniques that can also improve data transfer speed to a useful degree.
RAID was long a privilege of high-end servers, where it was applied as a companion technology to high-end SCSI hard disks. As technology has developed and product costs have fallen, IDE hard disk performance has improved greatly, and the spread of RAID chips has brought RAID into more and more PCs.
So why is it called a redundant disk array? "Redundant" means surplus or duplicated, and "disk array" means not a single disk but a set of disks. In other words, RAID uses duplicated disks to handle the data so that the data is stored more reliably.
(2) The working principle of RAID
How does RAID achieve highly reliable data storage? Let us look at how it works. RAID is divided into levels according to the implementation principle, and each level works differently. A RAID array as a whole is a disk structure that combines disks to improve efficiency and reduce errors. Do not be intimidated by all the terminology; the principles are actually quite simple. For illustration purposes, each square in the diagrams represents a disk; a vertical column is called a block or disk array, and a horizontal row is called a stripe (band area).
(3) RAID specification
The main specifications are RAID 0 through RAID 7, plus several combined levels. Each has a different focus. The common specifications are as follows:
RAID 0: Striping without error control


RAID 0 requires at least two hard drives. It implements striping: the data is not stored on one hard disk but is divided into blocks that are stored across different drives. Because the data is distributed across drives, data throughput increases greatly and the load on the drives is more evenly balanced. Efficiency is highest when the data you need happens to sit on different drives. RAID 0 does not need to compute check codes, so it is easy to implement. Its disadvantage is that it has no error control: if the data on one drive is wrong, the correct data on the other drives is of no use. It should not be used where data reliability is required. It is better suited to image editing (including animation) and other workloads that demand high transfer rates. For example, if the file to be read is distributed across two hard drives, both drives can be read at the same time, and the time to read the file is cut to 1/2 of the original. RAID 0 is the fastest of all the levels. But RAID 0 has no redundancy, and if one (physical) disk is damaged, all of the data becomes unusable.
RAID 1: Mirrored structure
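The striping described above can be sketched as a simple address calculation. This is a minimal illustration, assuming blocks are dealt out to the disks in round-robin order; the function name `raid0_locate` is hypothetical and not taken from any real RAID driver.

```python
def raid0_locate(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes simple round-robin striping with one block per stripe unit.
    """
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    return logical_block % num_disks, logical_block // num_disks


# Six consecutive blocks alternate between the two disks,
# so both disks can be read at the same time.
layout = [raid0_locate(block, 2) for block in range(6)]
print(layout)
```

With two disks, consecutive blocks land on alternating drives, which is why a sequential read can in the best case finish in about half the time.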
  
With a RAID 1 structure, the RAID controller must be able to read from two disks at the same time and write to two mirrored disks, so at least two drives are required. Because the structure is a mirror, if one set of disks develops a problem the mirror can be used instead, which improves the fault tolerance of the system. It is relatively easy to design and implement. Each read returns only one block of data from a disk, so the block transfer rate is the same as that of a single disk. Because RAID 1 checks data so thoroughly, it places a heavy demand on the system's processing power. RAID functions are usually implemented in software, and under heavy server load this approach can significantly reduce server efficiency. RAID 1 is a good fit when the system requires very high reliability, for example for statistical data. RAID 1 also supports "hot swapping": a failed disk can be replaced without powering down, and the data only needs to be restored from the mirror disk. When the primary hard drive is damaged, the mirrored hard drive takes over its work. The mirrored disk is in effect a backup disk, so this mode is very safe; RAID 1 offers the best data security of all RAID levels. But its disk utilization is only 50%, the lowest of all RAID levels.
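The mirror behaviour can be modelled in a few lines. This is only a toy in-memory sketch, assuming two disks held as Python lists; the class `Mirror` and its methods are hypothetical names chosen for illustration and do not correspond to any controller API.

```python
class Mirror:
    """Toy two-disk mirror: every write goes to both disks."""

    def __init__(self, num_blocks: int, block_size: int = 4):
        self.disks = [[bytes(block_size)] * num_blocks for _ in range(2)]
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # Write the same data to every disk that is still healthy.
        for i in range(2):
            if not self.failed[i]:
                self.disks[i][block] = data

    def read(self, block: int) -> bytes:
        # Read from the first healthy disk; the mirror takes over on failure.
        for i in range(2):
            if not self.failed[i]:
                return self.disks[i][block]
        raise OSError("both disks have failed")

    def fail(self, disk: int) -> None:
        self.failed[disk] = True

    def replace(self, disk: int) -> None:
        # "Hot swap": bring in a new disk and copy everything from the survivor.
        survivor = 1 - disk
        if self.failed[survivor]:
            raise OSError("no healthy disk to rebuild from")
        self.disks[disk] = list(self.disks[survivor])
        self.failed[disk] = False
```

Only half of the raw capacity holds unique data, which is the 50% utilization mentioned above.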
  
RAID 2: With Hamming code check
Conceptually, RAID 2 is similar to RAID 3: both stripe data across different hard disks in units of bits or bytes. However, RAID 2 uses a coding technique to provide error checking and recovery. This coding requires several disks to store the check and recovery information, which makes RAID 2 more complex to implement, so it is rarely used in commercial environments. In the diagram, the disks on the left hold the individual bits of the data, and the Hamming check codes computed from the different data bits are saved on another set of disks. Thanks to the properties of the Hamming code, data errors can be corrected so that the output is correct. The data transfer rate is very high; to reach a more ideal speed, it is best to improve the performance of the hard disks that store the ECC check codes. The controller is simpler to design than for RAID 3, 4, or 5. There is no free lunch, though: using Hamming codes means paying the cost of data redundancy. The output data rate equals that of the slowest drive in the group.
RAID 3: Parallel transfer with parity check code
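To make the error-correcting idea concrete, here is a small sketch of a Hamming(7,4) code. The choice of the (7,4) variant is an assumption made for illustration; the text does not say which Hamming code a given RAID 2 array uses. In this picture each of the seven bits would sit on its own disk (four data disks, three check disks), and the function names are hypothetical.

```python
def hamming74_encode(data: list[int]) -> list[int]:
    """Encode 4 data bits as a 7-bit codeword laid out as [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword: list[int]) -> tuple[list[int], int]:
    """Return (data bits, error position); position 0 means no error was found."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # 1-based index of the flipped bit
    if position:
        c[position - 1] ^= 1          # flip it back
    return [c[2], c[4], c[5], c[6]], position
```

A single flipped bit anywhere in the codeword is located and corrected, at the cost of three extra bits for every four data bits, which is the redundancy cost the paragraph above refers to.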
  
Unlike the check code in RAID 2, parity can only detect an error, not correct it. RAID 3 processes one stripe at a time when accessing data, which raises read and write speed. It stores data in parallel like RAID 0, but is not as fast as RAID 0. The check code is generated when the data is written and is saved on a separate disk. An implementation needs at least three drives. Read and write rates are both high, and because there are few check bits, the computation time is relatively small. Implementing RAID 3 control in software is very difficult, and a hardware controller is not easy to build either. It is mainly used for graphics (including animation) and other workloads that need high throughput. Unlike RAID 2, RAID 3 uses a single disk to store the parity information. If a data disk fails, the parity disk and the remaining data disks can regenerate its data. If the parity disk fails, use of the data is not affected. RAID 3 provides good transfer rates for large amounts of contiguous data, but for random data the parity disk can become a bottleneck for write operations. Protecting data with a single check disk is not as safe as mirroring, but disk utilization improves greatly, to (n-1)/n.
RAID 4: Independent disk structure with parity code
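Parity-based recovery of a known failed disk can be illustrated with XOR. This is a minimal sketch, assuming byte-wise XOR parity over equally sized blocks; the helper names `xor_parity` and `rebuild_missing` are hypothetical.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks, as stored on the parity disk."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Regenerate the block of one failed data disk from the others plus parity."""
    return xor_parity(surviving + [parity])


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]   # blocks on three data disks
parity = xor_parity(data)                        # block on the parity disk
recovered = rebuild_missing([data[0], data[2]], parity)
```

Parity alone does not say which disk is wrong; recovery works because the controller already knows which disk has failed.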
  
RAID 4 is very similar to RAID 3. The difference is that it accesses data by data block, that is, by disk, one disk at a time. In the diagram, RAID 3 accesses one horizontal stripe at a time, while RAID 4 accesses one vertical block at a time. Its characteristics are also much like those of RAID 3, but recovery after a failure is much harder than with RAID 3, the controller is much harder to design, and data access is not very efficient.
  
RAID 5: Independent disk structure with distributed parity
From its schematic, it can be seen that the parity codes are spread across all the disks, where P0 represents the parity value of stripe 0, and the other labels follow the same pattern. RAID 5 has very high read efficiency, average write efficiency, and good efficiency for block-based collective access. Because the parity codes are on different disks, reliability improves and a single disk error can be tolerated. RAID 5 also uses data parity to protect the data, but instead of storing the parity on a dedicated hard disk, it interleaves the parity of each data segment across all the hard disks. If any one hard drive is damaged, its data can be reconstructed from the data and parity on the other hard disks. Disk utilization is (n-1)/n. However, RAID 5 is not well suited to parallel data transfer, and the controller is very difficult to design. The important difference between RAID 3 and RAID 5 is that RAID 3 involves all the disks in the array for every data transfer, whereas in RAID 5 most data transfers touch a single disk and can proceed in parallel. RAID 5 has a "write penalty": each write produces four actual read/write operations, two reads of the old data and old parity and two writes of the new data and new parity. The advantages of RAID 5 are redundancy (it keeps running after one disk drops out), high disk space utilization ((n-1)/n), and fast reads and writes (up to n-1 times a single disk). But once a disk has dropped out, operating efficiency falls sharply.
RAID 6: Independent disk structure with two distributed parity check codes
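Two ideas from this section lend themselves to a short sketch: parity that moves from disk to disk, and the four-operation small write. The rotation rule below is an assumption chosen for illustration (real controllers use several different layouts), and both function names are hypothetical.

```python
def raid5_parity_disk(stripe: int, num_disks: int) -> int:
    """Disk that holds the parity of a stripe, under an assumed simple rotation."""
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (num_disks - 1 - stripe) % num_disks


def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Compute the new parity for a single-block update.

    The caller has already read old_data and old_parity (two reads) and will
    then write new_data and the returned parity (two writes): four operations.
    """
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


# Parity location for the first five stripes on a four-disk array.
print([raid5_parity_disk(stripe, 4) for stripe in range(5)])
```

Because the new parity depends only on the old data, the old parity, and the new data, the other data disks in the stripe are not touched, which is why independent small writes can proceed in parallel.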

The name is long, but the diagram makes the reason clear: P0 represents the parity value of stripe 0, and PA represents the parity value of block A. RAID 6 is an extension of RAID 5, intended mainly for situations where the data must never be wrong. Because a second parity value is introduced, N+2 disks are needed. The controller design also becomes very complex, and write speed is poor: computing the parity values and verifying the data takes more time, which adds extra load. I suspect that no one but the military can afford such a thing.
RAID 7: Optimized high-speed data transfer disk structure
All I/O transfers in RAID 7 are asynchronous and can be controlled independently, which improves the parallelism of the system and speeds up data access. Each disk has its own cache memory, and the real-time operating system can run on any real-time processing chip to meet the needs of different real-time systems. Management and monitoring are available through the SNMP protocol, and a dedicated transfer channel can be assigned to the parity area to improve efficiency. Because of the cache, multiple hosts can be connected, and when multiple users access the system, access time is close to zero. The parallel structure greatly improves data access efficiency. Note that the high-speed cache it introduces is both an advantage and a drawback: if the system loses power, all data in the cache is lost, so RAID 7 needs to work together with a UPS. Of course, such speed comes at a very high price.
RAID 10: High-reliability, high-efficiency disk structure
This structure is simply a stripe structure combined with a mirror structure. Each has its own strengths and weaknesses, so the two complement each other, achieving high efficiency and high speed together with mirroring. The new structure can be understood by combining the strengths and weaknesses of the two. It is expensive and does not scale well. It is mainly used for databases that are small in capacity but require both speed and error control.
RAID 53: Efficient data transfer disk structure
The later structures are increasingly repetitions and reuses of the earlier ones. This one unites RAID 3 with the stripe structure, so it is fast and also fault tolerant. But it is expensive and not easy to implement, because all the data must go through both striped and bit-wise storage, and keeping the disks synchronized without sacrificing efficiency is difficult.
RAID 0+1:
RAID 0+1 combines RAID 0 with RAID 1. Besides being striped across multiple disks, the data on each disk has a physical mirror disk, which provides full redundancy. The array tolerates the failure of one disk without affecting data availability, and it offers fast read/write performance. At least 4 hard drives are required to build RAID 0+1.
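The utilization figures quoted for the individual levels can be gathered into one helper for comparison. This is a rough sketch, assuming n equal-sized disks and two-way mirroring for RAID 1 and RAID 10; the function name and the level strings are hypothetical labels, not part of any standard.

```python
from fractions import Fraction

# Minimum number of disks assumed for each level in this sketch.
MIN_DISKS = {"raid0": 2, "raid1": 2, "raid3": 3, "raid5": 3, "raid6": 4, "raid10": 4}


def usable_fraction(level: str, n: int) -> Fraction:
    """Fraction of raw capacity available for data with n equal-sized disks."""
    if level not in MIN_DISKS:
        raise ValueError(f"unknown level: {level}")
    if n < MIN_DISKS[level]:
        raise ValueError(f"{level} needs at least {MIN_DISKS[level]} disks")
    if level == "raid0":
        return Fraction(1)          # no redundancy at all
    if level in ("raid1", "raid10"):
        return Fraction(1, 2)       # every block is stored twice
    if level in ("raid3", "raid5"):
        return Fraction(n - 1, n)   # one disk's worth of parity
    return Fraction(n - 2, n)       # raid6: two disks' worth of parity
```

For example, four disks give 3/4 usable capacity under RAID 5 but only 1/2 under RAID 10, which is part of why the mirrored combinations are described as expensive.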
(4) JBOD mode
JBOD is often also called span. It logically connects several physical disks end to end to form one large logical disk. JBOD provides no fault tolerance, and the capacity of the array equals the sum of the capacities of all the member disks. Strictly speaking, JBOD does not fall within the scope of RAID, but many IDE RAID control chips now offer it as a mode. JBOD simply adds up hard disk capacity; the system does not process the disks in parallel. Data is written to the first hard disk, and when that is full it is written to the second hard disk, and so on.
In practice, the most common levels are RAID 0, RAID 1, RAID 5, and RAID 10. In most cases RAID 5 includes the advantages of RAID 2 through RAID 4, so RAID 2 through RAID 4 have essentially left the market.
It is now generally held that RAID 2 through RAID 4 are used only in RAID development and research.
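The contrast with striping shows up clearly in the address mapping. This is a minimal sketch, assuming disk sizes are given in blocks; `jbod_locate` is a hypothetical name used only for illustration.

```python
def jbod_locate(logical_block: int, disk_sizes: list[int]) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset) in a spanned volume.

    Disks are simply concatenated: the first disk is filled before the second.
    """
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    start = 0
    for disk, size in enumerate(disk_sizes):
        if logical_block < start + size:
            return disk, logical_block - start
        start += size
    raise ValueError("logical_block is beyond the total capacity")


sizes = [100, 50]               # a 100-block disk followed by a 50-block disk
total_capacity = sum(sizes)     # capacity is just the sum of the members
```

Unlike striping, consecutive blocks stay on the same disk until it is full, so there is no parallel speed-up.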
