Source: yourblog.org
RAID technology details
RAID stands for "Redundant Array of Independent Disks" (originally "Redundant Array of Inexpensive Disks"). The concept was defined in 1987 in a paper by Patterson, Gibson, and Katz of the University of California, Berkeley. RAID groups a set of disks into an array to provide the redundancy needed for data protection, and stripes data across the disks to improve read/write performance. RAID was initially used in the high-end server market. However, with the rapid development of computer technology, it has spread into many areas of computing, and RAID controller chips are now common on home computer motherboards.
RAID systems can generally be built on a variety of interfaces. Currently, PATA, SATA, and SCSI hard disks can all be used to form RAID arrays. With the release of the Intel 865/875 series chipsets, the hard disk interface in the home market began to shift to SATA, and RAID setups are making the same transition from PATA to SATA.
RAID technology has gone through a series of changes and developments as it has come into use. In the home market, only RAID 0, RAID 1, and RAID 0 + 1 disk arrays were available. However, with the arrival of the DFI Lanparty motherboard, a fourth disk array mode appeared: RAID 1.5.
In practice, most RAID setups are built to improve disk read/write performance, since data backup can be handled by other means (such as burning to disc). For that reason, people have been willing to try RAID 0 with only two hard disks, but the arrival of RAID 1.5 has changed this thinking. How do these two relatively inexpensive disk arrays perform? Let's take a look.
RAID 0:
RAID 0 uses a technique called striping to distribute data across the disks. Data is divided into blocks ranging from 512 bytes to several megabytes, each stripe unit being one contiguous block, and the blocks are written to the disks in turn: the 1st block is written to disk 1, the 2nd block to disk 2, and so on. When the system reaches the last disk in the array, it continues with the next stripe on disk 1.
Striping spreads the I/O load evenly across all drives. Because the drives can be read or written at the same time, performance improves significantly. However, RAID 0 offers no data protection: if one disk fails, all data is lost. RAID 0 is therefore unsuitable for mission-critical environments, but well suited to video and image production and editing.
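The round-robin placement described for RAID 0 can be sketched in a few lines of Python. This is only an illustration: the function name `stripe_location` and the 0-based numbering are assumptions made for the example, not part of any real controller.

```python
def stripe_location(block_index, num_disks):
    """Map a 0-based logical block number to (disk, stripe row), both 0-based."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    # Blocks go to the disks in turn; after the last disk, wrap to the next row.
    return block_index % num_disks, block_index // num_disks

# Place six consecutive blocks on a hypothetical three-disk array.
layout = [stripe_location(b, 3) for b in range(6)]
```

Consecutive blocks land on different disks, which is why a large sequential transfer can keep every drive busy at once.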
RAID 1:
RAID 1 is also called mirroring because data on one disk is completely copied to another disk. If data on one disk is corrupted or the disk develops bad tracks, the other disk prevents the data loss and system interruption that the failure would otherwise cause. RAID 1 can also implement duplexing, that is, duplicating the entire controller, so that data is protected against both disk and controller failure. The disadvantage of mirroring and duplexing is that twice as many drives are needed to hold the copies, yet the system's read/write performance does not improve. This can be a considerable expense. RAID 1 can be implemented in software or hardware.
RAID 2:
RAID 2 is a disk array with Hamming code error correction, developed for mainframes and supercomputers. The 1st, 2nd, 4th, ..., 2^n-th disk drives are dedicated to verification and error correction. For example, in a RAID 2 array built from seven disk drives, drives 1, 2, and 4 are error correction disks and the remaining drives store data. RAID 2 performs extremely well when reading and writing large amounts of data, but poorly with small amounts, so it is rarely used.
Because of the way RAID 2 works, the more disk drives we use, the smaller the percentage of disks devoted to verification. For better speed and better disk utilization, it is best to add disks for storing the ECC check codes, but ensuring data redundancy this way means paying for more hard disks. In terms of controller design, RAID 2 is simpler than RAID 3, 4, or 5.
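The power-of-two placement of the check disks can be illustrated with a short Python sketch. The helper name `raid2_roles` and the 1-based numbering are assumptions chosen to match the seven-disk example, not an actual RAID 2 implementation.

```python
def raid2_roles(total_disks):
    """Split 1-based disk positions into check disks and data disks.

    Positions that are powers of two (1, 2, 4, 8, ...) hold Hamming check
    codes; every other position holds data.
    """
    positions = range(1, total_disks + 1)
    # p & (p - 1) == 0 is true exactly when p is a power of two.
    check = [p for p in positions if p & (p - 1) == 0]
    data = [p for p in positions if p & (p - 1) != 0]
    return check, data

check, data = raid2_roles(7)
```

Calling the same function with 15 disks gives four check disks, so the share of disks spent on verification falls from 3 of 7 to 4 of 15 as the array grows.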
RAID 3:
RAID 3 is striping with dedicated parity. Each stripe reserves space equal to one block for redundant information, that is, parity. Parity is encoded information that can be used to restore the data if data on a disk is corrupted or a disk fails. In data-intensive or single-user environments, RAID 3 is advantageous for long sequential accesses. However, like RAID 2, its performance drops when accesses are short.
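How parity restores lost data can be shown with a minimal Python sketch. It assumes XOR parity, the usual choice for parity-based RAID levels; the function name `xor_blocks` and the sample bytes are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks of one stripe and the parity block computed from them.
data = [bytes([0x01, 0x02]), bytes([0x10, 0x20]), bytes([0xFF, 0x00])]
parity = xor_blocks(data)

# Simulate losing the second disk: XOR the survivors with the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Because XOR is its own inverse, combining the surviving blocks with the parity block yields exactly the missing block.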
RAID 4:
RAID 4 is an independent disk structure with a dedicated parity disk. It is similar to RAID 3, except that RAID 4 accesses data block by block: RAID 3 works on a whole stripe across all disks at a time, while RAID 4 works on one disk's block at a time. RAID 3 therefore usually needs to access every hard drive in the array, while RAID 4 only needs to access the drive that holds the required data. This greatly improves read speed. For writes, however, the old data and old parity must first be read from the data drive and the parity drive and combined with the new data, and then the updated data and parity are written back, so writes take longer than with RAID 3.
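The extra work on a RAID 4 write can be sketched as follows, again assuming XOR parity. The function name `updated_parity` and the one-byte blocks are hypothetical and only serve the illustration.

```python
def updated_parity(old_parity, old_data, new_data):
    """Recompute parity for a single-block write without reading other disks.

    XORing the old data out of the parity and the new data in gives the same
    result as recomputing parity over the whole stripe.
    """
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))

# A two-data-disk stripe: block d0 is overwritten, block d1 is left untouched.
d0_old, d1 = bytes([0x0F]), bytes([0xF0])
parity_old = bytes(a ^ b for a, b in zip(d0_old, d1))
d0_new = bytes([0x3C])
parity_new = updated_parity(parity_old, d0_old, d0_new)
```

The write therefore costs two reads (old data, old parity) and two writes (new data, new parity), which is the extra processing time mentioned above.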
RAID 5:
RAID 5 is also called striping with distributed parity. Each stripe reserves space equal to one block for parity. Unlike RAID 3, RAID 5 distributes the parity information across all disks rather than placing it on one disk, which removes the load that would otherwise fall on a single parity disk. Despite some loss of capacity, RAID 5 offers good overall performance and is a widely used disk array solution. It is suited to I/O-intensive applications with a high read-to-write ratio, such as transaction processing.
RAID 5 redundancy requires an array of at least three disks. RAID 5 can be implemented either in disk array controller hardware or in software by some network operating systems.
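The idea of spreading parity over all disks can be sketched like this. The rotation used here (parity starts on the last disk and moves one disk to the left on each stripe) is an assumption for illustration; real controllers use several different layouts, and the function names are hypothetical.

```python
def raid5_parity_disk(stripe, num_disks):
    """Return the 0-based disk holding parity for a stripe (assumed rotation)."""
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (num_disks - 1 - stripe) % num_disks

def raid5_stripe_layout(stripe, num_disks):
    """Describe one stripe as a list of 'D' (data) and 'P' (parity) markers."""
    parity = raid5_parity_disk(stripe, num_disks)
    return ["P" if disk == parity else "D" for disk in range(num_disks)]

# Four consecutive stripes on a hypothetical four-disk array.
rows = ["".join(raid5_stripe_layout(s, 4)) for s in range(4)]
```

Over four stripes each disk holds parity exactly once, so no single disk carries all the parity traffic.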
RAID 6:
RAID 6 is an independent disk structure with two sets of distributed parity. It extends RAID 5 with a second parity block placed on a different disk, so it can withstand more than one drive failing at the same time. However, calculating the parity values and verifying data correctness takes considerable time, which loads the system heavily and greatly reduces overall disk performance. It also requires an extremely complex controller. And because a second parity value is introduced, N + 2 disks are needed.
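The capacity cost of parity can be made concrete with a small calculation. This is a rough sketch assuming equal-sized disks; the table `PARITY_DISKS` and the function `usable_capacity_gb` are hypothetical names introduced for the example.

```python
# Number of disks' worth of space consumed by parity at each level.
PARITY_DISKS = {0: 0, 3: 1, 4: 1, 5: 1, 6: 2}

def usable_capacity_gb(level, num_disks, disk_gb):
    """Usable space of an array of equal-sized disks, in GB."""
    parity = PARITY_DISKS[level]
    if num_disks <= parity:
        raise ValueError("not enough disks for this RAID level")
    return (num_disks - parity) * disk_gb

# Six 80 GB disks as RAID 5 versus RAID 6.
raid5_gb = usable_capacity_gb(5, 6, 80)
raid6_gb = usable_capacity_gb(6, 6, 80)
```

With the same six disks, RAID 6 gives up one more disk of capacity than RAID 5 in exchange for tolerating a second failure.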
RAID 7:
RAID 7 comes with an intelligent real-time operating system and storage management software tools. It runs independently of the host and does not consume host CPU resources. The RAID 7 storage computer operating system is a real-time, event-driven operating system. It is mainly used to initialize the system, schedule all data transfers in the RAID 7 disk array, and map them to the corresponding physical drives. Through this operating system, read/write speeds can be set and controlled so that the host's I/O transfer performance is optimal. If a disk fails, it can perform automatic recovery and manage the rebuild onto the spare disk.
RAID 7 uses asynchronous access, which greatly relieves the write bottleneck and increases I/O speed. (Asynchronous access means that each I/O interface in RAID 7 has a dedicated high-speed channel that carries data or control information, so access to each disk in the system can be controlled independently.) If a RAID 7 array has N disks, then apart from one parity disk (used for redundancy calculations), it can simultaneously process random read/write commands from N-1 host systems, which significantly improves I/O performance. The built-in real-time operating system can automatically optimize the read/write commands sent by the hosts and intelligently read data into the cache ahead of time, which greatly reduces disk head movement and increases I/O speed. RAID 7 helps users manage ever larger data storage systems effectively and can at least double system operating efficiency, meeting the varied needs of different users.
RAID 10 (RAID 0 + 1):
RAID 10 is also known as a mirrored stripe array; it is now generally called RAID 0 + 1. RAID 10 (RAID 0 + 1) provides 100% data redundancy and supports larger volumes. It requires four disks: two are striped, providing RAID 0 read/write performance, and the other two mirror the first two, ensuring a complete copy of the data.
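The four-disk arrangement described for RAID 10 (RAID 0 + 1) can be sketched as a block-to-disk mapping. The 0-based disk numbering (disks 0 and 1 striped, disks 2 and 3 as their mirrors) and the function name `raid01_locations` are assumptions for illustration only.

```python
def raid01_locations(block_index, stripe_width=2):
    """Return the (primary, mirror) disks that hold a 0-based logical block."""
    # The block is striped across the first `stripe_width` disks...
    primary = block_index % stripe_width
    # ...and the same block is copied to the matching disk in the mirror set.
    mirror = primary + stripe_width
    return primary, mirror

# Where the first four blocks end up on a four-disk array.
placement = [raid01_locations(b) for b in range(4)]
```

Every block exists on two disks, which is where the 100% redundancy comes from, while reads and writes still alternate between two disks as in RAID 0.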
RAID 30:
RAID 30 is also called a dedicated parity stripe array. It combines the features of RAID 0 and RAID 3. It consists of two RAID 3 groups (three disks in each group) that use dedicated parity, and these two groups are combined into a RAID 0 array that stripes data across them. RAID 30 provides fault tolerance and supports larger volumes. Like RAID 10, RAID 30 provides high reliability because data remains available even if two physical disk drives fail (one in each array).
RAID 30 requires a minimum of six drives and is best suited to non-interactive applications such as video streaming, graphics, and image processing. These applications process large files sequentially and require high availability and high speed.
RAID 50:
RAID 50 is called a distributed parity stripe array. Like RAID 30, it combines the features of two levels, in this case RAID 5 and RAID 0. It consists of two RAID 5 groups (each with at least three disks) that use distributed parity, and the two groups are combined into a RAID 0 array that stripes data across them. RAID 50 provides reliable data storage and excellent overall performance, and supports larger volumes. Even if two physical disks fail (one in each array), the data can still be recovered.
RAID 50 requires at least six drives. It is ideal for applications that require highly reliable storage, high read speed, and high data transfer performance. These applications include transaction processing and office applications in which many users access small files.
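The "one failure in each array" rule that applies to both RAID 30 and RAID 50 can be checked with a short Python sketch. It assumes disks are numbered 0-based and grouped consecutively; the function name `nested_array_survives` is hypothetical.

```python
from collections import Counter

def nested_array_survives(failed_disks, disks_per_group):
    """Return True if no single-parity group has lost more than one disk."""
    # Disks 0..k-1 form group 0, disks k..2k-1 form group 1, and so on.
    failures_per_group = Counter(d // disks_per_group for d in failed_disks)
    return all(count <= 1 for count in failures_per_group.values())

# Six disks in two groups of three.
one_per_group = nested_array_survives([0, 3], 3)   # one failure in each group
same_group = nested_array_survives([0, 1], 3)      # two failures in one group
```

Two failures are survivable only when they fall in different groups; two failures inside the same group exceed what that group's single parity can rebuild.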
RAID 53:
RAID 53 is called a high-efficiency data transfer disk structure. It is implemented like a level 0 striped array in which each segment is a RAID 3 array. Its redundancy and fault tolerance are the same as RAID 3. This benefits systems that need RAID 3 with a higher data transfer rate, but it is expensive and relatively inefficient.
RAID 1.5:
RAID 1.5 is a new disk array mode. It has the features of RAID 0 + 1, except that it requires only two hard disks. On the surface, the two disks in a RAID 1.5 array hold identical data. RAID 1.5 is also a mode that cannot make full use of disk space: two 80 GB hard disks in RAID 1.5 behave like RAID 1, that is, only 80 GB of space is actually usable and the other 80 GB holds the backup copy. If the two disks are separated and each is run on its own in the original system, both work without problems. In practice, however, we found that once the two disks have run separately, even slight differences in their data cause the array to be rebuilt when they are recombined, and the data cannot be fully recovered: the disk with less data prevails.
Since RAID 1.5 and RAID 1 produce very similar results, how does RAID 1.5 achieve RAID 0-style read/write operation? So far, we have no material to prove the following assumption: the disk array control chip has advanced control functions that allow two disks