Introduction to RAID technology categories

Source: Internet
Author: User
Tags: disk usage, IDE, hard drive

Overview

RAID is the abbreviation of "Redundant Array of Independent Disks". The technology was proposed in 1987 at the University of California, Berkeley.

Put simply, RAID combines N hard disks through a RAID controller (implemented in hardware or software) into a single large-capacity virtual disk. Its distinguishing features are that the N disks are read at the same time, which speeds up access, and that the array provides fault tolerance. RAID is therefore used as storage for data that is accessed regularly, not as a backup solution.

The basic concept of RAID is EDAP (Extended Data Availability and Protection), which emphasizes scalability and fault tolerance. This is also the focus of vendors such as Mylex, IBM, HP, Compaq, Adaptec, and Infortrend, and it includes handling the following actions without downtime:
RAID disk arrays support automatic detection of faulty hard disks;
RAID disk arrays support rebuilding data from bad tracks on a hard disk;
RAID disk arrays support hot spares, so that a standby disk takes over without downtime;
RAID disk arrays support hot swapping, so that a hard disk can be replaced without downtime;
RAID disk arrays support expansion of hard disk capacity.
Once a RAID array fails, the hardware service provider can typically only reinitialize or rebuild the array, after which the customer's data can no longer be recovered. Recovering data from a faulty RAID 0, RAID 1, RAID 5, or combined-level array is therefore only possible as long as the array has not been reinitialized.

Technical Specifications

  (1) Introduction to RAID specifications
  
RAID was originally developed to combine small, low-cost disks in place of large, expensive ones and so reduce the cost of mass data storage. It was also intended to use redundant information so that data remains accessible when a disk fails. The result was a data protection technology with defined levels that can also improve the data transfer rate to some degree.
In the past, RAID was available only on high-end servers, as a companion technology for high-end SCSI hard disks. More recently, as technology has developed and product costs have fallen, the performance of IDE hard drives has improved greatly, and the spread of RAID chips has gradually brought RAID to PCs.

So why is it called a redundant disk array? "Redundant" means surplus or duplicated. A disk array is not a single disk but a group of disks. The array uses duplicated disk capacity to hold data, which improves data reliability.

  (2) How RAID works
How does RAID achieve highly reliable data storage? Let's look at how it works. RAID is divided into levels according to the underlying implementation principle, and each level works differently. A RAID array as a whole is a structure made of several disks, combined to improve efficiency and reduce errors. The terminology can look intimidating, but the principles are quite simple. In the original figures, each square represents a disk, and a group of corresponding blocks across the array is called a stripe (sometimes translated as "band").
  (3) RAID specifications
The specifications mainly range from RAID 0 to RAID 7, each with a different focus. Common specifications include:
  RAID 0: Striping without fault tolerance
RAID 0 requires at least two hard drives. It implements striping: data is not stored on a single disk but split into blocks that are spread across the drives. Because the data is distributed, throughput improves greatly and the load on the drives is balanced; efficiency is highest when the requested data happens to lie on different drives. RAID 0 does not need to compute any check code and is easy to implement. Its disadvantage is that it has no error control: if the data on one drive is bad, correct data on the other drives does not help. It should not be used where data reliability requirements are high; it suits image and animation editing and other scenarios that need high transfer rates. For example, if a file is distributed across two hard disks, both can be read at the same time, so the time to read the file is ideally cut to about 1/2. RAID 0 is the fastest of all the levels, but it has no redundancy: if one physical disk is damaged, all of the data becomes unusable.
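
To make the striping idea concrete, here is a minimal Python sketch, not a real driver. The function names `stripe_write` and `stripe_read`, the in-memory lists that stand in for disks, and the `block_size` parameter are all assumptions made for illustration.

```python
from typing import List


def stripe_write(data: bytes, num_disks: int, block_size: int) -> List[List[bytes]]:
    """Split data into blocks and deal them round-robin across the disks."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    disks: List[List[bytes]] = [[] for _ in range(num_disks)]
    for start in range(0, len(data), block_size):
        block_index = start // block_size
        # Block k lands on disk k mod num_disks.
        disks[block_index % num_disks].append(data[start:start + block_size])
    return disks


def stripe_read(disks: List[List[bytes]]) -> bytes:
    """Reassemble the data by visiting the disks in the same round-robin order."""
    out = bytearray()
    depth = max(len(disk) for disk in disks)
    for row in range(depth):
        for disk in disks:
            if row < len(disk):
                out += disk[row]
    return bytes(out)


# Hypothetical example: ten bytes, two disks, two-byte blocks.
disks = stripe_write(b"ABCDEFGHIJ", num_disks=2, block_size=2)
restored = stripe_read(disks)
```

Each simulated disk holds only part of the file, which is why both can be read concurrently and also why losing either one makes the whole file unreadable.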
  RAID 1: Mirrored structure
With RAID 1, the RAID controller must be able to read from both disks and write to both mirrored disks at the same time, so at least two drives are required. Mirroring improves the system's fault tolerance when one disk of the group has a problem, and it is relatively easy to design and implement. Each read fetches data from only one disk, so the transfer rate of a data block is the same as the read rate of a single independent disk. Because RAID 1 keeps a complete copy of all data, it places a considerable load on the system; it is usually implemented in software, which noticeably reduces server efficiency when the server is heavily loaded. RAID 1 is appropriate when the system requires high reliability, for example for statistical data. It also supports hot replacement: the faulty disk is replaced without powering the system off, after which the data only needs to be copied back from the mirror disk. When the primary hard disk is damaged, the mirror disk takes over its work, so the mirror disk is in effect a standby copy of the primary. This mode is very safe: RAID 1 offers the best data security of all RAID levels. However, its disk utilization is only 50%, the lowest of all RAID levels.
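
The mirroring behaviour described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the class name `MirrorPair`, its methods, and the dictionaries standing in for disks are hypothetical and do not correspond to any real controller API.

```python
from typing import Dict, List, Optional


class MirrorPair:
    """Toy RAID 1: two in-memory 'disks' that always hold the same blocks."""

    def __init__(self) -> None:
        # None marks a failed member.
        self.disks: List[Optional[Dict[int, bytes]]] = [{}, {}]

    def write(self, block_no: int, data: bytes) -> None:
        # Every write goes to all surviving members.
        for disk in self.disks:
            if disk is not None:
                disk[block_no] = data

    def read(self, block_no: int) -> bytes:
        # A read is served by the first surviving member only.
        for disk in self.disks:
            if disk is not None:
                return disk[block_no]
        raise IOError("both mirror members have failed")

    def fail(self, index: int) -> None:
        self.disks[index] = None

    def replace(self, index: int) -> None:
        # Rebuild the new member by copying from the other one.
        survivor = self.disks[1 - index]
        if survivor is None:
            raise IOError("no surviving copy to rebuild from")
        self.disks[index] = dict(survivor)

    def utilization(self) -> float:
        # One disk's worth of data is stored on two disks.
        return 1 / len(self.disks)


pair = MirrorPair()
pair.write(0, b"ledger")
pair.fail(0)                 # the primary disk is damaged
recovered = pair.read(0)     # the mirror answers instead
pair.replace(0)              # hot replacement followed by a rebuild
```

The `utilization` method returns 0.5, matching the 50% disk utilization mentioned in the text.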
  
  RAID 2: Verification with Hamming code
Conceptually, RAID 2 is similar to RAID 3: both stripe data across the hard disks in units of bits or bytes. RAID 2, however, uses an encoding technique that provides error detection and recovery. The encoding needs several disks to hold the check and recovery information, which makes RAID 2 more complicated to implement, so it is rarely used in commercial environments. In the original figure, each disk on the left holds one bit of a data word, and the Hamming check code computed bitwise from that word is stored on additional disks. Thanks to the properties of the Hamming code, an error in the data can be corrected so that the output is still correct. The data transfer rate is quite high; to reach the ideal speed, it is best to improve the disks that store the ECC check code. The controller design is simpler than for RAID 3, 4, or 5. There is no free lunch, though: using a Hamming code means paying the cost of data redundancy. The output data rate equals that of the slowest drive in the group.
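
As an illustration of how a Hamming code can locate and correct a single wrong bit, here is a small Python sketch using the classic Hamming(7,4) code. The choice of this particular code and the function names are assumptions for the example; the article does not say which Hamming code a given RAID 2 implementation uses. Each of the seven positions can be pictured as living on its own disk.

```python
from typing import List


def hamming74_encode(data_bits: List[int]) -> List[int]:
    """Encode 4 data bits into 7 bits laid out as [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(code_bits: List[int]) -> List[int]:
    """Fix at most one flipped bit and return the 4 data bits."""
    c = list(code_bits)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the bad bit, or 0 if none.
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = [1, 0, 1, 1]
stored = hamming74_encode(word)
damaged = list(stored)
damaged[4] ^= 1                      # one "disk" returns a wrong bit
repaired = hamming74_correct(damaged)
```

Three check bits protect four data bits here, which shows the redundancy cost the text refers to.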

  RAID 3: Parallel transmission with parity code
The check code here differs from that of RAID 2: parity on its own can only detect an error, not correct it. RAID 3 processes one stripe at a time when accessing data, which improves read and write speed. It stores data in parallel like RAID 0, although it is not faster than RAID 0. The parity is generated when data is written and is saved on a separate disk, so at least three drives are required. Both write speed and read speed are high, and because there are relatively few parity bits, the computation time is short. Implementing RAID 3 control in software is very difficult, and the controller is not easy to implement either. It is mainly used for graphics (including animation) and other scenarios that require high throughput.
Unlike RAID 2, RAID 3 uses a single disk to store the parity information. If one data disk fails, its position is known, so its data can be regenerated from the parity disk and the remaining data disks; if the parity disk itself fails, the data remains usable. RAID 3 delivers a good transfer rate for large amounts of sequential data, but for random data the parity disk becomes the bottleneck for write operations. Data security is not as high as with mirroring, but using a single check disk greatly improves disk utilization: n-1 of the n disks hold data.
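
The regeneration of a failed disk from the parity disk and the remaining data disks comes down to XOR. The following Python sketch shows the idea; the helper name `xor_blocks` and the sample byte values are illustrative assumptions, and real arrays operate on much larger blocks.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks and one dedicated parity disk (n = 4, so n-1 hold data).
data_disks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity_disk = xor_blocks(data_disks)

# Suppose data disk 1 fails: XOR of everything that is left gives it back.
survivors = [data_disks[0], data_disks[2], parity_disk]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, the same function both generates the parity and rebuilds a missing member.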
  RAID 4: Independent disk structure with parity code
RAID 4 is very similar to RAID 3. The difference is that it accesses data by block, that is, one disk at a time. In the original figure, RAID 3 accesses one horizontal line at a time, while RAID 4 accesses one vertical line at a time. It otherwise shares the characteristics of RAID 3, but failure recovery is much more difficult than with RAID 3, the controller is much harder to design, and data access efficiency is not very good.

  
  RAID 5: Independent disk structure with distributed parity
In RAID 5 the parity code is spread across all of the disks; in the original figure, P0 denotes the parity of stripe (band) 0, and so on. Read efficiency is very high, write efficiency is average, and block-level collective access performs well. Because the parity for different stripes lives on different disks, reliability is improved and the failure of a single disk can be tolerated. RAID 5 also protects data with parity, but instead of keeping the parity on a dedicated disk it stores the parity of the data segments in turn on every disk. If any one disk is damaged, the lost data can be rebuilt from the data and parity on the other disks. Disk utilization is n-1 out of n disks.
However, RAID 5 does not handle concurrent data transfers well, and the controller is quite difficult to design. An important difference between RAID 3 and RAID 5 is that every transfer in RAID 3 involves all disks of the array, whereas in RAID 5 most transfers touch only one disk and can proceed in parallel. RAID 5 also has a "write penalty": each write operation causes four physical I/O operations, namely two reads (old data and old parity) and two writes (new data and new parity).
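
Two points from the RAID 5 description can be sketched in Python: how parity moves from disk to disk, and why a small write costs four I/O operations. The rotation rule used in `raid5_layout` is only one possible placement and is an assumption, as are the function names; real controllers use several different layouts.

```python
from typing import List


def raid5_layout(num_disks: int, num_stripes: int) -> List[List[str]]:
    """Label each block as data (D) or parity (P), rotating parity per stripe."""
    rows: List[List[str]] = []
    block = 0
    for stripe in range(num_stripes):
        # Assumed rule: parity starts on the last disk and moves left.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


def small_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Read-modify-write: new parity = old parity XOR old data XOR new data."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


layout = raid5_layout(num_disks=3, num_stripes=3)

# One stripe with two data blocks and their parity.
d0, d1 = b"\x0f", b"\xf0"
parity = bytes(a ^ b for a, b in zip(d0, d1))

# Updating d0 means: read old d0, read old parity (2 reads),
# then write new d0 and write new parity (2 writes).
new_d0 = b"\x3c"
new_parity = small_write_parity(d0, parity, new_d0)
```

The update never has to read `d1`, which is why most small transfers touch only the target disk and the disk holding that stripe's parity.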

  RAID 6: Independent disk structure with two distributed parity codes
The name is long, but the original figure makes the reason clear: P0 denotes the parity of stripe (band) 0, while PA denotes the parity of data block A. RAID 6 is an extension of RAID 5, used mainly where data errors cannot be tolerated. Because a second parity value is introduced, N+2 disks are required. The controller design becomes very complicated and write speed is poor, since computing the parity values and verifying the data takes a lot of time and adds load. I think hardly anyone except the military can afford this kind of thing.

  RAID 7: Optimized high-speed data transmission disk structure
All I/O transfers in RAID 7 take place simultaneously and can be controlled separately, which improves the concurrency of the system and the speed at which it accesses data. Each disk has its own high-speed cache, and the real-time operating system can run on any real-time operation chip, meeting the needs of different real-time systems. SNMP can be used for management and monitoring, and a dedicated transfer channel can be assigned to the check area to improve efficiency. Multiple hosts can be connected. Because of the added cache, access time is close to zero when multiple users access the system, and the parallel structure greatly improves data access efficiency. The cache cuts both ways, however: once the system loses power, all data in the cache is lost, so RAID 7 has to be paired with a UPS. And, of course, something this fast is expensive.

  RAID 10: Highly reliable and efficient disk structure
This structure is simply a striped structure plus a mirrored structure. Each has its own strengths and weaknesses, and the two complement each other to achieve both high speed and high reliability; the combined structure can be understood from the advantages and disadvantages of its two parts. It is expensive and does not scale well. It is mainly used for databases that are small in capacity but demand speed and error control.

  RAID 53: Efficient data transmission disk structure
Like the previous one, this structure reuses earlier structures: it combines RAID 3 with striping, so it is fast and fault tolerant. However, it is very expensive and not easy to implement, because all data has to pass through both the striping and the bit-wise storage methods, and for the sake of efficiency the disks must be kept synchronized, which is not easy.
  RAID 0+1:
RAID 0+1 combines RAID 0 and RAID 1. Data is distributed across multiple disks, and each disk also has its own physical mirror disk. This provides full redundancy, allows one or more disks to fail without affecting data availability (as long as a disk and its own mirror do not both fail), and gives fast reads and writes. At least four hard disks are required.
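
Whether such an array survives a given set of disk failures can be checked with a short Python sketch. It assumes the layout described above, where every data disk has its own mirror; the disk numbering (disks 2k and 2k+1 form mirror pair k) and the function name are hypothetical.

```python
from typing import Set


def array_survives(num_pairs: int, failed: Set[int]) -> bool:
    """Data stays available as long as no mirror pair has lost both members.

    Assumed numbering: disks 2k and 2k+1 are the two members of pair k.
    """
    for pair in range(num_pairs):
        members = {2 * pair, 2 * pair + 1}
        if members <= failed:  # both copies of this stripe member are gone
            return False
    return True


# Four disks: pair 0 is disks {0, 1}, pair 1 is disks {2, 3}.
one_from_each_pair = array_survives(2, {0, 2})
whole_pair_lost = array_survives(2, {0, 1})
```

With four disks, two failures are survivable only if they hit different pairs, which is what "one or more disks" means in practice.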

  (4) JBOD mode
JBOD is also called span. It logically chains several physical disks one after another to form one large logical disk. JBOD provides no fault tolerance, and the size of the array equals the total capacity of all the disks in the span. Strictly speaking, JBOD is not part of RAID, but many IDE RAID control chips now offer it as a mode. JBOD simply adds up disk capacity and the system does no parallel processing: when data is written, the first disk is filled first, then the second, and so on.
In practice, the most common levels are RAID 0, RAID 1, RAID 5, and RAID 10. In most cases RAID 5 covers the advantages of RAID 2 through RAID 4, so those levels have essentially left the market.
It is now generally held that RAID 2 through RAID 4 are used only in RAID research and development.
  (5) Available IDE RAID
The above describes the principles of RAID; what we PC fans care about most is how RAID can be used. We use IDE hard drives every day, and IDE RAID cards and motherboards with integrated RAID chips are easy to buy, so IDE RAID is the form closest to us. Because it targets low-end applications, most IDE RAID supports only the RAID 0, RAID 1, RAID 0+1, and JBOD modes.
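
Since JBOD (span) mode is among the modes these chips support, the following Python sketch shows how a logical block address could map onto a chain of disks that is filled one after another. The function name `jbod_locate` and the example disk sizes are assumptions made for illustration.

```python
from typing import List, Tuple


def jbod_locate(disk_sizes: List[int], logical_block: int) -> Tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk)."""
    if logical_block < 0:
        raise IndexError("logical block must not be negative")
    offset = logical_block
    for index, size in enumerate(disk_sizes):
        if offset < size:
            return index, offset
        # This disk is completely before the target; skip past it.
        offset -= size
    raise IndexError("logical block is beyond the end of the span")


# Hypothetical span of three disks with different sizes (in blocks).
sizes = [100, 50, 200]
total_capacity = sum(sizes)          # capacity is the plain sum of all disks
first = jbod_locate(sizes, 0)
second_disk_start = jbod_locate(sizes, 100)
last = jbod_locate(sizes, 349)
```

Unlike striping, consecutive logical blocks stay on the same disk until it is full, so there is no parallel speed-up.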

RAID applications

Initially, RAID solutions mainly targeted SCSI hard drive systems, and system cost was relatively high. In 1993, HighPoint launched the first IDE-RAID control chip, which made it possible to build a RAID system from relatively cheap IDE hard disks and greatly lowered the "threshold" for RAID. Individual users then began to pay attention to the technology: the hard disk is the slowest and least secure device in a modern personal computer, and the data users store on it is often worth far more than the computer itself. For a relatively small cost, RAID lets individual users enjoy up to double the disk speed and higher data security. The IDE-RAID control chips on the PC market come mainly from HighPoint and Promise, with some from AMI.
IDE-RAID chips aimed at individual users generally support only the RAID 0, RAID 1, and RAID 0+1 (RAID 10) specifications. Technically they cannot compare with commercial systems, but the speed gain and data security they offer are sufficient for ordinary users. As hard disk interface transfer rates keep rising, IDE-RAID chips are updated constantly: mainstream chips all support the ATA 100 standard, and HighPoint's new HPT 372 and Promise's latest PDC20276 even support ATA 133 IDE hard drives. With competition among motherboard manufacturers intensifying and personal computer users becoming more demanding, quite a few vendors now integrate RAID chips on the motherboard, so users can build their own disk array directly, without buying a separate RAID card, and experience the speed for themselves.
