Comparison of individual RAID levels: performance, speed, and read/write behavior (RAID 1 reads are no different from a single disk)

Source: Internet
Author: User
For overall performance (data security and speed together), RAID5 is the best choice. For reads, RAID1 is fastest; for data security, RAID1 is best; for writes, RAID0 is fastest. RAID5 combines these advantages.

Understanding Disk Array RAID

I. Functions
1 High-speed disk access (speed-up): RAID combines ordinary hard drives into a disk array. When the host writes data, the RAID controller splits the data into blocks and writes them to the disks in the array in parallel. When the host reads data, the RAID controller reads the blocks scattered across the drives in parallel, reassembles them, and returns them to the host. These parallel read and write operations improve the access speed of the storage system.
2 Capacity expansion

3 Data redundancy

II. Classification

RAID is divided into levels 0 through 6, commonly written RAID0, RAID1, RAID2, RAID3, RAID4, RAID5, and RAID6.

RAID0: RAID0 is not a true RAID structure because it has no data redundancy. It continuously splits data and reads or writes it across multiple disks in parallel, so it has a high data transfer rate. However, RAID0 improves performance without providing any data reliability: if one disk fails, the entire data set is affected. RAID0 therefore cannot be used for critical applications that require high data availability.

RAID1: RAID1 achieves data redundancy through mirroring, keeping identical copies of the data on a pair of separate disks. RAID1 can improve read performance, because data can be read directly from the mirror copy when the original disk is busy. RAID1 has the highest cost per unit of usable capacity of any disk array, but it provides the highest data availability. When a disk fails, the system automatically switches to the mirror disk without having to reconstruct the lost data.

RAID2: Conceptually, RAID2 is similar to RAID3: both distribute data across different hard disks in units of bits or bytes. RAID2, however, uses a Hamming error-correcting code to provide error checking and recovery. This coding scheme needs several disks to store the check and recovery information, which makes RAID2 more complex to implement. It is therefore rarely used in commercial environments.

RAID3: Unlike RAID2, RAID3 uses a single disk to store parity information. If a data disk fails, its contents can be regenerated from the parity disk and the remaining data disks. If the parity disk fails, access to the data is not affected. RAID3 delivers good transfer rates for large amounts of sequential data, but for random data the parity disk becomes a bottleneck for write operations.

RAID4: Like RAID2 and RAID3, RAID4 and RAID5 split data and distribute it across different disks, but the stripe unit is a block or record. RAID4 uses one disk as a dedicated parity disk. Every write must access the parity disk, so the parity disk becomes a write bottleneck. RAID4 is rarely used in commercial applications.

RAID5: Instead of using a separate parity disk, RAID5 distributes both data and parity information across all disks. Several read and write operations can therefore proceed on the array at the same time, which gives higher data throughput. RAID5 is better suited to small data blocks and random reads and writes. The important difference between RAID3 and RAID5 is that in RAID3 every transfer involves all of the disks in the array, whereas in RAID5 most transfers touch only a single disk and can run in parallel. RAID5 suffers from a "write loss" (write penalty): each write operation results in four actual disk operations, two reads (the old data and the old parity) and two writes (the new data and the new parity).
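The "write loss" of RAID5 can be illustrated with a small sketch. It assumes the parity is a plain XOR of the data blocks in a stripe, so the new parity can be derived from the old data, the old parity, and the new data without touching the other disks. The function names and the in-memory byte strings are hypothetical stand-ins for real disk blocks, not part of any RAID implementation.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equally sized blocks byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Update one data block of a stripe and return (new_parity, io_count).

    The controller reads the old data and old parity (2 reads), then
    writes the new data and the new parity (2 writes): 4 operations.
    """
    # Remove the old data's contribution from the parity, add the new one.
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    return new_parity, 4


# A stripe with two data blocks and their parity block.
d0 = b"\x0f\x0f"
d1 = b"\xf0\x01"
parity = xor_bytes(d0, d1)

# Overwrite d0 only; d1 is never read.
new_d0 = b"\xaa\x55"
new_parity, io_count = small_write(d0, parity, new_d0)

# The shortcut gives the same parity as recomputing it from scratch.
assert new_parity == xor_bytes(new_d0, d1)
```

The point of the shortcut is that the untouched data disks stay free for other requests, at the price of four disk operations for one logical write.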

RAID6: Compared with RAID5, RAID6 adds a second, independent block of parity information. The two independent parity schemes use different algorithms, so data reliability is very high: even if two disks fail at the same time, the data remains usable. However, more disk space has to be allocated to parity, and the write penalty is larger than in RAID5. Poor write performance and a complex implementation mean that RAID6 is rarely used.

III. Detailed introduction

RAID0

The purpose of RAID0 is to raise speed and expand capacity.

In RAID0 mode, data is divided into chunks that are interleaved across several hard drives. In general, the number of pieces the data is split into depends on the number of hard disks in the array. For example, with 3 hard drives in RAID0, the data is divided into three parts that are written to the three drives in turn. Put simply, this mode uses RAID technology to make the system see the three hard drives as one larger drive. Because no parity or verification is computed along the way, RAID0 has the fastest reads and writes of all RAID modes.
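The interleaving described above can be sketched in a few lines. This is only an illustration: the helper names `stripe` and `unstripe`, the round-robin placement, and the use of in-memory lists as "disks" are assumptions made for the example.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int) -> list:
    """Split data into chunks and deal them out to the disks in turn."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks: list) -> bytes:
    """Reassemble the original data by reading the disks row by row."""
    out = []
    rows = max((len(d) for d in disks), default=0)
    for row in range(rows):
        for disk in disks:
            if row < len(disk):  # the last row may be incomplete
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJ", num_disks=3, chunk_size=2)
assert unstripe(disks) == b"ABCDEFGHIJ"
```

With three disks and a chunk size of two bytes, consecutive chunks land on disk 0, disk 1, disk 2, then disk 0 again, which is what lets a real controller issue the transfers in parallel.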

RAID0 gives no consideration to data safety. If one hard drive in a RAID0 array fails, all of the data is corrupted and cannot be recovered. This makes RAID0 very unsafe, and many users avoid it for that reason. Nonetheless, RAID0 is the fastest of all RAID modes: with two hard drives, a RAID0 array reads and writes data about twice as fast as a single drive, and with 6 hard drives the theoretical rate is 6 times that of a single drive.

Mixing different hard disks in RAID0 causes two problems. First, the effective capacity is the capacity of the smallest disk multiplied by the number of disks, because RAID0 keeps distributing data evenly across all drives and cannot store any more once the smallest drive is full. Second, if the disks differ in speed, the overall speed is the speed of the slowest disk multiplied by the number of disks, because each storage operation must complete before the next one starts, so the faster drives sit idle waiting for the slowest one to finish reading or writing, which degrades overall performance. For RAID0 it is therefore best to choose hard disks of the same capacity and speed, preferably the same model from the same brand.
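The two rules for mixed disks can be written down as simple arithmetic. The function name `raid0_effective` and the sample figures are hypothetical, and the result is the theoretical upper bound described in the text, not a measured value.

```python
def raid0_effective(capacities_gb, speeds_mb_s):
    """Return (capacity, speed) of a RAID0 array built from unequal disks.

    Both values are limited by the weakest member:
    smallest capacity * number of disks, slowest speed * number of disks.
    """
    n = len(capacities_gb)
    if n < 2 or n != len(speeds_mb_s):
        raise ValueError("need at least 2 disks and one speed per disk")
    return min(capacities_gb) * n, min(speeds_mb_s) * n


# A 20GB disk at 50 MB/s paired with a 30GB disk at 80 MB/s.
capacity_gb, speed_mb_s = raid0_effective([20, 30], [50, 80])
assert (capacity_gb, speed_mb_s) == (40, 100)
```

In this example 10GB of the larger disk is unusable and the faster disk never runs above 50 MB/s, which is why identical drives are recommended.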

Strictly speaking, RAID0 is not a "redundant array of independent disks" at all, since it provides no redundancy. RAID0 is generally used where data must be processed quickly and data safety is not a requirement. It is simple and does not require a complex, expensive controller. RAID0 needs at least 2 hard drives, and the resulting storage capacity is the sum of the capacities of the drives.

RAID0 random read performance: very good
RAID0 random write performance: very good
RAID0 sequential read performance: very good
RAID0 sequential write performance: very good

Advantages of RAID0: the fastest read and write performance; performance is even better if each disk has its own controller.

Disadvantages of RAID0: the failure of any one hard drive loses all of the data; most implementations are done in software, so performance suffers.

RAID1

In RAID1 mode, the hard disks in the array mirror each other. When data is written, both hard drives store the same data at the same time, so if one drive fails the system keeps working from the other. RAID1 reads faster than a single hard disk, because the RAID controller can read the same data from the other drive when one drive is busy. Write performance does not improve and may drop slightly. When one drive fails, new data can still be written to the drive that is working; after the failed drive is replaced, the RAID controller automatically copies the data to the new disk. The main feature of RAID1 is its high redundancy, but because most of the functionality is implemented in software, it adds load to the processor. This RAID mode is ideal for users with extremely high data security requirements.
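The mirroring behavior can be modeled with a toy class. Everything here is an assumption made for illustration: the class name `Mirror`, the dictionaries standing in for disks, and the simple alternation between disks on reads, which stands in for "read from whichever drive is not busy".

```python
class Mirror:
    """Toy two-disk mirror: every write goes to both disks."""

    def __init__(self):
        self.disks = [{}, {}]          # block number -> data
        self.failed = [False, False]
        self._next = 0                 # disk to try first on the next read

    def write(self, block: int, data: bytes) -> None:
        # Write to every disk that is still working.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Alternate between disks, skipping a failed one.
        for _ in range(2):
            i = self._next
            self._next = (self._next + 1) % 2
            if not self.failed[i]:
                return self.disks[i][block]
        raise OSError("both mirror disks have failed")

    def replace(self, i: int) -> None:
        # Swap in a blank disk and copy everything from the survivor.
        self.disks[i] = dict(self.disks[1 - i])
        self.failed[i] = False


m = Mirror()
m.write(0, b"alpha")
m.failed[0] = True       # disk 0 dies
m.write(1, b"beta")      # writes continue on disk 1
assert m.read(0) == b"alpha" and m.read(1) == b"beta"
m.replace(0)             # rebuild onto the new disk
assert m.disks[0] == m.disks[1]
```

The sketch also shows why writes do not get faster: each logical write costs one physical write per disk.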

The hard disks used in RAID1 should be identical, otherwise disk space is wasted. Because RAID1 writes the same information to every disk, its effective capacity is the capacity of the smallest disk in the array. For example, if a RAID1 array has one 20GB disk and one 30GB disk, the effective capacity is 20GB and the remaining 10GB on the 30GB disk is wasted. Likewise, if the two drives differ in speed, the faster drive has to wait for the slower drive to finish each task before moving on to the next.

RAID1 random read performance: good
RAID1 random write performance: good
RAID1 sequential read performance: fair
RAID1 sequential write performance: good

Advantages of RAID1: high data reliability, simple design, and easy implementation.

Disadvantages of RAID1: slower than RAID0, especially for writes, and only half of the total disk capacity is usable.

RAID0+1

This RAID mode is a combination of RAID0 and RAID1 and requires at least 4 hard drives. The drives are paired into two RAID0 arrays, each of which can be viewed as one larger, faster hard drive, and the two RAID0 arrays are then mirrored as a RAID1 array. Such a system offers both higher disk performance and higher data security. The obvious disadvantages are higher cost and a more complex structure. RAID0+1 is second only to RAID5 in fault tolerance and is generally used in file servers.
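A short sketch can make the structure concrete: with four disks arranged as two RAID0 sets that mirror each other, the array stays usable as long as at least one of the two RAID0 sets has lost no disk. The function name `raid01_survives` and the disk numbering (disks 0 and 1 in one set, 2 and 3 in the other) are assumptions for this example.

```python
def raid01_survives(failed, set_a=(0, 1), set_b=(2, 3)) -> bool:
    """Return True if a RAID0+1 array still has one intact RAID0 set.

    A RAID0 set is lost as soon as any one of its disks fails, so the
    mirror survives only while one whole set is untouched.
    """
    failed = set(failed)
    a_intact = failed.isdisjoint(set_a)
    b_intact = failed.isdisjoint(set_b)
    return a_intact or b_intact


assert raid01_survives({0})         # one disk lost: still running
assert raid01_survives({0, 1})      # both losses in the same RAID0 set
assert not raid01_survives({0, 2})  # one loss in each set: data gone
```

Any single failure is survivable, while a second failure is only survivable if it hits the RAID0 set that is already broken.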

RAID0+1 random read performance: very good
RAID0+1 random write performance: good
RAID0+1 sequential read performance: very good
RAID0+1 sequential write performance: good

Advantages of RAID0+1: higher read and write performance than a single hard disk, and greatly improved data security.

Disadvantages of RAID0+1: high cost; at least 4 hard drives are required.

RAID2

RAID2 is rather complicated. The hard disks that store the data are combined as in RAID0, and additional hard disks are dedicated to storing a Hamming ECC check code. To protect the check code itself, the check disks are arranged as a RAID1 mirror of at least two disks. Even if one of the data disks is damaged, the RAID controller can use the Hamming code to rebuild the data onto a new hard disk. RAID2 is generally intended for large-scale data processing and supercomputer applications and is not suitable for ordinary users. Because the check code has to be generated while data is being stored, the performance of this array is not high. For a variety of reasons this mode has never been put into real commercial use, and because it is expensive, ordinary users are unlikely to adopt it.
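To give a feel for how a Hamming code locates and repairs a bad bit, here is a minimal sketch of the classic Hamming(7,4) code. Treating each of the seven bits as if it lived on its own disk is an assumption made purely for illustration; the source does not say which Hamming variant or disk layout a RAID2 controller would use, and the function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into 7 bits laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(code_bits):
    """Fix at most one flipped bit and return the 4 data bits."""
    c = list(code_bits)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # 0 means no error was detected
    if position:
        c[position - 1] ^= 1          # flip the bad bit back
    return [c[2], c[4], c[5], c[6]]


word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1                        # the bit on "disk" 5 goes bad
assert hamming74_correct(stored) == word
```

Three of every seven bits are check bits in this variant, which hints at why the text calls the scheme expensive.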

RAID2 random read performance: fair
RAID2 random write performance: poor, mainly because every operation has to go through ECC
RAID2 sequential read performance: very good
RAID2 sequential write performance: fair

Advantages of RAID2: high data security; data can be recovered as long as the disks that store the check code have not failed.

Disadvantages of RAID2: expensive, requires dedicated disks for the check code, inefficient, and has no commercial support.

RAID3

As in RAID2, RAID3 divides data into small pieces and stores them across multiple hard drives, except that RAID3 splits the data all the way down to bits. It offers fast reads and writes, although write speed suffers because parity has to be generated during every write. It also needs a dedicated hard disk to store the parity. When one of the data disks fails, the system keeps working, but performance is affected; if a second drive fails before the bad drive is replaced, the data in the array is lost and cannot be recovered. This mode requires the rotation of all the hard disks to be synchronized, which is hard to achieve in practice. RAID3 requires at least 3 hard drives, one of which stores the parity. The parity is computed with the XOR operation.
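XOR parity is easy to demonstrate: the parity block is the XOR of all data blocks, and XOR-ing the surviving blocks with the parity yields the missing block. The sketch below uses short byte strings as hypothetical disk contents; `xor_blocks` is a helper name chosen for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR any number of equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks and one dedicated parity disk.
data_disks = [b"\x12\x34", b"\xab\xcd", b"\x0f\xf0"]
parity_disk = xor_blocks(data_disks)

# Disk 1 fails: rebuild it from the two survivors plus the parity.
survivors = [data_disks[0], data_disks[2], parity_disk]
rebuilt = xor_blocks(survivors)
assert rebuilt == data_disks[1]
```

The same calculation cannot recover two missing blocks, which matches the statement that a second failure before replacement loses the array.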

If RAID3 is implemented with a software controller, performance suffers significantly because the scheme is complex. On the other hand, it can be built with as few as 3 hard drives, compared with 4 for RAID0+1, so the cost is lower. This kind of disk array is generally best suited to applications such as video processing and editing.

RAID3 random read performance: good
RAID3 random write performance: very poor
RAID3 sequential read performance: very good
RAID3 sequential write performance: fair

Advantages of RAID3: well suited to video editing and other workloads that move large amounts of data.

Disadvantages of RAID3: synchronizing the rotation of all drives is very difficult (most current hard drives do not support this feature), and a complex controller is required.

RAID4

RAID4 is almost the same as RAID3: data is divided into small pieces that are stored in sequence across several hard disks, and the parity is stored on a separate parity disk. The only difference is the unit of division: RAID3 works in bits, whereas RAID4 works in blocks. RAID4 therefore has the same read speed as RAID3, while its write speed is again limited by the need to generate parity during every write and store it on the parity disk.

The greatest benefit of this mode is that it does not require the rotation of the hard drives to be synchronized, which makes the controller less complicated. Its write performance is the worst of all RAID modes. As with RAID3, no data is lost when one hard drive is damaged, but if a second disk fails before the failed one has been replaced, all data is lost. Compared with other RAID modes, rebuilding the data of a failed drive is quite inefficient.

This disk array mode also requires at least 3 hard drives. The parity is computed with the XOR operation. It is suitable for general applications, including video processing. Its cost is low, because only one hard disk is needed as the parity disk.

RAID4 random read performance: very good
RAID4 random write performance: fair, mainly because parity has to be written to the parity disk
RAID4 sequential read performance: good
RAID4 sequential write performance: fair

Advantages of RAID4: shares the advantages of RAID3 and, in addition, does not require synchronized drive rotation.

Disadvantages of RAID4: poor write performance and demanding controller requirements.

RAID5

RAID5 needs at least three hard drives. It provides both the acceleration of RAID0 and the data protection of RAID1. With three hard drives in the array, the data being stored is split into fragments according to a user-defined stripe size and written to two of the drives. The third drive receives no file fragments for that operation; instead it receives check data for the fragments stored on the other two drives. This check data is generated by an algorithm that makes it possible to restore the data stored on the other two drives.

The roles of the three drives are not fixed. In one storage operation drives 1 and 2 may hold the file fragments, and in the next operation drives 2 and 3 may do so. The role of each drive changes from one storage operation to the next, but there are always two drives storing file fragments and one drive storing the check information.
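The changing roles of the drives can be visualized with a small layout generator. The rotation rule used here (parity starts on the last disk and moves one disk to the left for each new stripe) is only one possible scheme and is an assumption of this sketch; real controllers use various layouts, and `raid5_layout` is a hypothetical helper name.

```python
def raid5_layout(num_disks: int, num_stripes: int):
    """Return one row per stripe: 'P' marks parity, 'Dn' marks data block n."""
    if num_disks < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    layout = []
    for stripe in range(num_stripes):
        # Assumed rotation: the parity position moves left each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        block = stripe * (num_disks - 1)   # first data block of this stripe
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        layout.append(row)
    return layout


for row in raid5_layout(num_disks=3, num_stripes=3):
    print("  ".join(f"{cell:>3}" for cell in row))
```

Every stripe has exactly one parity block and two data blocks, and over three stripes each disk takes the parity role once, so no single disk becomes the parity bottleneck.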

The check data is normally computed by the RAID controller, which usually needs a dedicated chip to do the calculation and to decide which hard disk the information is sent to. In the three-drive example above, RAID5 thus provides the speed-doubling of RAID0 and the data protection of RAID1 at the same time, and when a drive in the array is damaged, the data can be restored after a new drive is added.

Of the RAID modes described so far, RAID5 has one of the most complex controller designs. RAID5 can be used in most areas, such as multiuser and multitasking environments. Many of today's Web servers and other Internet servers use this form of disk array; the recently launched Quantum Snap Server, for example, uses an external RAID5 disk array. In a three-disk array, parity occupies about 33% of the disk space, so a RAID5 array with a total capacity of 120GB has about 80GB of usable space. However, the RAID controllers integrated on ordinary motherboards generally do not support this mode: the Abit KR7A-RAID motherboard, for example, supports only RAID0, RAID1, and RAID0+1. Any use of parity affects write performance to some extent, so many disk array vendors add a write cache to their arrays to improve write performance.

RAID5 is not without drawbacks. If the information on one of the disks in the array changes, the file fragments have to be recalculated and the check data has to be recomputed, which means all three hard drives have to be accessed again. As with the other modes, a RAID5 array is best built from hard disks of the same capacity and speed. The effective capacity of RAID5 is the capacity of the smallest disk in the array multiplied by the number of disks minus one; one is subtracted because the equivalent of one hard drive is used for check information.
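The capacity rule translates directly into a formula: usable capacity = smallest disk × (number of disks − 1). The helper below is a hypothetical illustration of that arithmetic, using the 120GB example from the text (three 40GB disks) as a check.

```python
def raid5_usable_gb(capacities_gb):
    """Usable RAID5 capacity: smallest disk times (number of disks - 1)."""
    n = len(capacities_gb)
    if n < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    return min(capacities_gb) * (n - 1)


# Three 40GB disks: 120GB raw, 80GB usable, one third goes to parity.
assert raid5_usable_gb([40, 40, 40]) == 80

# Mixed sizes: every disk is treated as if it were the smallest one.
assert raid5_usable_gb([20, 30, 30]) == 40
```

The parity share shrinks as disks are added: one third with three disks, one quarter with four, and so on.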

RAID5 random read performance: very good (when using large blocks of data)
RAID5 random write performance: fair, but better than RAID3 or RAID4
RAID5 sequential read performance: good (when using small blocks of data)
RAID5 sequential write performance: fair

Advantages of RAID5: no dedicated parity disk is needed, reads are fast, and the slow writes of the dedicated-parity modes are partly relieved.

Disadvantages of RAID5: write performance is still unsatisfactory.

RAID6

RAID6 is a newer member of the RAID family and is an extension of RAID5. As in RAID5, data and check codes are divided into blocks and stored across every hard disk in the array. RAID6 adds a separate check disk that backs up all of the check codes distributed across the other disks, so a RAID6 array can survive more than one disk failing at the same time, which matters for applications with very high data security requirements. Building a RAID6 array therefore requires a minimum of 4 hard drives. RAID6 does not improve on the write performance of RAID5, however; a write cache can compensate for this shortcoming to some extent but cannot fundamentally solve it. Because both RAID5 and RAID6 allow the block size to be changed to suit the application, their actual performance also depends on this setting.

In practice RAID6 is not as widely used as the other RAID modes. It typically requires a more complex and more expensive RAID controller, so it is generally not integrated on motherboards.

RAID6 random read performance: good (when using large blocks of data)
RAID6 random write performance: poor, because check data has to be written not only on each hard drive but also on the dedicated check drive
RAID6 sequential read performance: good (when using small blocks of data)
RAID6 sequential write performance: fair

Advantages of RAID6: fast read performance and higher fault tolerance.

Disadvantages of RAID6: slow writes, and the RAID controller is more complex and more expensive to design.

Hot swap and hot spare

RAID systems generally offer hot swap and hot spare capabilities. Hot swap lets you replace a failed hard drive without shutting down the system or cutting the power; the new drive is recognized dynamically by the system, then configured and added to the array without restarting the computer. The benefit is obvious: maintenance is very simple, and for many applications, such as Web servers, where downtime would cause immeasurable losses, the server can stay up. Many HP and Dell server products and RAID disk arrays support hot swap.

A hot spare is generally used in situations where hot swap is not possible. An additional hard drive is configured in the computer before any failure occurs; when a drive fails, the spare automatically takes over the role of the failed disk. In such a system the damaged drive cannot be pulled out until the system has been shut down. A hot spare is less convenient than hot swap, but it is better than nothing.

Summary

There are in fact many kinds of disk arrays. This article has introduced only the basic modes, which can be combined in practice to achieve sufficient performance and stability. RAID controllers and disk array systems are, of course, relatively expensive.

The RAID systems used in servers are generally SCSI based, which makes them even more expensive. For personal use this capability is still some way off: even if your motherboard has an integrated RAID controller, you need at least 2 hard drives (typically of the same capacity, brand, and speed), which is a large expense for an individual user. If you have special needs, such as setting up a workstation or Web server without spending too much money, IDE RAID is a good choice. Keep in mind that typical onboard IDE RAID has high processor usage, and that IDE RAID falls short of SCSI drives in some applications.

==========================================

3.1.2 RAID Summary

Table 3-1 compares RAID 0, RAID 1, RAID 5, and RAID 10.

Table 3-1 RAID 0, RAID 1, RAID 5, and RAID 10 analysis comparison

Type | Read and write performance | Security | Disk utilization | Cost | Applications
RAID 0 | Best (increased by parallelism) | Worst (no safety guarantee at all) | Highest (100%) | Lowest | Individual users
RAID 1 | Read: no different from a single disk; write: data must be written to both disks | Highest (provides a 100% backup of the data) | Poor (50%) | Highest | Storing important data, for example on servers and in database storage
RAID 5 | Read: RAID 5 = RAID 0 (similar read speed); write: RAID 5 < a single disk (an extra parity write is needed) | RAID 5 < RAID 1 | RAID 5 > RAID 1 | RAID 5 < RAID 1 | A storage solution that balances storage performance, data security, and storage cost
RAID 10 | Read: RAID 10 = RAID 0; write: RAID 10 = RAID 1 | RAID 10 = RAID 1 | RAID 10 = RAID 1 (50%) | RAID 10 = RAID 1 | Combines the advantages of RAID 0 and RAID 1, but because it uses mirroring rather than RAID 5-style parity information, disk utilization is only 50%
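The disk utilization figures for the four levels compared here can be reproduced with a few lines of arithmetic. The function name `disk_utilization` is hypothetical, the disks are assumed to be identical, and the RAID 5 value assumes the equivalent of one disk is used for parity.

```python
def disk_utilization(level: str, num_disks: int) -> float:
    """Fraction of raw capacity that is usable, assuming identical disks."""
    if level == "RAID 0":
        return 1.0                          # no redundancy at all
    if level in ("RAID 1", "RAID 10"):
        return 0.5                          # every block is stored twice
    if level == "RAID 5":
        if num_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (num_disks - 1) / num_disks  # one disk's worth of parity
    raise ValueError(f"unsupported level: {level}")


for level, n in [("RAID 0", 2), ("RAID 1", 2), ("RAID 5", 4), ("RAID 10", 4)]:
    print(f"{level:8s} with {n} disks: {disk_utilization(level, n):.0%}")
```

This is why RAID 5 utilization is listed as higher than RAID 1: with four disks it reaches 75%, while mirroring stays at 50% regardless of the number of disks.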

Reference: "Deep mining: Oracle RAC Database Architecture analysis and Combat Strategy", Chapter 3, "Storage and network transmission"

