RAID is short for Redundant Array of Independent Disks; the Chinese name translates as "inexpensive redundant disk array" (the original expansion was Redundant Array of Inexpensive Disks).
One. Why RAID appeared (advantages of RAID)
- RAID was originally aimed at servers, but personal computers now often need several hard drives as well. By default only the first hard disk is cached and the others are not, so access to the other disks is particularly slow. Disk array technology emerged to address this: it combines several small, inexpensive disks in place of one large, expensive disk and coordinates access across them.
- Many RAID modes include fairly complete mutual check and recovery mechanisms, or even direct mirror backups, which greatly improves the fault tolerance of the RAID system and the stability and redundancy of the system as a whole.
Two. Disadvantages of RAID
Depending on the RAID level chosen, some modes have low disk utilization and are therefore expensive.
Three. Classification of RAID
1.RAID 0 (also known as Stripe or Striping) data striping technology: when a file is written to the array, it is cut into stripes of equal size, and the stripes are stored in sequence across the different disks.
Figure 1: An I/O request that the system issues to a logical disk (a RAID 0 disk group) made up of three disks is translated into 3 operations, each of which targets one physical disk. With RAID 0 in place, what was a sequential series of data requests is spread across all three drives and executed simultaneously.
In theory, running the three drives in parallel triples the disk read and write speed. The more disks, the faster the transfer, but because of bus bandwidth and other factors the actual gain is always lower than the theoretical value. Even so, parallel transfer of large amounts of data is clearly faster than serial transfer.
The disadvantage of RAID 0 is that it provides no data redundancy. Because the data is striped, as soon as one hard drive fails, some of each file's stripes are lost and the whole file is corrupted. Once any single disk is damaged, the lost data cannot be recovered.
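The striping and the failure mode described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names `stripe` and `unstripe`, the in-memory lists standing in for disks, and the 2-byte stripe size are all made up for the example and do not come from any real RAID implementation.

```python
def stripe(data: bytes, num_disks: int, stripe_size: int) -> list:
    """Cut data into stripe_size chunks and deal them round-robin onto the disks."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), stripe_size):
        chunk_index = offset // stripe_size
        disks[chunk_index % num_disks].append(data[offset:offset + stripe_size])
    return disks


def unstripe(disks: list) -> bytes:
    """Reassemble the file by reading the chunks back in round-robin order."""
    out = []
    depth = max(len(disk) for disk in disks)
    for row in range(depth):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


data = b"ABCDEFGHIJKL"
disks = stripe(data, num_disks=3, stripe_size=2)
# disk 0 holds AB and GH, disk 1 holds CD and IJ, disk 2 holds EF and KL
restored = unstripe(disks)

# Simulate losing disk 1: its chunks are gone, so the file can no longer be rebuilt.
damaged = [disks[0], [], disks[2]]
broken = unstripe(damaged)
print(restored == data, broken == data)
```

With all three disks present the original bytes come back; with disk 1 emptied, the reassembled bytes are missing every chunk that lived on that disk, which is why the whole file counts as corrupted.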
These characteristics make RAID 0 particularly suitable for areas with high performance requirements where data safety is not a major concern, such as workstations or other applications that demand high disk performance. In my view, RAID 0 suits only extreme cases: enterprise application servers rarely use it, and on a home computer a set of same-model, same-size drives costs more yet still does not match an SSD, so RAID 0 is not very practical there either.
Number of disks required: 2 or more
Disk utilization: the highest, 100% of the sum of all disk capacities
Fault tolerance: the lowest, 0 disks
2.RAID 1 (also known as Mirror or Mirroring) mirroring technology
Data sent to RAID 1 is duplicated, and a full copy is written to each disk. Because the same data is written to several disks, a 100MB write means the data is copied on the I/O bus once per disk, so the volume of data actually transferred is larger than the write itself. With large writes, RAID 1 write performance can therefore become very poor (we only have one southbridge, after all). Fortunately, if you use hardware RAID (a disk array card), the card duplicates the data itself without using the system's I/O bus, and performance is acceptable. With a software disk array, performance may suffer.
Because the data on the two hard drives is exactly the same, your data remains intact when either drive is damaged. The biggest advantage of RAID 1 is therefore that it keeps a backup of the data. However, because half of the disk capacity is used for the backup, the usable capacity is only half of the total capacity of all disks. Although RAID 1 write performance is poor, read performance is fine: the two copies of the data sit on different disks, so when multiple processes read the same data, the RAID can balance the reads across the disks on its own.
Figure 2: When reading, the system first reads from the RAID 1 source disk. If that read succeeds, the system ignores the data on the backup disk; if it fails, the system automatically reads the data from the backup disk, so the user's work is not interrupted. Of course, the damaged drive should be replaced promptly and the mirror rebuilt from the backup data, so that a later failure of the backup disk does not cause irrecoverable data loss.
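The write duplication, the read fallback, and the mirror rebuild described above can be sketched as a small toy model. The class `Mirror` and its methods are hypothetical names invented for this example; dictionaries stand in for disks, and the `bytes_transferred` counter is only a rough stand-in for the extra traffic a software mirror puts on the I/O bus.

```python
class Mirror:
    """Toy RAID 1: every write goes to all healthy disks, reads use the first healthy one."""

    def __init__(self, copies=2):
        self.disks = [dict() for _ in range(copies)]
        self.failed = set()
        self.bytes_transferred = 0  # total bytes sent to physical disks

    def write(self, block_no, payload):
        for idx, disk in enumerate(self.disks):
            if idx not in self.failed:
                disk[block_no] = payload
                self.bytes_transferred += len(payload)

    def read(self, block_no):
        # Try the source disk first, then fall back to the next healthy copy.
        for idx, disk in enumerate(self.disks):
            if idx not in self.failed:
                return disk[block_no]
        raise IOError("all mirror members have failed")

    def fail(self, idx):
        self.failed.add(idx)
        self.disks[idx] = {}

    def replace(self, idx):
        """Swap in a blank disk and re-establish the mirror from a healthy member."""
        healthy = [d for i, d in enumerate(self.disks) if i not in self.failed]
        if not healthy:
            raise IOError("no healthy member left to copy from")
        self.disks[idx] = dict(healthy[0])
        self.failed.discard(idx)


m = Mirror()
m.write(0, b"hello")
traffic = m.bytes_transferred      # twice the size of the logical write
m.fail(0)
after_failure = m.read(0)          # served by disk 1
m.replace(0)                       # mirror rebuilt from disk 1
m.fail(1)
after_second_failure = m.read(0)   # served by the rebuilt disk 0
```

The second read only succeeds because the first failed disk was replaced and re-mirrored before the other one died, which is the point of replacing a damaged drive promptly.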
Number of disks required: 2 + 2*n (n≥0) (most controllers only support pure RAID 1 with 2 disks)
Disk utilization: the lowest, 50% of the total disk capacity
Fault tolerance: the highest, up to half of the disks may fail
P.S. I once talked this over with Dell engineers. They said that if 4 disks are configured as pure RAID 1, only one disk's capacity is usable and the other disks are all mirrors, so N-1 disks can fail at the same time. In practice, however, the Dell and HP array cards only let you select 2 disks in RAID 1 mode and do not allow more (╯ '-') ╯︵┻━┻
3.RAID 1+0, RAID 0+1
RAID 0 performs well but the data is not safe; RAID 1 keeps the data safe but is inefficient. Can the two be combined? Yes: that is RAID 0+1 or RAID 1+0. RAID 0+1 means: (1) two disks first form a RAID 0, and there are two such groups; (2) the two RAID 0 groups are then combined into one RAID 1. Conversely, RAID 1+0 builds the RAID 1 groups first and then combines them into a RAID 0.
Number of disks required: 4 + 2*n (n≥0)
Disk utilization: low, 50% of the total disk capacity
Fault tolerance: acceptable, up to half of the disks may fail at the same time, but whether a given set of failures brings down the array depends on whether it is RAID 1+0 or RAID 0+1.
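To make the difference between the two nestings concrete, the sketch below enumerates every possible two-disk failure on a hypothetical 4-disk array and checks which ones each layout survives. The grouping of disks 0,1 and 2,3 and the function names are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical 4-disk layout: disks 0,1 form the first group, disks 2,3 the second.
GROUPS = [(0, 1), (2, 3)]


def raid10_survives(failed):
    """RAID 1+0: each group is a mirror, and the stripe needs
    every mirror to keep at least one working disk."""
    return all(any(d not in failed for d in group) for group in GROUPS)


def raid01_survives(failed):
    """RAID 0+1: each group is a stripe, and the mirror needs
    at least one stripe with every disk still working."""
    return any(all(d not in failed for d in group) for group in GROUPS)


pairs = [set(p) for p in combinations(range(4), 2)]
raid10_ok = [sorted(p) for p in pairs if raid10_survives(p)]
raid01_ok = [sorted(p) for p in pairs if raid01_survives(p)]
print(len(pairs), len(raid10_ok), len(raid01_ok))  # prints: 6 4 2
```

Of the six possible two-disk failures, this model of RAID 1+0 survives four, while RAID 0+1 survives only the two cases where both failed disks belong to the same stripe group.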
4.RAID 5, RAID 6
On each write cycle, a parity block is recorded alongside the data, and the parity block is placed on a different disk each time. As a result, when any one disk is destroyed, its contents can be rebuilt from the data and parity on the other disks. Note, however, that because of the parity, the total capacity of RAID 5 is the total number of disks minus one. For example, 3 disks leave only (3-1) = 2 disks of capacity. And when two or more disks are damaged, all the data in the RAID 5 is lost, because RAID 5 by design tolerates the failure of only one disk.
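The parity idea can be illustrated with plain XOR. The sketch below is a minimal example, assuming one stripe on a 3-disk array with two data blocks and one parity block. The helper names `xor_blocks` and `parity_disk` are hypothetical, and the rotation rule shown is just one possible way to move the parity block between disks, not necessarily what a given controller does.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe_no, num_disks):
    """One possible rotation (an assumption): parity moves one disk to the left per stripe."""
    return (num_disks - 1 - stripe_no) % num_disks


# One stripe on a hypothetical 3-disk RAID 5: two data blocks plus one parity block.
d0 = b"\x01\x02\x03\x04"
d1 = b"\x10\x20\x30\x40"
parity = xor_blocks([d0, d1])

# The disk holding d1 dies: XOR of the surviving blocks gives the lost block back.
rebuilt_d1 = xor_blocks([d0, parity])

# Parity placement for the first four stripes on 3 disks.
placement = [parity_disk(s, 3) for s in range(4)]
```

Because XOR is its own inverse, any single missing block in a stripe equals the XOR of all the others. With two blocks missing, there is one equation and two unknowns, which is why a second failure is fatal.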
As for read and write performance, reads are quite good, comparable to RAID 0. Write performance, however, does not improve much, because every write to RAID 5 also requires the parity to be computed. This extra computation makes write performance depend heavily on the system hardware. In particular, with a software disk array the parity is computed by the CPU rather than by a dedicated disk array card, so performance needs to be evaluated.
In addition, because RAID 5 tolerates the failure of only one disk, a further level, RAID 6, was developed. RAID 6 uses the capacity of two disks to store parity, so the usable capacity is the total minus two disks, but up to two disks may fail. That is, with RAID 6 the data can still be recovered when two disks are damaged at the same time.
Number of disks required: N (N≥3)
Disk utilization: fair, the capacity of N-1 disks
Fault tolerance: fair, one disk in the array may fail; with RAID 6 two disks may fail, but utilization drops to the capacity of N-2 disks
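The utilization figures listed for each level can be collected into one small calculator. This is a sketch that follows the summaries in this article. The function name `usable_disks` and the level labels are made up for the example, and the minimum of 4 disks for RAID 6 is the commonly cited figure rather than something stated above.

```python
def usable_disks(level, n):
    """Return how many disks' worth of capacity is usable for n equal-size disks."""
    if level == "0":
        minimum, usable = 2, n
    elif level == "1":
        minimum, usable = 2, n // 2   # half the capacity, as summarized above
    elif level == "10":
        minimum, usable = 4, n // 2
    elif level == "5":
        minimum, usable = 3, n - 1    # one disk's worth of parity
    elif level == "6":
        minimum, usable = 4, n - 2    # assumption: 4-disk minimum, two disks of parity
    else:
        raise ValueError(f"unknown RAID level: {level}")
    if n < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks")
    if level in ("1", "10") and n % 2:
        raise ValueError(f"RAID {level} needs an even number of disks")
    return usable


def utilization(level, n):
    """Usable capacity as a fraction of raw capacity."""
    return usable_disks(level, n) / n


for level, n in [("0", 3), ("1", 2), ("10", 4), ("5", 3), ("6", 4)]:
    print(f"RAID {level} with {n} disks: {usable_disks(level, n)} usable "
          f"({utilization(level, n):.0%})")
```

Multiplying the result by the size of a single disk gives the usable capacity. For instance, four 1 TB disks in RAID 5 would leave about 3 TB under these formulas.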
Four. Hot spare disk
When a disk goes bad, the server raises an alarm and an administrator can replace the bad disk by hand. Once a good disk is in place, the RAID controller uses XOR on the information from the surviving disks to compute the missing data in the background and writes it to the new disk. Some RAID controllers use NVRAM to hold the array information, so the stripe data belonging to the failed member is reassigned to the new disk in a way that is transparent to both the user and the application. What the user may notice is that server responses are delayed while the stripe data is redistributed in the background.
Of course, if conditions allow, the user can also force the redistribution, ideally during a maintenance window with the system down.
As the basics above show, if one disk in a RAID 5 array goes bad, no data is lost, even if you ignore it (although system performance will of course drop a lot).
But because of the limits of XOR and of the RAID 5 level, if a second disk fails before the first is replaced, the data in the entire array is ruined and the system becomes unavailable.
This matters especially in unattended server rooms. One dead disk in a RAID 5 is not a problem, but with bad luck a second disk dies before the administrator has replaced the first, and then everything is lost.
A hot spare disk is a disk that does not take part in the disk array but stays powered on and online. As soon as a disk in the RAID has a problem, the hot spare is automatically swapped into the array. You can think of it as an "automatic replacement".
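The difference a hot spare makes can be sketched as a small state model. The class `Raid5WithSpare`, its state names, and the disk labels are all hypothetical. The model tracks only disk states, not real data, and assumes the rebuild finishes only when `finish_rebuild` is called.

```python
class Raid5WithSpare:
    """Toy state model of a RAID 5 set with optional hot spares."""

    def __init__(self, members, spares=()):
        self.members = list(members)  # disks taking part in the array
        self.spares = list(spares)    # powered-on disks waiting outside the array
        self.state = "healthy"

    def fail(self, disk):
        """Report that a member disk has died."""
        if self.state == "failed":
            return
        if self.state != "healthy":
            # A second disk died before the first one was fully rebuilt.
            self.state = "failed"
            return
        slot = self.members.index(disk)
        if self.spares:
            # The spare takes the dead disk's place automatically.
            self.members[slot] = self.spares.pop(0)
            self.state = "rebuilding"
        else:
            # No spare: the array runs degraded until someone swaps the disk.
            self.members[slot] = None
            self.state = "degraded"

    def finish_rebuild(self):
        if self.state == "rebuilding":
            self.state = "healthy"


with_spare = Raid5WithSpare(["d0", "d1", "d2"], spares=["s0"])
with_spare.fail("d1")
state_after_first = with_spare.state    # "rebuilding"
with_spare.finish_rebuild()
with_spare.fail("d2")
state_after_second = with_spare.state   # "degraded": the only spare is used up

unattended = Raid5WithSpare(["d0", "d1", "d2"])
unattended.fail("d0")
unattended.fail("d1")
state_unattended = unattended.state     # "failed"
```

In this model the array with a spare survives two failures that are far enough apart for the rebuild to finish, while the unattended array without a spare is lost on the second failure. A spare shortens the window of exposure but does not remove it: a second failure during the rebuild is still fatal.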