1.1 RAID was proposed by Professor D. A. Patterson of the University of California at Berkeley in 1988. RAID is the abbreviation of Redundant Array of Inexpensive Disks, also called a "disk array". Later, the letter I in RAID was changed to stand for Independent, giving "Redundant Array of Independent Disks". This is only a change of name; the substance is the same. RAID can be understood as a way of using disk drives: it joins a group of disk drives logically and presents them as a single logical disk drive. Generally, the capacity of the logical disk drive is smaller than the combined capacity of its member drives. RAID can be implemented in hardware or in software; the Windows NT operating system, for example, provides software RAID.
RAID is generally implemented on SCSI disk drives, because the performance of IDE disk drives is limited by the IDE interface. An IDE channel can connect only two disk drives (at most four across the usual two channels), with a transfer rate of up to 1.5 Mbps, and only one drive on a channel can transfer data at a time. In addition, an IDE channel usually also carries an optical drive, and the delay it causes seriously affects system speed. A SCSI adapter keeps each SCSI channel clear at all times, and each SCSI disk drive can transfer data to the host freely, without competing for the channel as IDE drives do.
1.2 RAID advantages
1.2.1 Low cost, low power consumption, and a high transfer rate. In RAID, many disk drives can transfer data at the same time while logically forming one disk drive, so a RAID array can reach several times, dozens of times, or even hundreds of times the speed of a single disk drive. This is the problem RAID was originally meant to solve. At that time CPU speed was growing rapidly while the data transfer rate of disk drives could not be improved significantly, so a way was needed to close the gap between the two. RAID did so successfully.
1.2.2 Fault tolerance. This is the second reason for using RAID: apart from the CRC (cyclic redundancy check) codes written on the disk, an ordinary disk drive provides no fault tolerance. RAID's fault tolerance is built on top of the hardware fault tolerance of each disk drive, so it provides a higher level of safety.
1.2.3 Data checking (parity). Parity is additional information stored by RAID levels 2, 3, 4, and 5. When a disk fails, the parity function combines the data on the intact disks to reconstruct the data of the failed disk. For a RAID system, data integrity under any adverse condition is the most basic requirement. Data integrity here means the ability of the array to keep data intact when a disk fails. Data corruption usually has disastrous consequences, so the basic criterion for choosing a RAID array is the level of data integrity it can provide.
1.2.4 Compared with traditional large-diameter disk drives, RAID costs much less for the same capacity.
RAID classification
Level and explanation
RAID level 0 (stripe): disk array without redundancy or parity
Data is spread across all the disk drives at once. RAID 0 has no fault tolerance and the fastest read/write speed of any RAID level. However, if any one disk drive is damaged, the entire array becomes invalid, so its safety is lower than that of a single disk drive. It is generally used where data safety requirements are low but speed requirements are high.
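The striping described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: one block per stripe unit and round-robin placement. The function name locate_block is hypothetical and not taken from any RAID implementation.

```python
def locate_block(logical_block, num_disks):
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes round-robin striping with one block per stripe unit.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    return logical_block % num_disks, logical_block // num_disks


# Consecutive logical blocks land on different disks, so they can be
# read or written in parallel.
for block in range(6):
    disk, offset = locate_block(block, num_disks=3)
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Because every disk holds part of every large file, losing one disk leaves holes throughout the data, which is why a single failure invalidates the whole array.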
RAID level 1 (mirror): mirrored disk array
Each disk drive has a mirror drive whose contents match those of the original drive at all times. RAID 1 offers the highest data safety, but only half of the disk space is available for storing data. It is mainly used where data safety requirements are high and damaged data must be restored quickly.
RAID 1 + 0
What if the data is striped across two hard disks as in RAID 0, and each of those two disks is then mirrored? This is the RAID 1 + 0 scheme. RAID 1 + 0 needs at least four hard disks. In theory it offers the performance of RAID 0 together with the safety of RAID 1, at the cost of twice as many hard disks as RAID 0 or RAID 1 alone. Note that this holds only in theory: in practice, software RAID systems such as IDE RAID consume CPU time, and RAID 1 + 0 needs roughly twice as much as RAID 0 or RAID 1. The performance may therefore not scale to that of RAID 0, and overall performance may even fall rather than rise.
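As a rough sketch of the RAID 1 + 0 idea, the following Python maps a logical block to the mirror pair that stores it. The pairing of neighbouring disks (0 with 1, 2 with 3, and so on) and the name raid10_locate are assumptions made for illustration; real controllers may arrange the pairs differently.

```python
def raid10_locate(logical_block, num_disks):
    """Return ((primary disk, mirror disk), block offset) for a logical block.

    Assumes disks are grouped into mirror pairs (0,1), (2,3), ... and that
    blocks are striped round-robin across the pairs.
    """
    if num_disks < 4 or num_disks % 2 != 0:
        raise ValueError("RAID 1+0 needs an even number of disks, at least 4")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    num_pairs = num_disks // 2
    pair = logical_block % num_pairs      # RAID 0 step: pick a stripe member
    offset = logical_block // num_pairs
    return (2 * pair, 2 * pair + 1), offset  # RAID 1 step: write to both disks


for block in range(4):
    disks, offset = raid10_locate(block, num_disks=4)
    print(f"logical block {block} -> disks {disks}, offset {offset}")
```

Each write goes to two disks, which is where the doubled disk count (and, for software RAID, the extra CPU work) comes from.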
RAID level 2: disk array with Hamming-code error correction
The first, second, fourth, ..., 2^n-th disk drives in the group are dedicated check and error-correction disks. For example, in a RAID 2 array of seven disk drives, the first, second, and fourth drives are error-correction disks and the rest store data. The more disk drives are used, the smaller the percentage taken up by check disks. RAID 2 performs well for input and output of large amounts of data, but poorly for input and output of small amounts. RAID 2 is rarely used.
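The position rule for the check disks can be illustrated with a short Python sketch. It only computes which positions (numbered from 1) are powers of two and what share of the array they take; it does not implement Hamming encoding itself. The function names are hypothetical.

```python
def hamming_check_positions(total_disks):
    """Return the 1-based positions of the check disks: 1, 2, 4, 8, ..."""
    if total_disks < 1:
        raise ValueError("total_disks must be at least 1")
    positions = []
    position = 1
    while position <= total_disks:
        positions.append(position)
        position *= 2
    return positions


def check_fraction(total_disks):
    """Fraction of the array occupied by check disks."""
    return len(hamming_check_positions(total_disks)) / total_disks


for n in (7, 15, 31):
    checks = hamming_check_positions(n)
    print(f"{n} disks: check disks at {checks}, share {check_fraction(n):.0%}")
```

For array sizes of the form 2^k - 1, as in the loop above, the share of check disks falls steadily as the array grows, which matches the statement in the text.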
RAID levels 3 and 4: disk arrays with parity checking
No matter how many data disks there are, a single check disk is used, and errors are detected with parity. The failure of any single disk drive can be recovered. RAID 3 and RAID 4 read data quickly, but every write requires the parity value to be calculated and written to the check disk. RAID 3 and RAID 4 are rarely used.
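Parity-based recovery can be shown with a minimal Python sketch. It assumes the parity block is the byte-wise XOR of the data blocks in a stripe, which is the usual scheme for a single check disk; the helper name xor_blocks and the sample data are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks and one check disk holding their parity.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Suppose data disk 1 fails: XOR the surviving data with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

The same calculation explains the write cost: changing any data block means the parity block has to be recomputed and rewritten as well.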
RAID level 5: parity disk array without a dedicated check disk
Parity is again used to detect errors, but there is no separate check disk; the parity information is distributed across all the disk drives. RAID 5 performs well when reading and writing large volumes of data and is widely used.
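How parity can be spread over all the drives is sketched below in Python. The rotation order used here (parity starts on the last disk and moves one disk to the left with each stripe) is an assumption chosen for illustration; actual RAID 5 implementations use several different rotation layouts.

```python
def raid5_stripe_layout(stripe, num_disks):
    """Return a list with "P" for the parity disk and "D" for data disks.

    Assumes parity sits on the last disk for stripe 0 and moves one disk
    to the left for each following stripe, wrapping around.
    """
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    if stripe < 0:
        raise ValueError("stripe must be non-negative")
    parity_disk = (num_disks - 1) - (stripe % num_disks)
    return ["P" if disk == parity_disk else "D" for disk in range(num_disks)]


for stripe in range(4):
    print(f"stripe {stripe}: {' '.join(raid5_stripe_layout(stripe, 4))}")
```

Since no single disk holds all the parity, parity writes are shared among the drives instead of queuing on one check disk as in RAID 3 and RAID 4.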
With RAID 1 through RAID 5, a damaged disk can be pulled out and a replacement inserted at any time (this requires hot-swap support in the hardware). No data is lost, and the contents of the failed disk are rebuilt quickly; the reconstruction is carried out by the RAID hardware or RAID software. RAID 0, however, provides no error checking, so some people say it should not be regarded as RAID at all. In fact, this is why RAID 0 is called level 0: the 0 itself means "none".
1.3 RAID applications
The speed bottleneck of current PC systems is the hard disk. Although faster standards such as Ultra DMA/33, Ultra DMA/66, and Ultra DMA/100 keep appearing, the improvement has been limited. On a PC, a somewhat slow disk is tolerable; on a server it is not. A server must respond to service requests from all directions, and most of these requests involve data on disk, so the server's disk subsystem must have a high input/output rate. For data safety, some fault tolerance is also required. RAID provides both, so it is widely used in server systems.
1.4 The fault tolerance provided by RAID is carried out automatically (by the RAID hardware or RAID software).
It is transparent to applications; that is, applications do not have to do anything for fault tolerance. For the highest safety and the fastest recovery, use RAID 1 (mirroring); for a compromise among capacity, fault tolerance, and performance, use RAID 5. On most database servers, the operating system and the database management system are placed on RAID 1 drives, and the database data files are stored on RAID 5 drives.
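The capacity side of this trade-off can be made concrete with a small Python sketch. It assumes identical disks and the standard capacity rules: RAID 0 uses every disk for data, RAID 1 and RAID 1 + 0 use half, and RAID 3, 4, and 5 give up the equivalent of one disk to parity. The function name usable_capacity and the level label "10" for RAID 1 + 0 are choices made for this example.

```python
def usable_capacity(level, num_disks, disk_size):
    """Usable capacity of an array of identical disks, in units of disk_size."""
    if level == "0":
        if num_disks < 2:
            raise ValueError("RAID 0 needs at least 2 disks")
        return num_disks * disk_size
    if level in ("1", "10"):
        minimum = 2 if level == "1" else 4
        if num_disks < minimum or num_disks % 2 != 0:
            raise ValueError("mirroring needs an even number of disks")
        return (num_disks // 2) * disk_size
    if level in ("3", "4", "5"):
        if num_disks < 3:
            raise ValueError("parity levels need at least 3 disks")
        return (num_disks - 1) * disk_size
    raise ValueError(f"unsupported RAID level: {level}")


# Four hypothetical 100 GB disks under different levels.
for level in ("0", "10", "5"):
    print(f"RAID {level}: {usable_capacity(level, 4, 100)} GB usable")
```

With four disks, RAID 5 keeps three quarters of the raw space while still surviving one disk failure, which is why it serves as the compromise for large data files.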
1.5 Looking at the configuration sheets of some brand-name servers, we sometimes find that the CPU is not very fast, the memory is not very large, and the graphics card is far from the best, yet the price is decidedly high. Are servers simply a high-margin product? Of course not. A server is configured differently from an ordinary home PC. Besides higher stability, its major features are redundancy and fault tolerance, such as dual power supplies, battery-backed cache, hot-swappable hard disks, and hot-swappable PCI slots.
Another feature is huge disk throughput, which is mainly due to RAID. For example, if a Pentium 166 with SCSI RAID and a Pentium III Coppermine 800 with a single IDE hard disk are both used as file servers, the Pentium 166 can deliver dozens or even hundreds of times the transaction processing capability of the Pentium III. The Pentium III's computing power hardly comes into play here; what counts is the RAID on the Pentium 166.
1.6 RAID is now used mainly on servers, but like any high-end technology it is also making its way to PCs. Perhaps the day when every PC uses RAID on SCSI disk drives will be the real "day of the PC".