Getting started with Linux: disk array (RAID)
RAID stands for Redundant Array of Inexpensive Disks. RAID technology combines individual disks into a single logical disk in various ways, which improves disk read performance and data safety. Each way of combining the disks is identified by a RAID level.
With the continuous development of RAID technology, we now have six clear and standard RAID levels, from RAID 0 to RAID 5.
Beyond these there are RAID 6, RAID 7, RAID 10 (RAID 1 combined with RAID 0), RAID 01 (RAID 0 combined with RAID 1), RAID 30 (RAID 3 combined with RAID 0), and RAID 50 (RAID 5 combined with RAID 0).
Different RAID levels offer different trade-offs among storage performance, data safety, and storage cost. The following describes RAID levels 0, 1, 5, 6, 10, and 01.
RAID0
RAID0, also known as striping, splits data into chunks of a fixed size and writes them to the disks of the array in sequence. Because reads and writes can proceed in parallel and make full use of the bus bandwidth, a RAID0 array of N disks theoretically has N times the read/write performance of a single disk.
RAID0 also has the highest storage efficiency, since 100% of the disk space holds data. It has one obvious drawback, however: it provides no redundancy, so once data is damaged it cannot be restored.
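The striping idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real RAID driver is written: the function names `stripe` and `unstripe`, the chunk size, and the in-memory lists standing in for disks are all hypothetical.

```python
def stripe(data, n_disks, chunk):
    """Split data into fixed-size chunks and deal them round-robin to the disks."""
    disks = [[] for _ in range(n_disks)]
    for k, start in enumerate(range(0, len(data), chunk)):
        disks[k % n_disks].append(data[start:start + chunk])
    return disks


def unstripe(disks):
    """Read the chunks back in the same round-robin order."""
    out = []
    depth = max(len(d) for d in disks)
    for row in range(depth):
        for d in disks:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJ", n_disks=3, chunk=2)
print(disks)
print(unstripe(disks))
```

Note that no chunk is stored twice: if any one of the lists is lost, `unstripe` can no longer return the original data, which is the drawback described above.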
RAID1
RAID1, known as mirroring, writes identical data to a working disk and a mirror disk. Its disk space usage is therefore 50%. Writes may take slightly longer, but reads are not affected. RAID1 provides the best data protection: if the working disk fails, the system automatically reads from the mirror disk without interrupting the user's work.
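The mirroring behaviour described above can be modelled with a small Python class. It is only an illustration: the class name `Mirror`, the `failed` flags, and the dictionaries standing in for disks are assumptions made for this sketch.

```python
class Mirror:
    """Toy RAID1: every write goes to both copies; reads use any healthy copy."""

    def __init__(self):
        self.copies = [{}, {}]          # block number -> bytes, one dict per disk
        self.failed = [False, False]

    def write(self, block, data):
        # The same data is written to every disk that is still healthy.
        for i, copy in enumerate(self.copies):
            if not self.failed[i]:
                copy[block] = data

    def read(self, block):
        # Serve the read from the first healthy copy.
        for i, copy in enumerate(self.copies):
            if not self.failed[i]:
                return copy[block]
        raise IOError("both copies have failed")


m = Mirror()
m.write(0, b"hello")
m.failed[0] = True      # simulate losing the working disk
print(m.read(0))        # still readable, now served from the mirror copy
```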
RAID5
RAID5 is similar to RAID3, but the parity information is distributed evenly across all disks of the array, so no single parity disk becomes a performance bottleneck during concurrent writes.
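A minimal Python sketch of this layout follows. It assumes XOR parity and one particular rotation rule for the parity position; real implementations offer several rotation layouts, and the names `xor_blocks`, `parity_disk`, `write_stripe`, and `rebuild` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def parity_disk(stripe_no, n_disks):
    """Rotate the parity position so no single disk holds all the parity."""
    return (n_disks - 1 - stripe_no) % n_disks


def write_stripe(stripe_no, data_blocks):
    """Lay out n-1 data blocks plus one parity block across n disks."""
    n = len(data_blocks) + 1
    row = list(data_blocks)
    row.insert(parity_disk(stripe_no, n), xor_blocks(data_blocks))
    return row


def rebuild(row, lost):
    """Recover the block of a failed disk by XOR-ing the surviving blocks."""
    return xor_blocks([b for i, b in enumerate(row) if i != lost])


row = write_stripe(0, [b"AAAA", b"BBBB", b"CCCC"])
assert rebuild(row, lost=1) == row[1]
```

Because XOR is its own inverse, the same `rebuild` call works whether the lost block held data or parity.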
Every disk in the array holds both data and parity information, and each data block is stored on a different disk from its parity. When a disk is damaged, the system can reconstruct the lost data from the other data blocks in the same stripe and the corresponding parity.
RAID 6
RAID 6 provides two levels of redundancy: the array keeps working even when two of its drives fail.
RAID 6 generally has the highest implementation cost, because it must be able to recover not only the data but also the parity itself, which makes RAID 6 controllers more complex and more expensive than those for other RAID levels.
1. RAID 6 parity data
When writing each data block, RAID 6 performs two independent parity calculations, which is why it can tolerate the failure of two disks. Two approaches to this are generally accepted:
- Use multiple algorithms, such as XOR together with some other function.
- Arrange the data differently across stripes or disks.
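The first approach can be illustrated with one widely used pair of functions: a plain XOR parity (P) and a second parity (Q) computed as a weighted sum in the finite field GF(2^8). The article does not say which second function it has in mind, so treat the choice of Q, the reduction polynomial 0x11D, and every function name below as assumptions of this sketch.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using reduction polynomial 0x11D."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return r


def gf_pow(a, n):
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r


def gf_inv(a):
    """a^254 is the multiplicative inverse of a nonzero byte in GF(2^8)."""
    return gf_pow(a, 254)


def pq(blocks):
    """P is plain XOR; Q weights block i by 2^i in GF(2^8) before XOR-ing."""
    p = bytearray(len(blocks[0]))
    q = bytearray(len(blocks[0]))
    for i, blk in enumerate(blocks):
        g = gf_pow(2, i)
        for k, byte in enumerate(blk):
            p[k] ^= byte
            q[k] ^= gf_mul(g, byte)
    return bytes(p), bytes(q)


def recover_two(blocks, x, y, p, q):
    """Rebuild lost data blocks x and y from the survivors plus P and Q."""
    size = len(p)
    # Treat the two lost blocks as zeros and recompute partial parities.
    survivors = [bytes(size) if i in (x, y) else b for i, b in enumerate(blocks)]
    p_part, q_part = pq(survivors)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(size), bytearray(size)
    for k in range(size):
        pxy = p[k] ^ p_part[k]      # equals Dx ^ Dy
        qxy = q[k] ^ q_part[k]      # equals gx*Dx ^ gy*Dy
        dx[k] = gf_mul(inv, qxy ^ gf_mul(gy, pxy))
        dy[k] = pxy ^ dx[k]
    return bytes(dx), bytes(dy)


data = [b"disk0", b"disk1", b"disk2", b"disk3"]
p, q = pq(data)
damaged = [data[0], None, data[2], None]
assert recover_two(damaged, 1, 3, p, q) == (data[1], data[3])
```

With only XOR parity, two missing blocks leave one equation with two unknowns. The second, independent function supplies the second equation, which is why two calculations allow two failures.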
2. One-dimensional redundancy in RAID 6
The first RAID 6 method computes the parity data in two different ways. The simplest way to do this is to keep two parity blocks for each stripe, each computed with a different algorithm.
RAID10
RAID10 is the combination of RAID1 and RAID0, also written RAID 1+0: the disks are mirrored first and then striped. This improves the read/write performance of the system while also providing redundancy. The disk space usage of RAID10 is the same as that of RAID1, namely 50%. RAID10 suits fields that need both massive data storage and strict data safety, such as finance and securities.
RAID01
RAID01 is also a combination of RAID0 and RAID1, but it stripes first and then mirrors the striped set. Unlike RAID10, losing a single disk is equivalent to losing the whole stripe set it belongs to. So once one disk fails, the storage system degrades to a plain RAID0 array (only one stripe set remains). RAID01 has very few practical applications.
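The difference between RAID10 and RAID01 shows up when a second disk fails. The sketch below enumerates every two-disk failure on a four-disk array. The disk numbering, the grouping into pairs, and the function names are assumptions chosen for illustration.

```python
from itertools import combinations


def survives_raid10(failed, mirror_pairs):
    """RAID10 keeps running while every mirror pair has at least one live disk."""
    return all(not set(pair) <= failed for pair in mirror_pairs)


def survives_raid01(failed, stripe_sets):
    """RAID01 keeps running while at least one whole stripe set is untouched."""
    return any(not (set(s) & failed) for s in stripe_sets)


disks = range(4)
groups = [(0, 1), (2, 3)]   # mirror pairs for RAID10, stripe sets for RAID01
two_failures = [set(c) for c in combinations(disks, 2)]

ok10 = sum(survives_raid10(f, groups) for f in two_failures)
ok01 = sum(survives_raid01(f, groups) for f in two_failures)
print(ok10, ok01, len(two_failures))   # prints: 4 2 6
```

Under these assumptions RAID10 survives four of the six possible double failures, while RAID01 survives only two, which is consistent with RAID01 being the less popular arrangement.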
Comparison of Different RAID levels
The most widely used RAID levels are RAID0, RAID1, RAID10, and RAID5; the descriptions above follow Baidu Encyclopedia. The following is my own understanding of disk arrays, which have many advantages:
(1) they increase storage capacity;
(2) multiple disk drives can work in parallel, which raises the data transfer rate;
(3) RAID provides better performance, data integrity, and data availability than ordinary disk storage;
(4) RAID is especially helpful when I/O lags behind CPU performance and becomes the bottleneck, because it effectively narrows that gap.
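To compare the levels side by side, the space figures quoted in the level descriptions can be collected into one small helper. This is a rough sketch: it assumes equal-size disks and two-way mirroring, and the function name `usable_fraction` is hypothetical.

```python
def usable_fraction(level, n):
    """Fraction of raw capacity available for data on n equal-size disks."""
    if level == "0":
        return 1.0                  # no redundancy at all
    if level in ("1", "10", "01"):
        return 0.5                  # assumes two-way mirroring
    if level == "5":
        return (n - 1) / n          # one disk's worth of parity
    if level == "6":
        return (n - 2) / n          # two disks' worth of parity
    raise ValueError(f"unknown RAID level: {level}")


for level, n in [("0", 4), ("1", 2), ("5", 4), ("6", 4), ("10", 4), ("01", 4)]:
    print(f"RAID{level} on {n} disks: {usable_fraction(level, n):.0%} usable")
```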
RAID 1 Implementation
Nothing sticks without practice, so today we are going to create a software RAID 1 disk array.
(1) To show the disks working together, we must first create several of them. Most importantly, they must all be the same size: a mirror stores identical content on every member, so a full copy is only possible between devices of equal size.
Prepare the devices used to build the RAID
Create a device (size in MB) and change its system ID to fd (Linux raid autodetect, which marks it as a disk array member).
Create two such disk array devices in the same way.
After saving and exiting, run the partx command so that the kernel picks up the new partition table.
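The steps after partitioning can be sketched as a Python helper that only builds the command lines and does not run them. Everything specific here is an assumption rather than something taken from this article: the device names /dev/sdb, /dev/sdc, and /dev/md0 are placeholders, the function name is hypothetical, and mdadm is assumed to be the tool used to assemble the array.

```python
import shlex


def raid1_commands(md_device, disks, members):
    """Return the command lines for the remaining steps, without running them."""
    if len(members) != 2:
        raise ValueError("this sketch expects exactly two member partitions")
    # Ask the kernel to register the partitions that were just created.
    cmds = [["partx", "-a", disk] for disk in disks]
    # Create a two-member RAID 1 array from the prepared partitions.
    cmds.append(["mdadm", "--create", md_device, "--level=1",
                 "--raid-devices=2", *members])
    return cmds


# Placeholder device names; check your own system before running anything.
for cmd in raid1_commands("/dev/md0",
                          ["/dev/sdb", "/dev/sdc"],
                          ["/dev/sdb1", "/dev/sdc1"]):
    print(shlex.join(cmd))
```

Printing the commands instead of executing them keeps the sketch safe to run; creating an array destroys existing data on the member partitions, so the real commands should be reviewed and run by hand as root.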