In Linux, how does one set up RAID 10 for high-performance, fault-tolerant disk input/output? (1)
RAID 10 (also known as RAID 1+0, or a "stripe of mirrors") combines the features of RAID 0 and RAID 1 to provide high-performance, fault-tolerant disk input/output. In RAID 0, read/write operations are performed across multiple drives concurrently; in RAID 1, identical data is written to two or more drives.
In this tutorial, I will show how to build a software RAID 10 array using five identical 8 GiB disks. Although the minimum number of disks for a RAID 10 array is four (for example, a striped set of two mirrors), we will add an extra spare drive in case one of the primary drives fails. We will also introduce some tools that you can later use to analyze the performance of your RAID array.
Note: an exhaustive review of the advantages and disadvantages of RAID 10 versus other partitioning schemes (with drives and file systems of different sizes) is beyond the scope of this article.
How does a RAID 10 array work?
If you need to implement a storage solution that supports I/O-intensive operations (such as database, email, and web servers), RAID 10 is the way to go. The illustration below shows how it works. Take a look.
[Figure: a RAID 10 array built from two RAID 1 mirror sets, Mirror 1 and Mirror 2, striped together with RAID 0]
Suppose a file is composed of data blocks A, B, C, D, and E. Each RAID 1 mirror set (such as Mirror 1 or Mirror 2) replicates data blocks onto each of its two devices. Because of this configuration, write performance is reduced, since every data block is written twice, once to each disk in the mirror, while read performance is unchanged compared to reading from a single disk. The upside is redundancy: disk I/O can continue unless more than one disk in the same mirror fails.
The RAID 0 stripe works by dividing data into blocks and writing block A to Mirror 1, block B to Mirror 2, and so on, which improves overall read and write performance. On the other hand, neither mirror by itself contains the complete information for the data committed to the whole set. This means that if one of the mirrors fails, the entire RAID 0 component (and therefore the RAID 10 set) becomes inoperable and data is lost.
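To make this concrete, here is a simplified sketch (my own illustration, assuming the default near layout and the five blocks from the example) of where the data lands on a four-disk RAID 10:

             Mirror 1           Mirror 2
           disk1  disk2       disk3  disk4
stripe 1:    A      A           B      B
stripe 2:    C      C           D      D
stripe 3:    E      E

Each block appears twice (once per disk within its mirror), while consecutive blocks alternate between the two mirrors.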
Build a RAID 10 Array
There are two possible schemes for building a RAID 10 array: complex (built in one step) or nested (built by first creating two or more RAID 1 arrays and then using them as component devices in a RAID 0). In this tutorial, we will build a complex RAID 10 array, because it allows an odd or even number of disks and can be managed as a single RAID device, whereas the nested scheme only allows an even number of drives and must be managed as nested devices, with RAID 1 and RAID 0 handled separately.
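For reference, a nested 1+0 setup would be created along the following lines (the device names here are illustrative, and this is not the approach used in the rest of this tutorial):

# mdadm --create --verbose /dev/md1 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
# mdadm --create --verbose /dev/md2 --level=1 --raid-devices=2 /dev/sdd1 /dev/sde1
# mdadm --create --verbose /dev/md0 --level=0 --raid-devices=2 /dev/md1 /dev/md2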
Assume that you have installed mdadm and that its daemon is running on your system. For details, see http://xmodulo.com/create-software-raid1-array-mdadm-linux.html. Also assume that a primary partition sd[bcdef]1 has been created on each disk. Therefore, the output of ls -l /dev | grep sd[bcdef] should be as follows:
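A representative listing of the partitions looks like this (major/minor numbers and timestamps will differ on your system, and the whole-disk entries sdb through sdf will also match the grep pattern):

brw-rw---- 1 root disk 8, 17 Apr  8 11:46 sdb1
brw-rw---- 1 root disk 8, 33 Apr  8 11:46 sdc1
brw-rw---- 1 root disk 8, 49 Apr  8 11:46 sdd1
brw-rw---- 1 root disk 8, 65 Apr  8 11:46 sde1
brw-rw---- 1 root disk 8, 81 Apr  8 11:46 sdf1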
Run the following command to build a RAID 10 array:
# mdadm --create --verbose /dev/md0 --level=10 --raid-devices=4 /dev/sd[bcde]1 --spare-devices=1 /dev/sdf1
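While the array is being assembled, you can watch the build progress with:

# cat /proc/mdstat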
It may take a few minutes for the array to be built. The output of # mdadm --detail /dev/md0 should look like this:
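A representative example for this five-disk setup looks roughly like the following (UUIDs, timestamps, and exact sizes will differ on your system):

/dev/md0:
        Version : 1.2
     Raid Level : raid10
     Array Size : 16765952 (15.99 GiB 17.17 GB)
  Used Dev Size : 8382976 (7.99 GiB 8.58 GB)
   Raid Devices : 4
  Total Devices : 5
          State : clean
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1
         Layout : near=2
     Chunk Size : 512K

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
       3       8       65        3      active sync   /dev/sde1
       4       8       81        -      spare         /dev/sdf1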
Before proceeding to the next step, there are a few fields in this output worth explaining.
1. Used Dev Space: the amount of each member device's capacity used by the array.
2. Array Size: the total size of the array. For a RAID 10 array, this equals (N*C)/M, where N is the number of active devices, C is the capacity of each active device, and M is the number of devices in each mirror. So here, (N*C)/M = (4*8 GiB)/2 = 16 GiB.
3. Layout: the fine details of the data layout. The possible layout values are as follows.
• n (the default option): near copies. Multiple copies of one data block are at similar offsets on different devices. This layout yields read and write performance similar to that of a RAID 0 array.
• o: offset copies. Rather than duplicating the chunks within a stripe, entire stripes are duplicated, but rotated by one device so that the duplicated blocks end up on different devices. Thus, subsequent copies of a block are on the next drive, one chunk further down. To use this layout for your RAID 10 array, add --layout=o2 to the command used to build the array.
• f: far copies (multiple copies with very different offsets). This layout provides good read performance but poor write performance, so it is best suited for systems with far more reads than writes. To use this layout for your RAID 10 array, add --layout=f2 to the command used to build the array.
The number that follows n, f, or o in the --layout option indicates the number of copies of each data block. The default value is 2, but it can range from 2 up to the number of devices in the array. Providing enough copies minimizes the I/O impact on any individual drive.
4. Chunk Size: according to the Linux RAID wiki, the chunk size is the smallest unit of data that can be written to the devices. The optimal chunk size depends on the rate of I/O operations and the sizes of the files involved. For large writes, lower overhead is expected with fairly large chunks, while arrays that mostly store small files benefit more from smaller chunks. To specify a chunk size for your RAID 10 array, add --chunk=desired_chunk_size to the command used to build the array; an example combining the layout and chunk options is shown below.
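Putting these options together, a far-layout array with, say, a 256 KiB chunk could be created as follows (the layout and chunk values are illustrative; choose ones that match your workload):

# mdadm --create --verbose /dev/md0 --level=10 --raid-devices=4 /dev/sd[bcde]1 --spare-devices=1 /dev/sdf1 --layout=f2 --chunk=256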
Unfortunately, there is no one-size-fits-all recipe for improving performance, but here are a few guidelines worth considering.
• File system: XFS is often said to perform best, while EXT4 remains a good choice.
• Optimal layout: The far layout improves read performance but reduces write performance.
• Number of copies: more copies spread reads across devices and minimize the I/O impact on any single drive, but they add cost as more disks are needed.
• Hardware: SSDs are more likely to show increased performance (in the same environment) than traditional spinning disks.
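As a quick sanity check of read throughput once the array is built, commands like the following can be used (these are rough, illustrative checks; a dedicated tool such as fio gives more rigorous measurements):

# hdparm -t /dev/md0
# dd if=/dev/md0 of=/dev/null bs=1M count=1024 iflag=direct

The dd invocation reads 1 GiB from the raw array device, bypassing the page cache with iflag=direct, and reports the achieved transfer rate.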