How to set up RAID 10 on Linux for high performance


A RAID 10 array (also called RAID 1+0, or a mirrored stripe) combines the features of RAID 0 and RAID 1 to provide high-performance, fault-tolerant disk input/output. In RAID 0, read/write operations are striped across multiple drives, while in RAID 1 identical data is written to two or more drives.

In this tutorial, I'll describe how to build a software RAID 10 array using five identical 8 GiB disks. Although the minimum number of disks required to build a RAID 10 array is four (for example, two mirrored pairs), we will add an extra spare drive in case one of the main drives fails. We'll also introduce tools that you can use later to analyze the performance of the RAID array.

Note that a thorough discussion of all the pros and cons of RAID 10 compared to other RAID schemes (as well as of drives and file systems of different sizes) is beyond the scope of this article.

How does a RAID 10 array work?

RAID 10 is the right choice if you need a storage solution that supports I/O-intensive operations, such as database, e-mail, and web servers. Here is how it works; you may want to refer to the diagram below.

[Diagram: a RAID 0 stripe built from Mirror 1 and Mirror 2]

Imagine that a file is made up of data blocks A, B, C, D, and E, as in the diagram above. Each RAID 1 mirror set (such as Mirror 1 or Mirror 2) replicates data blocks onto each of its two devices. Because of this configuration, write performance is reduced, since every block must be written twice, once to each disk, while read performance is unchanged compared with reading from a single disk. The upside is that this setup provides redundancy: normal disk I/O can be maintained unless more than one disk in the same mirror fails.

The RAID 0 stripe works by dividing the data into chunks and writing block A to Mirror 1, block B to Mirror 2, and so on, which improves overall read and write performance. On the other hand, neither mirror contains the complete information for any piece of data committed to the whole set. This means that if one of the mirrors fails, the entire RAID 0 component (and therefore the RAID 10 set) becomes inoperable and the data is lost irrecoverably.

Building a RAID 10 array

There are two possible ways to set up a RAID 10 array: complex (built in one step) or nested (built by creating two or more RAID 1 arrays and then using them as component devices in a RAID 0). In this tutorial, we'll build a complex RAID 10 array, because it allows us to use either an even or odd number of disks and can be managed as a single RAID device, whereas the nested setup only permits an even number of drives and must be managed as nested devices, with RAID 1 and RAID 0 handled separately.

This assumes that mdadm is installed and its daemon is running on your system. To learn more, see this tutorial: http://xmodulo.com/create-software-raid1-array-mdadm-linux.html. It is also assumed that a primary partition sd[bcdef]1 has been created on each disk, so the output of ls -l /dev | grep sd[bcdef] should list sdb1 through sdf1.
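If these partitions do not exist yet, they can be created with a tool such as parted. The lines below are a minimal sketch rather than part of the original article: they assume /dev/sdb is empty and may be wiped, and the same three commands would be repeated for /dev/sdc through /dev/sdf:

# parted -s /dev/sdb mklabel msdos
# parted -s /dev/sdb mkpart primary 0% 100%
# parted -s /dev/sdb set 1 raid on

After repeating this for the remaining disks, ls -l /dev | grep sd[bcdef] should show the five partitions.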

Next, use the following command to build a RAID 10 array:

# mdadm --create --verbose /dev/md0 --level=10 --raid-devices=4 /dev/sd[bcde]1 --spare-devices=1 /dev/sdf1

After the array has been built (the build process may take a few minutes), inspect it with mdadm --detail /dev/md0.
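The initial build/resync progress can be followed from /proc/mdstat; these are standard mdadm and kernel interfaces rather than anything specific to this setup:

# cat /proc/mdstat
# mdadm --detail /dev/md0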

There are a few fields in that output that need to be explained before we proceed to the next step.

1. Used Dev Size indicates the capacity of each member device used by the array.

2. Array Size refers to the total size of the array. For a RAID 10 array, this equals (N*C)/M, where N is the number of active devices, C is the capacity of each active device, and M is the number of devices in each mirror. Here, (N*C)/M = (4*8 GiB)/2 = 16 GiB.

3. Layout refers to the specific details of the data layout. The possible layout values are shown below.

n (the default): means near copies. Multiple copies of a data block are placed at similar offsets on different devices. This layout provides read and write performance similar to that of a RAID 0 array.

o means offset copies. Rather than duplicating chunks within a stripe, entire stripes are duplicated, but rotated by one device so that the duplicate blocks end up on different devices. Thus, subsequent copies of a block are on the next drive, one chunk further down. To use this layout for your RAID 10 array, add --layout=o2 to the command used to build the array.

f means far copies (multiple copies with very different offsets). This layout provides better read performance but worse write performance, so it is best suited for systems that need to serve far more reads than writes. To use this layout for your RAID 10 array, add --layout=f2 to the command used to build the array (a full example appears after this list).

The number that follows n, f, or o in the --layout option indicates how many copies of each data block are required. The default is 2, but it can range from 2 up to the number of devices in the array. Providing enough copies minimizes the I/O impact of any single drive.

4. Chunk Size: according to the Linux RAID wiki, the chunk size is the smallest unit of data that is written to the devices. The optimal chunk size depends on the rate of I/O operations and the size of the files involved. If you write large files, you can expect lower overhead as long as the chunks are reasonably large, whereas arrays that mainly store small files benefit more from smaller chunks. To specify a chunk size for your RAID 10 array, add --chunk=desired_chunk_size to the command used to build the array.
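To show how the layout and chunk options fit into the original command, here is a hedged example that requests the far layout with two copies and a 256 KB chunk size; the chunk value is only an illustration, not a recommendation from the article:

# mdadm --create --verbose /dev/md0 --level=10 --raid-devices=4 /dev/sd[bcde]1 --spare-devices=1 /dev/sdf1 --layout=f2 --chunk=256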

Unfortunately, there is no one-size-fits-all recipe for improving performance. Here are a few guidelines worth considering.

• File systems: XFS is generally said to be the best file system for this use case, and ext4 remains a good choice (a formatting sketch follows this list).

• Optimized layout: Far layout improves read performance, but reduces write performance.

• Number of copies: more copies minimize the I/O impact of a single drive, but they add cost because more disks are needed.

• Hardware: solid-state drives are more likely to show performance gains than traditional spinning disks under the same conditions.
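As a concrete example of the file system guideline above, the array could be formatted with XFS and mounted; this is a generic sketch, and /mnt/raid10 is only an illustrative mount point:

# mkfs.xfs /dev/md0
# mkdir -p /mnt/raid10
# mount /dev/md0 /mnt/raid10

Note that the raw dd write tests in the next section write directly to /dev/md0 and will overwrite whatever is on the array, so do this only after benchmarking (or benchmark against files on the mounted file system instead).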

Test RAID performance with dd

The following benchmark tests can be used to verify the performance of our RAID 10 array (/dev/md0).

1. Write operation

A single file of 256MB size is written to the device:

# dd if=/dev/zero of=/dev/md0 bs=256M count=1 oflag=dsync

512 bytes are written 1000 times:

# dd if=/dev/zero of=/dev/md0 bs=512 count=1000 oflag=dsync

Because of the dsync flag, dd bypasses the file system cache and performs synchronized writes to the RAID array. This eliminates the caching effect during RAID performance testing.
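To see the caching effect that the flag removes, the same write can be repeated without oflag=dsync; this comparison is an added illustration rather than part of the original test plan:

# dd if=/dev/zero of=/dev/md0 bs=256M count=1

Without synchronized writes, much of the data initially lands in the page cache, so dd typically reports a much higher, less realistic throughput.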

2. Read operation

A 256 KiB block is copied from the array to /dev/null 15000 times (about 3.9 GB):

# dd if=/dev/md0 of=/dev/null bs=256k count=15000

Using IOzone to test RAID performance

IOzone (http://www.iozone.org) is a file system benchmarking tool that lets us measure a wide range of disk I/O operations, including random read/write, sequential read/write, and re-read/re-write. It can export its results to Microsoft Excel or LibreOffice Calc files.

Install IOzone on CentOS/RHEL 7

Enable the RepoForge repository, then run the following command:

# yum install iozone
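Enabling RepoForge typically means installing its release package first; the package name below is a placeholder, since the exact version and URL for your release and architecture should be checked on the RepoForge site:

# rpm -Uvh rpmforge-release-<version>.el7.rf.x86_64.rpm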

Install IOzone on Debian 7

# aptitude install iozone3

The following IOzone command will perform all tests on the RAID 10 array:

# iozone -R -a /dev/md0 -b /tmp/md0.xls

-R: Generates an Excel-compatible report and sends it to standard output.

-a: Runs IOzone in fully automatic mode, covering all tests and the possible record/file sizes (record sizes from 4 KB to 16 MB, file sizes from 64 KB to 512 MB); a narrower, targeted run is sketched after this list.

-b /tmp/md0.xls: Stores the test results in the specified file.
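If the full automatic sweep takes too long, a narrower run can target specific tests and sizes. The following is a sketch using standard IOzone options (-i selects tests: 0 = write/rewrite, 1 = read/re-read, 2 = random read/write; -s and -r set the file and record size; -f names the temporary test file); the path assumes the array is mounted at the illustrative /mnt/raid10, and the sizes are only examples:

# iozone -R -i 0 -i 1 -i 2 -s 512M -r 256k -f /mnt/raid10/iozone.tmp -b /tmp/md0-partial.xls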
