Brief Introduction to RAID Disk Arrays

Last Update:2015-08-30 Source: Internet

Author: User


RAID is the abbreviation of "Redundant Array of Independent Disks". It is essentially a technology for storing, accessing, and backing up data on disks. Before discussing RAID, let's review some basics of computer storage.

1. Storage Basics

Memory, as its name implies, is used to store data. There are many types of memory on the market, but they fall roughly into two categories: volatile and non-volatile. Volatile memory loses its data when power is lost, while non-volatile memory retains it; which behavior you get is determined by the storage medium. In general, volatile memory is significantly faster than non-volatile memory, and also more expensive. First, let's look at the memory hierarchy of a computer system (a memory hierarchy diagram appears in virtually every book on computer systems). The faster the storage (the higher in the pyramid), the higher the cost per byte; the slower the storage (the lower in the pyramid), the lower the cost per byte.

The fastest storage is the registers in the CPU. Because they are expensive, their total capacity is tiny, on the order of tens to hundreds of bytes.

The L1, L2, and L3 layers are high-speed caches. They are implemented with SRAM (static random-access memory), and the CPU needs only a few ns to access them. Because SRAM is relatively expensive, caches offer only a few MB to dozens of MB of storage space.

Primary storage (main memory) is located on the L4 layer. Although it is not as fast as cache, it is much cheaper. It is implemented with DRAM (dynamic random-access memory). Modern computers usually have several GB of primary memory, and a CPU access to primary memory generally takes between dozens and hundreds of ns.

As noted above, memory is either volatile or non-volatile. In computer systems, volatile memory mainly consists of the high-speed caches and primary memory, and is generally implemented with random-access memory (RAM). RAM is divided into static RAM (SRAM) and dynamic RAM (DRAM). SRAM is faster than DRAM but more expensive. High-speed caches use SRAM, while primary memory uses DRAM. Non-volatile memory mainly includes disks, solid-state disks, CDs, tapes, and floppy disks.

A local disk is located on the L5 layer. A local disk generally uses mechanical storage; that is, every access involves the physical movement of mechanical parts. A single disk access takes about 10 ms, even if only one byte is transferred between the disk and memory. Database data is usually stored on disks, so to reduce the time spent on disk access during inserts, deletes, queries, and updates, databases rely on a range of algorithms and data structures designed to minimize disk access.

A disk is composed of platters. Each platter has two surfaces, and each surface is divided into many concentric circles. Each concentric circle is called a track, and each track is divided by gaps into arc-shaped segments called sectors. The sector is the smallest, indivisible unit of the disk. Each surface has a head that reads and writes data on it. When data on the disk needs to be transferred to or from primary storage, the access is handled by the disk controller. The data is stored in a particular sector of a particular track on a particular surface, so the head must first be moved to the corresponding track. This is called seeking, and the time it takes is called the seek time. Once the head is on the right track, the platter must rotate until the target sector is under the head; the delay this causes is called the rotational delay. Finally, the time required for the target sector to pass under the head is the transfer time.

Therefore, the time required for one disk access is: seek time + rotational delay + transfer time. The latter two are determined by the disk's rotation speed and the number of sectors per track. For a typical 7200 RPM disk in which sector gaps account for about 10% of each track, the average access time is about 10 ms.
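As a rough illustration of the formula above, the following Python sketch adds up the three components. The seek time (6 ms) and the number of sectors per track (400) are hypothetical values chosen only for illustration; they do not come from this article, and real disks vary.

```python
def average_rotational_delay_ms(rpm: float) -> float:
    """Average rotational delay in ms: on average, half a revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def transfer_time_ms(rpm: float, sectors_per_track: int) -> float:
    """Time in ms for one sector to pass under the head (gaps ignored)."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / sectors_per_track


def access_time_ms(seek_ms: float, rpm: float, sectors_per_track: int) -> float:
    """Seek time + rotational delay + transfer time, all in ms."""
    return (
        seek_ms
        + average_rotational_delay_ms(rpm)
        + transfer_time_ms(rpm, sectors_per_track)
    )


if __name__ == "__main__":
    # Hypothetical drive parameters, for illustration only.
    total = access_time_ms(seek_ms=6.0, rpm=7200, sectors_per_track=400)
    print(f"rotational delay: {average_rotational_delay_ms(7200):.2f} ms")
    print(f"total access time: {total:.1f} ms")
```

With these assumed numbers, the seek time and rotational delay dominate, and the transfer time for a single sector is negligible by comparison.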

Because disks are non-volatile and cost-effective, almost all data storage and backup services are currently on disks.

2. Improve Reliability Through Redundancy

Any device may fail, and storage is no exception. How do we make it reliable? The answer is redundancy. The simplest way to achieve redundancy is to duplicate every disk, which is called mirroring. A logical disk then consists of two physical disks, and every write is carried out on both. If one of the disks fails, the data can be read from the other. Data is lost only if the second disk fails before the first failed disk has been repaired. We use the mean time to failure (where "failure" here means loss of data) to evaluate mirroring. It depends mainly on the mean time to failure of each individual disk and on the mean time to repair (the time needed to replace the faulty disk and restore its data).

Assume that the two disks fail independently of each other. If the mean time to failure of a single disk is 100,000 hours and the mean time to repair is 10 hours, then the mean time to data loss of the mirrored disk is 100,000^2 / (2 × 10) = 500 × 10^6 hours, or about 57,000 years. Of course, this is only the ideal case; in practice there are other factors to consider.
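The arithmetic above can be checked with a few lines of Python. This is only a sketch of the idealized model described in the text (independent failures, fixed repair time); the function name is made up for this example.

```python
HOURS_PER_YEAR = 24 * 365


def mirrored_mean_time_to_data_loss(mttf_hours: float, mttr_hours: float) -> float:
    """Mean time to data loss of a mirrored pair, assuming independent failures.

    mttf_hours: mean time to failure of a single disk
    mttr_hours: mean time to repair (replace the disk and restore its data)
    """
    return mttf_hours ** 2 / (2 * mttr_hours)


if __name__ == "__main__":
    hours = mirrored_mean_time_to_data_loss(100_000, 10)
    print(f"{hours:.0f} hours")                    # 500 million hours
    print(f"about {hours / HOURS_PER_YEAR:.0f} years")
```

Note how strongly the result depends on the repair time: halving the mean time to repair doubles the mean time to data loss under this model.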

3. Improve Performance Through Parallelism

Disks are cheap, but their access speed is limited. An average access time of about 10 ms seriously affects system performance. How can we speed up disk access? The answer is parallel access. With mirroring, the rate at which read requests can be handled doubles, because a read request can be sent to either disk. The transfer rate of each individual read is the same as in a single-disk system, but the number of reads per unit time doubles. To increase the transfer rate as well, data is split across multiple disks, which is called striping.

There are several forms of data striping. Splitting the bits of each byte across multiple disks is called bit-level striping. Splitting the blocks of a file across multiple disks is called block-level striping.

Block-level striping is the most common form. Suppose we want to store 8 logical blocks on four disks, numbered 0, 1, 2, and 3. Block i is stored on disk (i mod 4), so that four consecutive blocks can be read from the four disks in parallel.
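The (i mod 4) mapping can be sketched in Python as follows. The helper names are hypothetical, and the choice of i // n as the block's position on its disk is an assumption that follows naturally from the mapping but is not stated in the text.

```python
from typing import Dict, List, Tuple


def locate_block(logical_block: int, num_disks: int) -> Tuple[int, int]:
    """Return (disk number, block index on that disk) for a logical block."""
    disk = logical_block % num_disks
    index_on_disk = logical_block // num_disks  # assumed on-disk position
    return disk, index_on_disk


def striping_layout(num_blocks: int, num_disks: int) -> Dict[int, List[int]]:
    """Map each disk number to the logical blocks stored on it."""
    layout: Dict[int, List[int]] = {disk: [] for disk in range(num_disks)}
    for block in range(num_blocks):
        disk, _ = locate_block(block, num_disks)
        layout[disk].append(block)
    return layout


if __name__ == "__main__":
    for disk, blocks in striping_layout(8, 4).items():
        print(f"disk {disk}: logical blocks {blocks}")
```

For 8 blocks on 4 disks, each disk ends up holding two blocks, and blocks 0 to 3 (or 4 to 7) sit on four different disks, which is what allows them to be read in parallel.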

Regardless of the form of striping, a parallel disk system improves performance in two ways:

(1) load-balancing multiple small accesses to increase their throughput

(2) executing large accesses in parallel to reduce their response time

4. RAID

Mirroring offers high reliability, but it is expensive because each logical disk requires two physical disks. Striping improves speed, but it does nothing for reliability. As a result, a series of alternative schemes were proposed that trade off cost against performance in different ways, and these schemes are classified into RAID levels. They are based on combining parity bits with data striping.

RAID Level 0: a disk array with block-level striping only and no redundancy. This level only increases the speed of reading data from the array and offers no guarantee of reliability. When a disk fails, its data cannot be recovered.
RAID Level 1: disk mirroring combined with block-level striping. (Many vendors use the term RAID Level 1 + 0 for mirroring with striping, and RAID Level 1 for mirroring without striping.) Because the data is striped, reads can proceed in parallel, which increases the read rate. Because the data is mirrored, reliability is high. Data reconstruction is simple and write performance is high; however, the cost is high.
RAID Level 2: the memory-style error-correcting-code (ECC) organization, which uses parity bits.
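RAID Level 3: the bit-interleaved parity organization, an improvement on RAID Level 2. It uses bit-level striping with a single parity disk, which is enough to reconstruct the contents of one failed disk.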

For example, assume that there are three data disks, each with only one sector, and that each sector holds only eight bits.
Disk 1: 11110000
Disk 2: 10101010
Disk 3: 00111000
We also need a redundant disk to hold the parity bits:
Disk 4: 01100010
In each of the eight bit positions, the number of 1s across the four disks is even. If the data on Disk 2 is corrupted, say from 10101010 to 11001100, we can restore it from the other disks: the data on any one disk is the bitwise modulo-2 sum (XOR) of the corresponding data on the other disks.
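The example above can be reproduced with a short Python sketch. The function name is made up for illustration; the point is only that the same XOR operation both computes the parity and rebuilds a lost disk.

```python
from functools import reduce
from typing import Iterable


def xor_all(values: Iterable[int]) -> int:
    """Bitwise modulo-2 sum (XOR) of all values."""
    return reduce(lambda a, b: a ^ b, values, 0)


if __name__ == "__main__":
    disk1 = 0b11110000
    disk2 = 0b10101010
    disk3 = 0b00111000

    # The parity disk holds the XOR of the three data disks.
    disk4 = xor_all([disk1, disk2, disk3])
    print(f"Disk 4 (parity): {disk4:08b}")

    # Pretend Disk 2 is lost and rebuild it from the remaining three disks.
    rebuilt = xor_all([disk1, disk3, disk4])
    print(f"Rebuilt Disk 2:  {rebuilt:08b}")
    print("matches original:", rebuilt == disk2)
```

This works for any single failed disk, including the parity disk itself, but it cannot recover from two simultaneous failures.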

RAID Level 4: the block-interleaved parity organization. It uses block-level striping and keeps, on a separate disk, a parity block for the corresponding blocks of the N other disks. Multiple reads can be executed in parallel, which yields a high overall I/O rate. Large writes also achieve a high transfer rate, because data and parity can be written in parallel. However, small independent writes cannot be executed in parallel: writing a block requires access to both the disk holding the block and the parity disk, because the parity block must be updated.
RAID Level 5: the block-interleaved distributed parity organization, an improvement on RAID Level 4. RAID Level 5 distributes data and parity across all N + 1 disks, so all disks can participate in read operations. Each set of N logical blocks occupies one physical block on each of the N + 1 disks: one disk stores the parity block, and the remaining N disks store the data blocks.
RAID Level 6: the P + Q redundancy scheme. It is similar to RAID Level 5 but stores additional redundant information to guard against the simultaneous failure of multiple disks.
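The difference between RAID Level 4 and RAID Level 5 can be sketched in Python as follows. This is a minimal illustration, not how any particular controller works: the function names are hypothetical, and the rotation rule used for RAID Level 5 (parity of stripe k on disk k mod the number of disks) is just one possible placement, assumed here for the example. The sketch also shows why a small write touches the parity: the new parity can be derived from the old parity, the old data, and the new data, which typically means reading two blocks and writing two blocks.

```python
from typing import List


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Parity after a small write: cancel the old data, then add the new data."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


def parity_disk(stripe: int, num_disks: int, level: int) -> int:
    """Disk that holds the parity block of the given stripe."""
    if level == 4:
        return num_disks - 1       # one dedicated parity disk
    if level == 5:
        return stripe % num_disks  # assumed rotation; real layouts differ
    raise ValueError("only RAID levels 4 and 5 are sketched here")


def stripe_layout(stripe: int, num_disks: int, level: int) -> List[str]:
    """Contents of one stripe: logical block numbers, with 'P<k>' for parity."""
    next_block = stripe * (num_disks - 1)
    p = parity_disk(stripe, num_disks, level)
    row = []
    for disk in range(num_disks):
        if disk == p:
            row.append(f"P{stripe}")
        else:
            row.append(str(next_block))
            next_block += 1
    return row


if __name__ == "__main__":
    for level in (4, 5):
        print(f"RAID level {level}, 5 disks:")
        for stripe in range(5):
            print("  ", stripe_layout(stripe, 5, level))
```

In the RAID Level 4 layout every small write lands on the same parity disk, which becomes a bottleneck; in the RAID Level 5 layout the parity blocks, and therefore the parity updates, are spread over all disks.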

5. How to Select a RAID level

The right choice depends on the specific application. The following factors are generally taken into account:

(1) cost of the additional storage required

(2) performance requirements in terms of the number of I/O operations

(3) performance when a disk has failed

(4) performance during data reconstruction (fault recovery)

RAID Level 0 is suitable only for applications with low data safety requirements.

RAID Level 3 is an improvement on RAID Level 2, and RAID Level 5 is an improvement on RAID Level 4, so among these we need to consider only RAID Level 3 and RAID Level 5. RAID Level 3 uses bit-level striping, while RAID Level 5 uses block-level striping. Block-level striping provides the same transfer rate as RAID Level 3 for large transfers, while using fewer disks for small transfers. RAID Level 6 is more reliable than RAID Level 5 and can be used in applications where data safety is critical. RAID Level 1 provides the best write performance. RAID Level 5 has lower storage overhead than RAID Level 1 but a higher time overhead for writes; therefore, RAID Level 5 is best suited to applications with frequent reads and infrequent writes.
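To make the storage-overhead comparison concrete, the following Python sketch computes the fraction of raw capacity that is available for data at each level. It assumes identical disks, treats RAID Level 1 as two-way mirroring, and takes RAID Level 6 to use the equivalent of two disks for its redundant information; the function name is made up for this example.

```python
def usable_fraction(level: int, num_disks: int) -> float:
    """Fraction of raw capacity available for data, assuming identical disks."""
    if level == 1:
        if num_disks < 2 or num_disks % 2 != 0:
            raise ValueError("two-way mirroring needs an even number of disks")
        return 0.5  # every block is stored twice
    # Number of disks' worth of capacity spent on redundancy.
    redundant = {0: 0, 5: 1, 6: 2}
    if level not in redundant:
        raise ValueError("only RAID levels 0, 1, 5, and 6 are sketched here")
    if num_disks <= redundant[level]:
        raise ValueError("not enough disks for this RAID level")
    return (num_disks - redundant[level]) / num_disks


if __name__ == "__main__":
    for level in (0, 1, 5, 6):
        print(f"RAID level {level} on 6 disks: "
              f"{usable_fraction(level, 6):.0%} usable")
```

The more disks a RAID Level 5 or RAID Level 6 array has, the smaller the share of capacity spent on redundancy, whereas mirroring always costs half of the raw capacity.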
