Understanding RAID Redundant Disk Arrays

Source: Internet
Author: User

Preface

In the past, computer systems were often limited to writing information to a single disk. Such disks were usually expensive and prone to failure. Hard disks have always been the most vulnerable part of a computer system, because they are the only mechanical devices in a system whose other components are entirely electronic. A disk drive contains many mechanical parts moving at high speed. The question is not whether a hard drive will fail, but when.

RAID (Redundant Array of Independent Disks) was designed to change the way computers manage and access data on mass storage by providing an inexpensive, redundant disk system. RAID writes data to multiple inexpensive disks rather than to a single large expensive disk (SLED). RAID originally stood for Redundant Array of Inexpensive Disks, but the term now stands for Redundant Array of Independent Disks.

Basic Principles of RAID

RAID achieves performance and fault tolerance through two techniques: striping and parity. Striping means that a file is written across multiple disks one data block at a time. Because the data is spread over several drives, striping improves data transfer rates and shortens total disk processing time. Striping on its own is well suited to transaction processing but has poor reliability, because the failure of any single drive affects the whole array: the system is at best as reliable as its least reliable drive.

Parity provides a redundancy check that ensures the validity of the data. With parity, when one disk in a RAID system fails, its contents can be reconstructed from the other disks. In both cases, these functions are transparent to the operating system: the disk array controller (DAC) performs the striping and the parity control.

  Components

The main components of a RAID system are the disk array controller (DAC) and an array of disks, five in this example. Data is striped across all five disks, and a failed disk is rebuilt using parity. RAID has multiple levels: some increase speed, some provide protection, and RAID 5 combines both advantages. We will discuss them one by one.

  Striped Data Storage

Previously, a computer wrote each file to a single disk. Striping lets you split a file and write its fragments to several disks at the same time. If a file has five data blocks and is striped across five disks, each data block is written to its own disk. Likewise, if you have five OLTP transactions, each smaller than one data block, you can process all five transactions at the same time.
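
The five-block example above can be sketched in a few lines of Python. This is only an illustration, assuming a simple round-robin placement; the function name stripe_blocks is made up for this sketch and does not come from any RAID product.

```python
def stripe_blocks(blocks, num_disks):
    """Distribute data blocks round-robin: block i goes to disk i % num_disks."""
    disks = [[] for _ in range(num_disks)]
    for i, block in enumerate(blocks):
        disks[i % num_disks].append(block)
    return disks


# A file with five data blocks striped across five disks:
# each block ends up on its own disk.
file_blocks = ["B0", "B1", "B2", "B3", "B4"]
print(stripe_blocks(file_blocks, 5))
# [['B0'], ['B1'], ['B2'], ['B3'], ['B4']]
```

With more blocks than disks, the placement simply wraps around, so block 5 would land on the first disk again.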

Most RAID levels stripe at the data block level, but RAID can also stripe at the bit or byte level. The size of the data block is chosen by the system administrator and is called the stripe depth.

To maximize the transaction capability of the disk array subsystem, data must be written to, or read from, multiple drives at the same time. To achieve this, user data blocks are striped across the entire drive array. A stripe consists of a row of sectors (512 bytes each) located at the same position on every disk in the array. The stripe depth (that is, the number of sectors in each data block) is defined by the subsystem software.
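
As a rough sketch of how a stripe depth turns a logical sector number into a location on a particular drive, consider the following Python function. The name locate_sector and the plain round-robin block order are assumptions made for illustration; real controllers may use other layouts.

```python
SECTOR_SIZE = 512  # bytes per sector, as stated in the text


def locate_sector(logical_sector, stripe_depth, num_disks):
    """Map a logical sector to (disk index, sector number on that disk).

    stripe_depth is the number of sectors in each data block.
    Assumes data blocks are laid out round-robin across the disks.
    """
    block = logical_sector // stripe_depth   # which data block holds the sector
    offset = logical_sector % stripe_depth   # position of the sector in that block
    disk = block % num_disks                 # blocks rotate over the disks
    stripe = block // num_disks              # row index of the stripe
    return disk, stripe * stripe_depth + offset


# Stripe depth of 8 sectors (4 KB blocks) on a five-disk array.
print(locate_sector(0, 8, 5))   # (0, 0)
print(locate_sector(8, 8, 5))   # (1, 0)
print(locate_sector(45, 8, 5))  # (0, 13)
```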

The stripe depth has a direct impact on performance. If the depth is too small, the system has to execute more I/O commands than are really needed. If the depth is too large, the multitasking capability of the processor and many of the benefits of having multiple drives may be lost.

In an ideal transaction environment, each request from the host involves only one drive, so multiple transactions can run concurrently on multiple drives.
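
The effect of stripe depth on the number of I/O commands can be illustrated with a small calculation. The sketch below assumes that every data block touched by a host request costs one drive command, which is a simplification (a real controller may merge or reorder commands); the function name drive_commands is hypothetical.

```python
def drive_commands(start_sector, sector_count, stripe_depth):
    """Count the data blocks (and so drive commands) one host request touches.

    Simplifying assumption: one command per data block touched.
    """
    first_block = start_sector // stripe_depth
    last_block = (start_sector + sector_count - 1) // stripe_depth
    return last_block - first_block + 1


# An 8 KB request (16 sectors of 512 bytes) starting at sector 0.
for depth in (2, 16, 128):
    print(depth, drive_commands(0, 16, depth))
# 2 8
# 16 1
# 128 1
```

With a depth of 2 sectors the request is split into eight commands, while a depth of 16 or more keeps it on a single drive and leaves the other drives free for other transactions.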

Striping data across the drives of an array solves the problem of one system drive being overloaded while another sits idle. It avoids dedicating drives to particular data, ensures that the processing load is spread evenly across the available drives, and improves performance by writing multiple data blocks at the same time.

  Parity

People often confuse parity with mirroring. Mirroring means duplicating a disk: data is written to two drives at the same time. Because either of the two drives can do the same job, mirrored systems provide excellent reliability and deliver outstanding transaction processing results. The cost is that you must purchase two drives to get the capacity of one: the mirroring overhead is 100%, or double the disk space. If one disk fails, its mirror takes over.

Parity provides the same general protection as mirroring, but at lower cost. In an array of five disks, four store data and one stores parity. The overhead is only 20% of the total disk space, which is a great advantage when cost matters.

Computers represent data using only 0s and 1s. Exclusive OR (XOR) is the operation used to compute parity. Take one bit (0 or 1) from the same position on each data disk and add the bits together. If the sum is even, the parity bit is set to 0; if the sum is odd, the parity bit is set to 1.
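
The even/odd rule and the way a lost disk is rebuilt from it can be shown with a short Python sketch. The sample byte values are made up, and the function names parity_bit and parity are chosen for this illustration only.

```python
from functools import reduce
from operator import xor


def parity_bit(bits):
    """Even sum of bits gives parity 0, odd sum gives parity 1."""
    return sum(bits) % 2


def parity(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Four data disks (two bytes each, made-up values) and one parity disk.
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01", b"\x00\xff"]
p = parity(data)

# Suppose disk 2 fails: XOR the surviving data blocks with the parity block.
rebuilt = parity([data[0], data[1], data[3], p])
print(rebuilt == data[2])  # True
```

The same XOR operation is used both to compute the parity and to rebuild the missing block, because XOR-ing a value in twice cancels it out.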

Depending on the RAID level, parity is stored on one dedicated disk or distributed across all disks. With five disks, parity occupies 1/5, or 20%, of the disk space. With three disks, it occupies 1/3, or 33%.
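
The percentages above are shares of the total (raw) disk space, whereas the 100% quoted for mirroring is measured relative to the data stored. The following sketch puts both measures side by side; the function names are hypothetical and the calculation assumes equally sized disks with a single disk's worth of parity.

```python
from fractions import Fraction


def parity_overhead(num_disks):
    """Return (share of raw space, extra space relative to data) for parity."""
    return Fraction(1, num_disks), Fraction(1, num_disks - 1)


def mirror_overhead():
    """Mirroring keeps two copies: half the raw space, 100% relative to data."""
    return Fraction(1, 2), Fraction(1, 1)


for n in (3, 5):
    raw, extra = parity_overhead(n)
    print(f"{n} disks with parity: {float(raw):.0%} of raw space, "
          f"{float(extra):.0%} relative to data")

raw, extra = mirror_overhead()
print(f"mirroring: {float(raw):.0%} of raw space, {float(extra):.0%} relative to data")
```

Measured either way, parity on five disks costs considerably less space than mirroring.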

RAID Configuration Levels

RAID combines multiple independent hard drives into a logical array. Once one or more RAID volumes are created on the array, the operating system no longer sees the individual hard drives; it sees each volume as a single logical hard drive. RAID aims to improve the performance of the storage subsystem and to provide fault tolerance. The RAID level defines how data is laid out in a volume that spans an array of hard drives. The following sections describe the commonly used RAID level 0 (striping), RAID level 1 (mirroring), RAID level 5 (striping with parity), and RAID level 10 (striping and mirroring).
RAID 0 (striping)
RAID level 0 combines two or more hard drives and splits all data into manageable blocks (called data blocks). Table 2 describes the permitted data block sizes and their uses. The data blocks are striped across the array members on which the RAID 0 volume resides. This improves read/write performance, especially for sequential access, because adjacent data can be accessed from more than one hard drive at the same time. However, data stored in a RAID 0 volume is not redundant, so if one hard drive fails, all data in the volume is lost.
Advantages: improved data access and storage performance; no loss of data capacity.
Disadvantage: no data redundancy (if one hard drive fails, all data in the volume is lost).
Applications: typically used on desktops and workstations to store non-critical data and software that need high performance.

 
RAID 1 (mirroring)
RAID level 1 combines two hard drives so that all data is written simultaneously to both array members on which the RAID 1 volume resides. In other words, the data is mirrored to each hard drive in the RAID 1 volume. This gives real-time redundancy: everything on the first drive is duplicated on the second drive.
Advantages: full data redundancy; improved read speed.
Disadvantages: storage capacity is limited to that of the smallest drive; write speed is slightly reduced.
Applications: typically used to store critical data on workstations and servers.
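
The behaviour described for RAID 1 can be modelled with a toy in-memory class. This is a minimal sketch, not how any real controller is implemented: the class name MirroredVolume, its methods, and the use of dictionaries as stand-ins for drives are all assumptions made for illustration.

```python
class MirroredVolume:
    """Toy RAID 1 volume: every write goes to all working members."""

    def __init__(self, capacities):
        # Usable capacity equals the capacity of the smallest drive.
        self.capacity = min(capacities)
        self.members = [dict() for _ in capacities]  # block number -> data
        self.failed = set()

    def write(self, block_no, data):
        if not 0 <= block_no < self.capacity:
            raise IndexError("block outside volume")
        for i, member in enumerate(self.members):
            if i not in self.failed:
                member[block_no] = data

    def read(self, block_no):
        # Any working member can serve the read.
        for i, member in enumerate(self.members):
            if i not in self.failed:
                return member[block_no]
        raise IOError("all members have failed")

    def fail(self, index):
        self.failed.add(index)


volume = MirroredVolume([100, 80])  # drive sizes in blocks
volume.write(3, b"critical data")
volume.fail(0)                      # first drive fails
print(volume.capacity, volume.read(3))  # 80 b'critical data'
```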

RAID 5 (striping with parity)
RAID level 5 combines three or more hard drives and splits all data into manageable blocks (called data blocks). RAID 5 uses parity, a mathematical method for reconstructing the data lost from a single drive, to improve fault tolerance. Data and parity are striped across the array members, with the parity rotating among them. Thanks to the parity, the data can be reconstructed after a failed hard drive is replaced with a new one. The extra work of calculating the lost data during reconstruction reduces write performance on the volume. RAID 5 performs better with small I/O operations than with large sequential files.
Advantages: full data redundancy; high storage efficiency; fault tolerance; improved storage array performance.
Disadvantage: reconstruction takes time, and performance is reduced while it is in progress.
Applications: a good choice for file and application servers and for Internet and LAN servers.
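
How parity rotates across the array members can be visualised with a short sketch. The rotation order used here (parity starts on the last disk and moves one disk to the left with each stripe) is just one possible scheme chosen for illustration; actual implementations differ in their layouts, and the function name raid5_layout is hypothetical.

```python
def raid5_layout(num_disks, num_stripes):
    """Return one row per stripe; D = data block, P = parity block.

    Assumed rotation: parity starts on the last disk and moves one
    disk to the left with every stripe.
    """
    rows = []
    block = 0
    for stripe in range(num_stripes):
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


for row in raid5_layout(3, 3):
    print(row)
# ['D0', 'D1', 'P0']
# ['D2', 'P1', 'D3']
# ['P2', 'D4', 'D5']
```

Because no single drive holds all the parity, parity updates are spread over every member instead of queuing on one disk.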

RAID 10 (striping and mirroring)
RAID level 10 uses four hard drives to combine RAID levels 0 and 1. Data is striped across a two-drive RAID 0 array, and each hard drive in that RAID 0 array is mirrored to a second hard drive, as in RAID 1. This configuration offers both the performance of RAID 0 and the redundancy of RAID 1.
Advantages: improved performance and full data redundancy.
Disadvantage: four hard drives are required (two of them for redundancy), which increases cost.
Applications: database servers that require high performance and fault tolerance.
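
The four-drive arrangement can be sketched as two mirrored pairs with data blocks striped over the pairs. The drive numbering (drives 0 and 1 form the first pair, drives 2 and 3 the second) and the function names are assumptions made for this illustration.

```python
def raid10_targets(block_no, num_pairs=2):
    """Return the two drives that hold a data block.

    Assumed numbering: drives 2k and 2k+1 form mirrored pair k,
    and data blocks are striped round-robin over the pairs.
    """
    pair = block_no % num_pairs
    return 2 * pair, 2 * pair + 1


def survives(failed_drives, num_pairs=2):
    """The volume survives as long as no pair has lost both of its drives."""
    failed = set(failed_drives)
    return all(not {2 * k, 2 * k + 1} <= failed for k in range(num_pairs))


print(raid10_targets(0))  # (0, 1)
print(raid10_targets(1))  # (2, 3)
print(survives({0, 2}))   # True: one drive lost from each pair
print(survives({0, 1}))   # False: both copies of a pair are gone
```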
