RAID technical details 2

RAID is the abbreviation of "Redundant Array of Independent Disks" (originally "Redundant Array of Inexpensive Disks"), a term defined in 1987 by Patterson, Gibson, and Katz in a paper from UC Berkeley. RAID technology groups a series of disks together to provide the data redundancy needed for data protection, and distributes data in stripes across the disks to improve read/write performance. RAID was initially used in the high-end server market, but with the rapid development of computer technology it has penetrated many areas of computing; nowadays RAID controller chips can be found everywhere on home computer motherboards.

Generally, RAID systems can be built on several interfaces: PATA, SATA, and SCSI hard disks can all form RAID arrays. With the release of the Intel 865/875 series chipsets, hard disk interfaces in the home market began to switch to SATA, and RAID in that market is likewise transitioning from PATA to SATA.

RAID technology has gone through a series of changes and developments. In the home market, only RAID 0, RAID 1, and RAID 0+1 arrays used to be available; but since the introduction of the DFI LANParty motherboard, we have a fourth array type: RAID 1.5.

In practice, RAID is mostly set up to further improve disk read/write performance, while data backup can be achieved by other means (such as burning to optical media). People have therefore been willing to try RAID 0 with only two hard disks, but the arrival of RAID 1.5 has changed this picture. How do these two relatively inexpensive disk arrays perform? Let's find out.

 

RAID 0:

RAID 0 uses a technique called striping to distribute data across the member disks. Each "stripe" is made up of consecutive "blocks" ranging from 512 bytes to several megabytes, which are written to the disks in turn: block 1 goes to disk 1, block 2 to disk 2, and so on. Once the last disk in the array has been written, the next block goes back to disk 1.

Striping spreads the I/O load evenly across all drives. Because the drives can be written to or read from simultaneously, performance improves significantly. However, RAID 0 has no data protection: if one disk fails, all data is lost. It is therefore unsuitable for mission-critical environments, but well suited to video and image production and editing.
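The round-robin block placement described above can be sketched in a few lines of Python. This is a hypothetical illustration, not production RAID code: "disks" are modeled as in-memory byte buffers, and `stripe_write`/`stripe_read` are invented names.

```python
# Hypothetical sketch of RAID 0 striping: data is split into fixed-size
# stripe units and written to the disks in round-robin order.

def stripe_write(data: bytes, num_disks: int, unit: int):
    """Distribute `data` across `num_disks` in `unit`-byte stripe units."""
    disks = [bytearray() for _ in range(num_disks)]
    chunks = [data[i:i + unit] for i in range(0, len(data), unit)]
    for i, chunk in enumerate(chunks):
        disks[i % num_disks] += chunk      # unit 0 -> disk 0, unit 1 -> disk 1, ...
    return disks

def stripe_read(disks, unit: int) -> bytes:
    """Reassemble by reading one stripe unit from each disk in turn."""
    out = bytearray()
    offsets = [0] * len(disks)
    i = 0
    while any(off < len(d) for off, d in zip(offsets, disks)):
        d = i % len(disks)
        out += disks[d][offsets[d]:offsets[d] + unit]
        offsets[d] += unit
        i += 1
    return bytes(out)

payload = bytes(range(16))
disks = stripe_write(payload, num_disks=2, unit=4)
assert stripe_read(disks, unit=4) == payload   # round-trips losslessly
```

Note that nothing here is redundant: losing any one buffer makes the payload unrecoverable, which is exactly the RAID 0 trade-off described above.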

RAID 1:

RAID 1 is also called mirroring, because the data on one disk is completely copied to another disk. If the data on one disk is corrupted or a bad sector develops, the other disk can take over, avoiding the data loss and system downtime that a disk failure would otherwise cause. RAID 1 can also be duplexed, meaning the entire controller is duplicated as well, so data remains protected even when a disk or a controller fails. The drawback of mirroring and duplexing is that twice as many drives are needed to hold the copies while the system's read/write performance does not increase, which can be a considerable expense. RAID 1 can be implemented in software or hardware.
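The mirroring behavior can be modeled as follows. This is a hypothetical Python sketch; the `Mirror` class, its methods, and the failure flags are illustrative inventions, not a real driver API.

```python
# Hypothetical sketch of RAID 1 mirroring: every write goes to both disks,
# so either copy alone can serve reads if the other fails.

class Mirror:
    def __init__(self, size: int):
        self.disks = [bytearray(size), bytearray(size)]
        self.failed = [False, False]

    def write(self, offset: int, data: bytes):
        for d in self.disks:                        # duplicate the write
            d[offset:offset + len(data)] = data

    def read(self, offset: int, length: int) -> bytes:
        for i, d in enumerate(self.disks):          # any healthy copy will do
            if not self.failed[i]:
                return bytes(d[offset:offset + length])
        raise IOError("both mirror halves failed")

m = Mirror(8)
m.write(0, b"raid-one")
m.failed[0] = True                   # simulate losing disk 0
assert m.read(0, 8) == b"raid-one"   # data survives on the mirror
```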

RAID 2:

RAID 2 is a disk array with Hamming-code error correction, developed for mainframes and supercomputers. The disk drives at positions 1, 2, 4, and so on (the powers of two) are dedicated check disks used for verification and error correction. For example, in a seven-disk RAID 2 array, disks 1, 2, and 4 are check disks, and the remaining disks store data. RAID 2 performs extremely well when reading and writing large amounts of data, but poorly on small transfers, so it is rarely used.

Because of this layout, the more disk drives are used, the smaller the percentage taken up by check disks. To obtain both better speed and better disk utilization, it is best to grow the array relative to the ECC check disks, but that means buying more hard disks to achieve the same data redundancy. On the other hand, the controller design is simpler than for RAID 3, 4, or 5.
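The shrinking check-disk fraction follows from the standard Hamming-code convention of placing check bits at power-of-two positions. A small sketch (assuming that convention; `check_disk_positions` is an invented helper) makes the trend concrete:

```python
# Hypothetical sketch: in a Hamming-code layout (as RAID 2 uses), check disks
# sit at power-of-two positions, so the check overhead shrinks as the array grows.

def check_disk_positions(total_disks: int):
    """1-based positions of the check disks: 1, 2, 4, 8, ..."""
    pos, p = [], 1
    while p <= total_disks:
        pos.append(p)
        p *= 2
    return pos

# The article's seven-disk example: disks 1, 2, and 4 carry check bits.
assert check_disk_positions(7) == [1, 2, 4]
# With more disks the check fraction drops: 4 of 15 instead of 3 of 7.
assert check_disk_positions(15) == [1, 2, 4, 8]
```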

RAID 3:

RAID 3 is striping with a dedicated parity disk. Each stripe reserves a block-sized area to store redundant information: the parity. Parity is encoded information; if the data on one disk is corrupted or the disk fails, the parity can be used to recover the data. In data-intensive or single-user environments with long sequential accesses, RAID 3 is advantageous. However, like RAID 2, its performance can suffer when accesses are short.

RAID 4:

RAID 4 is an independent-disk structure with a dedicated parity disk, similar to RAID 3. The difference is that RAID 4 accesses data block by block: RAID 3 stripes horizontally across the whole array, while RAID 4 works one vertical stripe at a time. RAID 3 therefore usually needs to access every hard disk in the array, whereas RAID 4 only accesses the disks it needs, which greatly improves read speed. For writes, however, the old data and old parity must first be read back from the data disk and the parity disk, and only then can the updated data and parity be written out, so writes take longer than in RAID 3.
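The write sequence just described relies on XOR parity being incremental: the new parity can be computed from the old parity, the old data block, and the new data block alone, without touching the rest of the stripe. A hypothetical Python sketch (`xor_blocks` and `update_parity` are invented names):

```python
# Hypothetical sketch of the RAID 4 small-write path:
#   new_parity = old_parity XOR old_data XOR new_data
# so only the data disk and the parity disk are touched, not the whole stripe.

def xor_blocks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def update_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)

# Full-stripe parity for comparison: the XOR of all data blocks.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\x0f\x0f"
parity = xor_blocks(xor_blocks(d0, d1), d2)

new_d1 = b"\xaa\xbb"
parity = update_parity(parity, d1, new_d1)               # read-modify-write path
assert parity == xor_blocks(xor_blocks(d0, new_d1), d2)  # matches full recompute
```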

RAID 5:

RAID 5 is also called striping with distributed parity. As in RAID 3, each stripe reserves a block-sized area for parity, but unlike RAID 3, RAID 5 distributes the parity information across all disks rather than one, which greatly reduces the load that would otherwise fall on a dedicated parity disk. Despite some capacity loss, RAID 5 offers well-rounded overall performance and is a widely used disk array solution. It suits I/O-intensive applications with a high read/write ratio, such as transaction processing.

RAID 5 redundancy requires at least three disks. RAID 5 can be implemented either in disk array controller hardware or in software in some network operating systems.
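Two ideas make up RAID 5: the parity block rotates to a different disk on each stripe row, and any single lost block is the XOR of the remaining blocks in its stripe. A hypothetical sketch follows; the "left-asymmetric" rotation used here is one common convention, chosen as an assumption since the article does not specify a layout.

```python
# Hypothetical sketch of RAID 5: rotating parity placement plus XOR rebuild.

def parity_disk(stripe: int, num_disks: int) -> int:
    """Which disk holds parity for a given stripe row (rotates each row)."""
    return (num_disks - 1 - stripe) % num_disks

def recover(surviving_blocks):
    """Any single missing block is the XOR of all the others in its stripe."""
    missing = surviving_blocks[0]
    for b in surviving_blocks[1:]:
        missing = bytes(x ^ y for x, y in zip(missing, b))
    return missing

# Three-disk example: parity rotates across disks 2, 1, 0, 2, ...
assert [parity_disk(s, 3) for s in range(4)] == [2, 1, 0, 2]

d0, d1 = b"\x12\x34", b"\x56\x78"
p = bytes(x ^ y for x, y in zip(d0, d1))   # parity of the stripe
assert recover([d1, p]) == d0              # rebuild disk 0's lost block
```

Because the parity rotates, reads and small writes are spread over all member disks instead of bottlenecking on one, which is the improvement over RAID 3 and RAID 4 described above.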

RAID 6:

RAID 6 is an independent-disk structure with two distributed parity codes. It extends RAID 5 with a second parity value allocated across different disks, and can therefore withstand two simultaneous drive failures. However, computing the second parity and verifying data correctness takes considerable time, placing a heavy load on the system and significantly reducing overall disk performance, and the controller required is extremely complex. And since a second parity value is introduced, holding N disks' worth of data requires N + 2 disks.

RAID 7:

RAID 7 comes with an intelligent real-time operating system and storage-management software. It runs independently of the host and does not consume host CPU resources. The storage computer operating system in RAID 7 is a real-time, event-driven operating system, used mainly to initialize the system, schedule all data transfers within the RAID 7 array, and map them onto the corresponding physical drives. Through this operating system, read/write behavior can be configured and controlled so that I/O transfer performance to the host is optimal. If a disk fails, recovery is automatic, including management of the rebuild onto a spare disk.

RAID 7 uses asynchronous access, which greatly reduces the write bottleneck and increases I/O speed. (Asynchronous access means that each I/O interface in RAID 7 has a dedicated high-speed channel serving as the path for data and control information, so access to each disk in the system can be controlled independently.) If a RAID 7 array has N disks, then apart from one check disk (used for redundancy calculations) it can simultaneously handle random read/write requests from N - 1 host channels, which significantly improves I/O throughput. The built-in real-time operating system also automatically optimizes the read/write commands sent by the host and intelligently prefetches data into the cache, greatly reducing head movement and increasing I/O speed. RAID 7 helps users manage ever-larger data storage systems effectively, can at least double system operating efficiency, and meets the varied needs of many kinds of users.

RAID 10 (RAID 0+1):

RAID 10, also known as striping of mirrors, is now generally referred to as RAID 0+1. RAID 10 (RAID 0+1) provides 100% data redundancy and supports larger volumes. It requires four disks: two hold the striped data, providing the read/write performance of RAID 0, while the other two mirror the first pair, ensuring a complete backup of the data.
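The four-disk layout described above (a striped pair plus its full mirror) can be modeled in a few lines. This is a hypothetical illustration; `raid01_layout` is an invented helper.

```python
# Hypothetical sketch of RAID 0+1 with four disks: one pair holds the RAID 0
# stripes, and a second pair mirrors the first pair in full.

def raid01_layout(data: bytes, unit: int):
    stripe = [bytearray(), bytearray()]
    for i in range(0, len(data), unit):
        stripe[(i // unit) % 2] += data[i:i + unit]   # RAID 0 across two disks
    mirror = [bytearray(d) for d in stripe]           # RAID 1 copy onto two more
    return stripe + mirror                            # disks 0..3

disks = raid01_layout(b"ABCDEFGH", unit=2)
assert bytes(disks[0]) == b"ABEF" and bytes(disks[1]) == b"CDGH"
assert disks[2] == disks[0] and disks[3] == disks[1]  # 100% redundancy
```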

RAID 30:

RAID 30 is also called striping of dedicated-parity arrays. It combines the features of RAID 0 and RAID 3: two RAID 3 groups (three disks each), each with its own dedicated parity disk, are combined into a RAID 0 array that stripes data across the groups. RAID 30 provides fault tolerance and supports larger volumes. Like RAID 10, RAID 30 offers high reliability, because data remains available even if two physical drives fail (one in each group).

RAID 30 requires a minimum of six drives. It is best suited to non-interactive applications such as video streaming, graphics, and image processing, which process large files sequentially and require high availability and high speed.

RAID 50:

RAID 50 is called striping of distributed-parity arrays. Like RAID 30, it combines the features of RAID 5 and RAID 0: two RAID 5 groups (each with at least three disks), each using distributed parity, are combined in a RAID 0 configuration that stripes data across the groups. RAID 50 provides reliable data storage and excellent overall performance, and supports larger volumes. Even if two physical disks fail (one in each group), data can be recovered.

RAID 50 requires at least six drives. It is ideal for applications that demand highly reliable storage, high read speeds, and high data transfer performance, such as transaction processing and office applications where many users access small files.

RAID 53:

RAID 53 is called a striped disk structure with high data transfer rates. It is implemented like a RAID 0 stripe set in which each segment is a RAID 3 array, so its redundancy and fault tolerance are the same as RAID 3's. RAID 53 benefits systems that need RAID 3 with a high data transfer rate, but it is expensive and inefficient in capacity.

RAID 1.5:

RAID 1.5 is a new type of disk array. It has the characteristics of RAID 0+1, but requires only two hard disks. On the surface, the two disks in a RAID 1.5 set hold identical data, and like RAID 1, RAID 1.5 cannot use the full disk capacity: two 80 GB disks in RAID 1.5 behave like RAID 1, with only 80 GB of usable space and the other 80 GB holding the backup copy. If the two disks are separated, each still runs without problems on the original system. In actual use, however, we found that if the two disks are run separately, even slight differences in their data force the array to be rebuilt, and full data recovery is not possible; the disk with less data takes precedence.

Although RAID 1.5 behaves much like RAID 1, how does it achieve RAID 0-style read performance? So far we have no material to confirm the following hypothesis: the array controller chip has advanced control logic that records the same data on both disks simultaneously in stripe fashion, and on reads it identifies which stripes are needed and reads different stripes from different disks, achieving RAID 0-like read performance.

In fact, for home users, the appeal of RAID 1.5 is mainly the performance improvement; after all, RAID 1.5 has RAID 0 characteristics. We therefore ran comparative performance tests on RAID 1.5. First, an overview of how RAID 1.5 compares:

RAID level                    RAID 0                     RAID 1       RAID 0+1       RAID 1.5
Name                          Striping                   Mirroring    Stripe+mirror  Stripe+mirror
Fault tolerance               No                         Yes          Yes            Yes
Redundancy type               None                       Copy         Copy           Copy
Hot-spare operation           Not allowed                Yes          Yes            Yes
Number of hard disks          2 or more                  2            4              2
Usable capacity               Largest                    Smallest     Intermediate   Intermediate
Capacity overhead             None                       50%          2 disks        1 disk
Read performance              High (scales with disks)   Medium       Medium         High
Random write performance      Highest                    Medium       Medium         High
Sequential write performance  Highest                    Medium       Medium         High
Typical applications          Fast read/write where fault tolerance is not needed | Fault-tolerant small-file and random-data writes | Fault-tolerant small-file and sequential transfers | Fault-tolerant small-file and random transfers

Performance test:

Hardware platform

CPU:
Intel Pentium 4 3.0 GHz (Northwood, 800 MHz FSB)

Graphics card:
Kai Heng Grand Yin Jia Ti4680 ViVo

Driver:
NVIDIA Detonator 4 v43.50

Memory:
Kingmax DDR400 DDR RAM 256 MB × 2 (uniformly set to DDR 400 Auto mode)

Hard disks:
Seagate Barracuda V SATA 120 GB (7200 RPM, 8 MB cache) × 2
Maxtor DiamondMax Plus 9 PATA 80 GB (7200 RPM, 2 MB cache) × 2

Driver:
Intel IAA 3.0 with RAID

Motherboards:
MSI 875P Neo
DFI LANParty PRO875

(All onboard sound card and network card devices were disabled.)

Driver:
Intel Chipset Software v5.00

Operating system: Windows XP Professional + SP1, English version
DirectX: DirectX 9.0

Software platform

Test software:
ZDNet WinBench 99 2.0
SiSoftware Sandra 2003
HD Tach 2.61
In this test, we used the soft RAID function of the ICH5 and the RAID function of the Promise PDC20378 to build SATA RAID 0 systems, comparing them against a PATA RAID 1.5 array built with the HighPoint 372N on the LANParty PRO875 board. Because the SATA and PATA arrays use different hard disk models, the results are for reference only.

WinBench 99:

The WinBench 99 results show that once RAID is set up, both the business disk and high-end disk scores improve greatly. The RAID 1.5 array built from two DiamondMax Plus 9 drives has the strongest business disk performance, reaching 21,200, while the high-end disk results are led by the SATA RAID 0 array controlled by the ICH5, whose score is also quite gratifying.

To further compare the differences between PATA RAID 1.5 and SATA RAID 0, we conducted the following comparison tests:


HighPoint PATA RAID 1.5


Promise PDC20378 SATA RAID 0

Comparing the two RAID modes, we found that the RAID 1.5 system built from DiamondMax Plus 9 ("Golden Diamond 9") drives performs astonishingly well: its peak disk transfer rate reaches 95 MB/s, with an average close to 70 MB/s, and the transfer rate fluctuates in a fairly regular pattern, indicating stable transfer performance. SATA, as an excellent new-generation disk interface, also shows remarkable strength. The RAID 0 system built from two Barracuda V drives cannot match RAID 1.5's outstanding peak transfer rate, but still exceeds 90 MB/s, and both its minimum and average transfer rates are noticeably better than RAID 1.5's.

SiSoftware Sandra 2003:

In the SiSoftware Sandra 2003 disk performance test, we used an empty 20 GB disk partition as the test target. Here the PATA-based RAID 1.5 system does not perform as well as it did in WinBench 99, falling far behind the SATA RAID 0 arrays built on the two controllers.


HighPoint PATA RAID 1.5


Promise SATA RAID 0


ICH5 SATA RAID 0

HD Tach 2.61:


HighPoint PATA RAID 1.5


Promise SATA RAID 0


ICH5 SATA RAID 0

In the HD Tach 2.61 test, RAID 1.5 again showed its strength: its average read speed of 47.8 MB/s left the SATA RAID 0 arrays, at only 31.1 MB/s, far behind. In CPU usage, however, SATA RAID 0 is much lighter than PATA RAID 1.5; the lowest figure comes from the ICH5-controlled RAID 0 system, at only 1.2%. Even so, the RAID 1.5 system's highest CPU usage is only 2.8%, which is negligible for mainstream machines running above 2 GHz.

In the SYSmark 2002 test, we found that building RAID did not dramatically improve overall system performance, although office performance still improved by about 5%. The array built from PATA hard disks differed little from the RAID 0 built from SATA hard disks.

In the PCMark 2002 test, however, RAID 1.5 once again surpassed the RAID 0 arrays, which is admirable.

Summary:

RAID is a disk array technology with a long history. It was created not only to keep important data safely stored long-term in redundant arrays, but also to break through performance bottlenecks. Today RAID technology is applied in every field that computing has reached, including the home computer market, which its original developers never had in mind and where it has now been developing for several years.

In the past, the home market offered only RAID 0, RAID 1, and RAID 0+1 built from PATA hard disks. Today we have witnessed the birth of RAID 1.5 and felt its unique appeal. Compared with RAID 0 on SATA drives, which are about to become mainstream, RAID 1.5 built from PATA drives is by no means weak; on the contrary, RAID 1.5 may be the most effective way for PATA disks to improve performance while keeping data safe.

 
