New RAID technology

Source: Internet
Author: User

Overview

As disk capacities grow, the reliability of data stored on RAID is increasingly threatened. The most prominent problem is that disk reconstruction (rebuild) time grows substantially. The longer a rebuild takes, the more likely it is that further disks fail before it completes; even for RAID 6, which tolerates two failed disks, the probability of a third disk failing during a rebuild rises considerably. Application performance also drops sharply while a rebuild is in progress. These are the hardest problems facing traditional RAID, and many storage vendors are trying to solve them. I think there are basically two approaches. One is to add more redundant information to the data, as in the earlier transition from RAID 5 to RAID 6, which increases the number of failed disks that can be tolerated. The other is to change the existing RAID architecture to make it more flexible, thereby shortening reconstruction time and reducing the likelihood of RAID failure. I have looked into several new RAID types on the market and summarize them here.
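To make the capacity argument concrete, here is a rough back-of-the-envelope sketch. It assumes a whole-disk rebuild is limited only by a sustained rebuild rate; the 100 MB/s figure and the function name `rebuild_hours` are illustrative assumptions, not measurements from any product.

```python
def rebuild_hours(capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Lower bound on whole-disk rebuild time.

    Assumes every byte of the replacement disk must be written once
    at a constant rate (hypothetical, ignores foreground I/O).
    """
    capacity_mb = capacity_tb * 1_000_000  # decimal TB -> MB
    return capacity_mb / rebuild_mb_per_s / 3600


# Rebuild time scales linearly with capacity at a fixed rebuild rate.
for tb in (1, 4, 16):
    print(f"{tb:>2} TB at 100 MB/s: {rebuild_hours(tb, 100):.1f} h")
# prints 2.8 h, 11.1 h and 44.4 h respectively
```

Under this simple model the rebuild window, and with it the exposure to further failures, grows in direct proportion to disk capacity unless the rebuild rate grows as well.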

Cool blue whale's Smart RAID

The storage center of the Institute of Computing Technology of the Chinese Academy of Sciences is a core storage technology team in China, and its industrialization company is chinacache. Some time ago it launched a relatively novel disk array designed specifically for video surveillance. Its RAID is called Smart RAID. The figure below shows this RAID:

[Figure: 1.jpg, http://www.bkjia.com/uploads/allimg/131228/0036231231-0.jpg]

From this figure, we can see that the core idea of Smart RAID is to split one large RAID into several small RAIDs. When a bad block is found in one small RAID, only that small RAID needs to be reconstructed, instead of all the data on the entire disk. The biggest benefit is obviously a shorter reconstruction time. This method has an empirical basis: it has been found that most disk failures are partial rather than whole-disk failures, that is, in most cases only some regions of a disk go bad. Based on this finding, partial reconstruction can replace whole-disk reconstruction, which is why this solution divides the original RAID into several small RAIDs. The solution looks perfect, but some problems remain. What if the entire disk is damaged? What if the reserved spare area is not large enough? What if there are bad blocks in the spare area? If several small RAIDs are being reconstructed at the same time, what happens to performance? Therefore, I personally think this solution is not flexible enough. However, because chinacache's disk array is used for video surveillance, which does not demand very high data reliability but does demand high write performance for continuous data streams, this solution suits video surveillance applications better than traditional RAID 5.
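The partial-reconstruction idea can be sketched as follows. This is only an illustration under assumed parameters: the region size, the block numbering, and the helper names `regions_to_rebuild` and `rebuild_fraction` are hypothetical and are not taken from the Smart RAID product.

```python
def regions_to_rebuild(bad_blocks, blocks_per_region):
    """Map bad block numbers to the small-RAID regions that contain them."""
    return sorted({block // blocks_per_region for block in bad_blocks})


def rebuild_fraction(bad_blocks, blocks_per_disk, blocks_per_region):
    """Fraction of the disk that must be reconstructed for these bad blocks."""
    affected = regions_to_rebuild(bad_blocks, blocks_per_region)
    total_regions = -(-blocks_per_disk // blocks_per_region)  # ceiling division
    return len(affected) / total_regions


# Hypothetical disk: 1,000,000 blocks split into regions of 10,000 blocks.
bad = [12, 15, 420_000]
print(regions_to_rebuild(bad, 10_000))           # [0, 42]
print(rebuild_fraction(bad, 1_000_000, 10_000))  # 0.02
```

In this example three bad blocks fall into two regions, so only 2% of the disk is reconstructed instead of all of it. A whole-disk failure would still touch every region, which is one of the open questions raised above.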

Quantum Dynamic Disk Pool

Quantum has a RAID technology different from traditional RAID, called the Dynamic Disk Pool (DDP). Its core idea is illustrated in the figure below.

[Figure: 2.jpg, http://www.bkjia.com/uploads/allimg/131228/00362310E-1.jpg]

As shown in the figure, each color marks the blocks that make up one small RAID. This is similar to the chinacache approach, that is, a traditional RAID is built from a subset of blocks rather than from whole disks. The idea rests on the same new disk failure model. The difference is that each small RAID has its own Spare D-Piece, and the Spare D-Pieces of different small RAIDs are placed on different disks, that is, spare blocks are distributed. The advantage is obvious: if several small RAIDs are reconstructed at the same time, the reconstructed data is written to many disks instead of being concentrated on a few fixed disks. This solution therefore combines local reconstruction with distributed spare blocks to maximize reconstruction performance and thus reduce reconstruction time.
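The effect of distributed spare pieces can be illustrated with a small simulation. It is a sketch under assumed parameters (12 disks, 200 small RAIDs of width 4, random placement); the layout function and names are hypothetical and do not reproduce the vendor's actual placement rules.

```python
import random
from collections import Counter


def build_groups(num_disks, num_groups, width, seed=0):
    """Place each small RAID on `width` disks plus one spare piece on another disk."""
    rng = random.Random(seed)
    groups = []
    for _ in range(num_groups):
        chosen = rng.sample(range(num_disks), width + 1)  # distinct disks
        groups.append({"members": chosen[:width], "spare": chosen[width]})
    return groups


def rebuild_targets(groups, failed_disk):
    """Count how many reconstructed pieces each surviving disk must absorb."""
    return Counter(g["spare"] for g in groups if failed_disk in g["members"])


groups = build_groups(num_disks=12, num_groups=200, width=4)
targets = rebuild_targets(groups, failed_disk=3)
print("pieces to rebuild:", sum(targets.values()))
print("disks sharing the rebuild writes:", len(targets))
```

With a single dedicated hot spare, every reconstructed piece would be written to one disk. In this sketch the writes are spread over the spare pieces of the affected groups, so the write bandwidth of many disks is used in parallel.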

The problem with this solution is that once a disk is removed from the RAID and a new disk is added, the spare space soon becomes concentrated on the new disk, so the distributed spare space no longer exists after a disk replacement. In the figure below, the upper diagram shows the data distribution before the replacement, and the lower diagram shows the situation after the data has been reconstructed and the new disk has been added.

[Figure: 3.jpg, http://www.bkjia.com/uploads/allimg/131228/0036234459-2.jpg]

To solve this problem, some solutions perform data migration to restore the benefits of distributed spare space, but this affects system performance.

NetApp DDP

NetApp's new RAID technology is also called the Dynamic Disk Pool (DDP) and is similar to Quantum's in many respects. Judging from NetApp's technical white paper, however, the two still differ considerably. Quantum's DDP has a relatively fixed data layout, which is consistent with the chinacache solution. According to NetApp's technical description, its on-disk data layout is not fixed but flexible. The figure below shows how NetApp DDP restores data when an entire disk is damaged.

[Figure: 4.jpg, http://www.bkjia.com/uploads/allimg/131228/0036233U5-3.jpg]

From this figure, we can see that once a disk is damaged, the data on the failed disk is reconstructed onto other healthy disks. Choosing these target disks must follow some algorithm; reportedly a pseudo-random disk selection algorithm is used. From the brief description in the white paper, my understanding is that NetApp has probably adopted a technique similar to storage virtualization. This makes the entire architecture very flexible, and data reliability should be much higher than with existing RAID.
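The white paper does not spell out the selection algorithm, so the following is only one plausible sketch of pseudo-random disk selection, not NetApp's implementation. It assumes a hash of a piece identifier picks the target among healthy disks that do not already hold a piece of the same stripe; `pick_target` and the identifier format are hypothetical.

```python
import hashlib


def pick_target(piece_id: str, healthy_disks, stripe_disks):
    """Deterministically pick a healthy disk that is not already in the stripe."""
    candidates = sorted(d for d in healthy_disks if d not in stripe_disks)
    if not candidates:
        raise ValueError("no eligible disk for reconstruction")
    # Hash the piece identifier so the choice looks random but is repeatable.
    digest = hashlib.sha256(piece_id.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return candidates[index]


# Disks 1, 2 and 3 already hold pieces of this stripe, so they are excluded.
target = pick_target("stripe7-piece2", healthy_disks=range(10), stripe_disks={1, 2, 3})
print("rebuild onto disk", target)
```

Excluding disks that already belong to the stripe keeps the redundancy intact after the rebuild, while hashing spreads the rebuild targets of different pieces across the pool.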

IBM Declustered RAID

IBM's Declustered RAID is a non-traditional RAID technology used in its GPFS. Declustered RAID distributes data across all the disks. As the figure below shows, RAID is built on disk blocks rather than on entire disks, which is an important difference from traditional RAID. In addition, data is placed on the disks in a distributed, random fashion; I do not know whether virtualization technology is used for this. If so, it would manage the physical data blocks on the disks, obtain physical blocks through resource allocation, and build RAID on top of those blocks.

[Figure: 5.jpg, http://www.bkjia.com/uploads/allimg/131228/00362351G-4.jpg]

IBM's Declustered RAID also distributes spare space across the disks, which improves data reconstruction performance and reduces data reconstruction time.
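Why a declustered layout shortens reconstruction can be seen by counting how many blocks each surviving disk has to read after a failure. The sketch below assumes random stripe placement and made-up parameters (stripes of width 5 on either 5 or 20 disks); it does not reproduce the actual GPFS placement algorithm.

```python
import random
from collections import Counter


def place_stripes(num_disks, num_stripes, width, seed=1):
    """Place every stripe on `width` distinct, randomly chosen disks."""
    rng = random.Random(seed)
    return [rng.sample(range(num_disks), width) for _ in range(num_stripes)]


def rebuild_reads(stripes, failed):
    """Blocks each surviving disk must read to rebuild the failed disk."""
    reads = Counter()
    for stripe in stripes:
        if failed in stripe:
            reads.update(d for d in stripe if d != failed)
    return reads


# Traditional RAID: 5 disks, every stripe spans all of them.
traditional = rebuild_reads(place_stripes(5, 1000, 5), failed=0)
# Declustered: the same stripe width spread over 20 disks.
declustered = rebuild_reads(place_stripes(20, 1000, 5), failed=0)

print("traditional, max reads per disk:", max(traditional.values()))
print("declustered, max reads per disk:", max(declustered.values()))
```

In the traditional case each of the four survivors reads one block for every stripe. In the declustered case only the stripes that touched the failed disk are involved and their reads are shared among 19 survivors, so each disk does far less work.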

Panasas Object Store

The core idea of Panasas's Object Store is to implement RAID semantics at the file system layer. This method computes parity information over several storage objects in the file system, such as blocks, and then writes these storage objects to different storage devices, which can be ordinary disks. If one of the storage devices fails, the lost data can be restored from the redundant information of the storage objects. This completely avoids reconstructing an entire disk. Moreover, because RAID is implemented at the file system level, in the worst case only the files themselves need to be restored, not the entire disk. This is a data-aware reconstruction method, which reduces data reconstruction time.
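Object-level redundancy can be illustrated with simple XOR parity, the same principle RAID 5 uses at the block level. This is a minimal sketch assuming equal-length objects and single-failure tolerance; the function names are hypothetical and the actual Panasas encoding is not described in this article.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("objects must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(objects):
    """Compute the parity object for a list of data objects."""
    return reduce(xor_bytes, objects)


def recover(surviving, parity):
    """Rebuild the single missing object from the survivors and the parity."""
    return reduce(xor_bytes, surviving, parity)


# Three objects of one file, each stored on a different device.
objects = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(objects)

# The device holding objects[1] fails; rebuild only that object.
restored = recover([objects[0], objects[2]], parity)
print(restored)  # b'BBBB'
```

Because parity is computed per file rather than per disk, only objects that actually belong to files are ever reconstructed, and unused disk space is never touched.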

Summary

The sections above analyzed several noteworthy new RAID technologies. They share one feature: RAID is built at the block level rather than on entire disks. That disks mostly fail locally has become an industry consensus, and these technologies optimize the RAID architecture on that basis. Where will RAID technology go in the future? I personally think that DDP-style designs, such as IBM's Declustered RAID and NetApp's DDP, will be an important direction. Implementing RAID in the file system on top of an object store is also a trend. The two approaches seem far apart, but from a technical perspective they are essentially the same. From an application point of view, file system RAID and DDP complement each other well.

 

This article is from the "Storage path" blog. For more information, contact the author!
