A new and scalable way to protect data: erasure coding

Source: Internet
Author: User
Tags: erasure coding

I. Overview

Earlier storage systems generally relied on RAID to protect data: when a hard drive in the array failed, the lost data could be rebuilt from the mirror or parity information that RAID maintains. With the arrival of massive data volumes, however, RAID finds it increasingly hard to do its job. Take a 2 TB hard disk as the storage medium: when a drive fails, rebuilding it takes about 4 hours, and that figure assumes recovery runs at the highest priority. In practice this rarely happens. The rebuild usually runs at a lower priority, in the background, when computing resources are idle, so the rebuild time keeps stretching, and a recovery that lasts 1-2 weeks is not surprising. If another hard drive fails while the data is being rebuilt, the data can be lost permanently. RAID 5 tolerates at most one failed drive in the array and RAID 6 tolerates two simultaneous failures, but in the age of massive data it is not impossible for more drives than that to fail at the same time. How, then, can data, one of an enterprise's most important resources, be kept safe?
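The rebuild-time argument above can be checked with a little arithmetic. The sketch below is only an illustration: the 2 TB capacity and the 4-hour figure come from the text, while the function name `rebuild_hours` and the background-priority shares (10% and 2% of full speed) are assumptions chosen for the example.

```python
# Rough rebuild-time arithmetic for a single failed drive.
# The 2 TB capacity and the 4-hour figure come from the text;
# the throttled shares below are hypothetical.

def rebuild_hours(capacity_bytes: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to rewrite a whole drive at a sustained throughput."""
    return capacity_bytes / throughput_bytes_per_s / 3600

CAPACITY = 2e12  # 2 TB (decimal)

# Sustained throughput implied by "2 TB in about 4 hours".
full_speed = CAPACITY / (4 * 3600)
print(f"implied full-priority throughput: {full_speed / 1e6:.0f} MB/s")

# What happens when the rebuild only gets a share of that throughput.
for share in (1.0, 0.1, 0.02):
    hours = rebuild_hours(CAPACITY, full_speed * share)
    print(f"{share:>5.0%} of full speed -> {hours:6.0f} h ({hours / 24:.1f} days)")
```

Under these assumptions, a rebuild that receives only a few percent of the drive's throughput lands in the range of a week or more, which is consistent with the 1-2 weeks mentioned above.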

At the IDF conference, Intel presented a new, scalable approach to data protection: erasure coding. It works by splitting and encoding the large blocks of data that the storage system receives, then splitting and encoding the resulting pieces again until they reach a suitable chunk size, so that each data block is dispersed across multiple chunks; redundancy check information is then computed for them. The non-duplicate chunks and their encodings are written to the storage system. How this compares with traditional RAID data protection is shown in the figure:

[Figure: erasure coding compared with traditional RAID data protection]
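The split-then-add-redundancy idea can be sketched with the simplest possible erasure code, a single XOR parity chunk. This is not Intel's scheme: it performs one split rather than the repeated splitting described above, it tolerates only one lost chunk, and all function names (`split_into_chunks`, `encode`, `recover`) are hypothetical.

```python
from functools import reduce

def split_into_chunks(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size chunks, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes, k: int) -> list[bytes]:
    """Return k data chunks followed by one XOR parity chunk."""
    chunks = split_into_chunks(data, k)
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]

def recover(shards: list) -> list[bytes]:
    """Rebuild at most one missing shard (marked None) by XOR-ing the rest."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost shard")
    shards = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        shards[missing[0]] = reduce(xor_bytes, present)
    return shards

payload = b"erasure coding splits data into chunks"
shards = encode(payload, k=4)

damaged = list(shards)
damaged[2] = None  # simulate one lost chunk
restored = recover(damaged)

original = b"".join(restored[:4])[:len(payload)]  # drop parity and padding
print(original == payload)
```

A single parity chunk corresponds to the one-failure tolerance of RAID 5; tolerating more failures requires a stronger code with several parity chunks.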

      

Erasure coding extends the data protection architecture from RAID 5/6 to "RAID k", where k is the number of failures that can be tolerated without data loss. For RAID 5, k = 1; for RAID 6, k = 2; for erasure coding, k = n, a number that can be chosen as needed. For example, in an array of 16 hard disks protected by erasure coding, the lost data can still be recovered even if 6 disks fail.
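The 16-disk example can be made concrete with a toy Reed-Solomon style code: 10 data symbols plus 6 parity symbols, any 10 of which are enough to recover the data. The sketch below is an illustration under stated assumptions, not the scheme Intel described: it works over the prime field GF(257) for readability (so a parity symbol may take the value 256 and not fit in a byte, whereas production codes typically use GF(2^8)), the 10 + 6 split is an assumption, and the names `lagrange_eval`, `encode`, and `decode` are hypothetical. It needs Python 3.9 or later.

```python
P = 257  # prime just above 255, so every byte value is a field element

def lagrange_eval(points: list[tuple[int, int]], x: int) -> int:
    """Evaluate at x the unique polynomial through `points`, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse of den
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data: list[int], n: int) -> list[int]:
    """Return n symbols: the k data symbols followed by n - k parity symbols."""
    points = list(enumerate(data))  # data[i] is the polynomial's value at x = i
    return data + [lagrange_eval(points, x) for x in range(len(data), n)]

def decode(shards: list, k: int) -> list[int]:
    """Recover the k data symbols from any k surviving shards (None = lost)."""
    alive = [(x, y) for x, y in enumerate(shards) if y is not None]
    if len(alive) < k:
        raise ValueError("too many shards lost")
    points = alive[:k]
    return [lagrange_eval(points, x) for x in range(k)]

data = list(b"ten bytes!")     # k = 10 data symbols, one per "disk"
shards = encode(data, n=16)    # 16 "disks" in total

damaged = list(shards)
for lost in (0, 3, 4, 9, 11, 15):  # six simultaneous failures
    damaged[lost] = None

print(decode(damaged, k=10) == data)
```

With these parameters a seventh failure leaves fewer than 10 shards, and `decode` raises an error, which mirrors the statement that k bounds the number of failures that can be survived.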

As you can see, erasure coding currently targets mainly large data blocks, and Intel has also applied it in its recommended large object storage system. In practical tests, erasure coding showed clear advantages over traditional RAID in the aspects measured; the concrete results are shown in the figure:

[Figure: test results comparing erasure coding with traditional RAID]

    

The applications of erasure coding do not stop there. In the technical session on erasure coding, the engineers presenting also described how it can be applied flexibly across multiple data centers, as shown in the figure:

[Figure: erasure coding applied across multiple data centers]

      

In general, traditional erasure coding has a relatively large impact on performance, especially IOPS and latency, so its current use is mainly confined to archiving, cloud storage, and other cold-data scenarios.
