Data storage technology: COS multiplies storage capacity (Part 1)

Source: Internet
Author: User
Tags comparison cos backup

Through global compression technology, COS significantly reduces the storage space that data occupies, thereby improving the storage efficiency of physical disks.

Data Domain's Capacity Optimized Storage (COS) technology significantly reduces the capacity required on backup media by splitting data into pieces and comparing their content, following the same principle as compression.

COS cuts data into segments, analyzes the characteristics of each segment with Data Domain's patented algorithm, compares them with existing data, and stores only new or changed data, which greatly reduces the storage space required.

The principle of COS

COS is a disk-based backup storage solution whose key is Data Domain's so-called "Global Compression" technology. The principle of global compression is similar to traditional lossless data compression: it targets the repeated parts of the data, removes the redundancy caused by that repetition with a specific algorithm, and represents content equivalent to the original information in less capacity.

General data compression eliminates redundancy in a discrete manner: redundant data is removed only within the data handled by a single compression pass. For example, if the data on a hard disk is backed up with compression every day, each pass removes the redundancy inside that day's data, but because the data in a typical storage environment changes little from day to day, the contents of any two compressed backups still largely duplicate each other.
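The limitation described above can be illustrated with a minimal Python sketch. It uses the standard-library `zlib` module as a stand-in for "general compression", and the two daily backup payloads are made-up data chosen only for illustration.

```python
import os
import zlib

# Hypothetical data: day 2 differs from day 1 only by a small appended change.
# Random bytes are used so there is almost no redundancy *within* one backup.
day1 = os.urandom(64 * 1024)
day2 = day1 + b"small change on day 2"

# Compress each daily backup independently, as a conventional backup job would.
c1 = zlib.compress(day1)
c2 = zlib.compress(day2)

# Each pass only sees its own input, so the shared 64 KiB is stored twice.
stored_separately = len(c1) + len(c2)
print("bytes stored for two independent passes:", stored_separately)
print("bytes of unique data actually present:  ", len(day2))
```

The first figure comes out at roughly twice the second, because neither compression pass can see that the other backup already holds the same content.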

What is special about global compression is that its effect extends across the whole body of stored data, whereas traditional compression works only on the data in a single compression pass. Its operational steps are as follows:

(1) Decomposing the data and finding the characteristic values

First, the data is decomposed into segments of 4 to 16 KB, and a characteristic value is computed for each segment by a special algorithm.

(2) Analyzing and comparing the characteristic values

Comparing the characteristic values determines which segments are duplicates and which are unique. Duplicate segments are removed, and the segments that remain are the basic elements. In addition, an index is generated that records how the original data is composed of segments.

(3) Reducing the data to basic elements

After the duplicate segments are removed, the remaining segments are stored as the basic elements. Using the index of the data structure together with the basic elements, the system can restore the compressed data to the original data.
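The three steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the article does not disclose Data Domain's algorithm, so SHA-256 stands in for the characteristic-value computation, segments are a fixed 8 KB, and the names `fingerprint`, `compress`, and `restore` are hypothetical.

```python
import hashlib

SEGMENT_SIZE = 8 * 1024  # assumption: fixed 8 KB segments (the article gives 4 to 16 KB)

def fingerprint(segment: bytes) -> str:
    # Assumption: SHA-256 stands in for the proprietary characteristic-value algorithm.
    return hashlib.sha256(segment).hexdigest()

def compress(data: bytes):
    """Split data into segments, keep one copy of each distinct segment,
    and record an index describing how to rebuild the original."""
    elements = {}  # characteristic value -> segment bytes (the "basic elements")
    index = []     # ordered characteristic values (structure of the original data)
    for offset in range(0, len(data), SEGMENT_SIZE):
        segment = data[offset:offset + SEGMENT_SIZE]
        fp = fingerprint(segment)
        elements.setdefault(fp, segment)  # a duplicate segment is not stored again
        index.append(fp)
    return elements, index

def restore(elements, index) -> bytes:
    # Follow the index and concatenate the referenced basic elements.
    return b"".join(elements[fp] for fp in index)

block_a = b"A" * SEGMENT_SIZE
block_b = b"B" * SEGMENT_SIZE
original = block_a + block_b + block_a + block_a  # block_a occurs three times

elements, index = compress(original)
print(len(index), "segments referenced,", len(elements), "stored")
assert restore(elements, index) == original
```

Here four segments are referenced by the index but only two are stored, and the original data is rebuilt exactly from the index and the basic elements.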

The principle of the preceding steps is similar to general compression; the key point of global compression lies in the second and later compression passes. Data compressed the first time is stored in a specific way on the hard disk. From the second pass onward, the system compares the characteristic values derived from the new data not only with one another but also with the characteristic values of the data stored on the hard disk by earlier passes, and keeps only the data that is new or different. To improve the compression effect, the system adjusts the segment size (8 KB on average) to obtain the best comparison results.
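The cross-pass comparison can be sketched as a store whose table of characteristic values persists between backups. This is a simplified illustration, not Data Domain's implementation: `SegmentStore` is a hypothetical class, SHA-256 is an assumed stand-in for the characteristic-value algorithm, and segments are a fixed 8 KB, so the segment-size adjustment mentioned above is not modeled.

```python
import hashlib

class SegmentStore:
    """Hypothetical store that keeps segments from all earlier backup passes."""

    def __init__(self, segment_size: int = 8 * 1024):
        self.segment_size = segment_size
        self.segments = {}  # characteristic value -> segment, kept across passes

    def backup(self, data: bytes):
        """Store only segments not seen in this or any earlier pass.
        Returns the index for this backup and the number of new bytes written."""
        index, new_bytes = [], 0
        for offset in range(0, len(data), self.segment_size):
            segment = data[offset:offset + self.segment_size]
            fp = hashlib.sha256(segment).digest()
            if fp not in self.segments:  # compared against earlier passes too
                self.segments[fp] = segment
                new_bytes += len(segment)
            index.append(fp)
        return index, new_bytes

    def restore(self, index) -> bytes:
        return b"".join(self.segments[fp] for fp in index)

size = 8 * 1024
# Made-up data: ten distinct segments on day 1, one of them changed on day 2.
day1 = b"".join(bytes([i]) * size for i in range(10))
day2 = day1[:3 * size] + bytes([200]) * size + day1[4 * size:]

store = SegmentStore(size)
index_day1, new_day1 = store.backup(day1)
index_day2, new_day2 = store.backup(day2)
print("new bytes written on day 1:", new_day1)
print("new bytes written on day 2:", new_day2)
```

The second pass writes only the single changed segment, yet both backups remain fully restorable from their own indexes.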

Because data changes little in a typical enterprise environment, each backup contains much data that duplicates earlier backups. Global compression stores only the changed data, so, measured against the original volume of uncompressed data, the compression ratio rises with every additional backup pass.
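A rough back-of-the-envelope model shows why the ratio grows. It assumes, purely for illustration, that every backup is a full copy of the same size, that a fixed fraction of the data changes between backups, and that the first backup is stored whole; the 5% change rate and the function name `cumulative_ratio` are hypothetical, and the model ignores duplicates inside a single backup.

```python
def cumulative_ratio(backups: int, change_rate: float) -> float:
    """Ratio of total uncompressed backup volume to the volume actually stored.

    Assumptions: each backup is one unit of data, the first backup is stored
    whole, and each later backup adds only `change_rate` units of new data.
    """
    logical = backups * 1.0                        # what was backed up in total
    stored = 1.0 + (backups - 1) * change_rate     # what had to be written
    return logical / stored

# Hypothetical 5% change between consecutive backups.
for n in (1, 5, 10, 30):
    print(f"after {n:2d} backups: {cumulative_ratio(n, 0.05):.2f} : 1")
```

Under these assumptions the ratio climbs with each pass and approaches 1 divided by the change rate as the number of backups grows.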
