Six methods for data reduction in primary storage (1)

Source: Internet
Author: User

Data reduction has become a standard feature of many backup and archiving products and is becoming increasingly popular on primary storage. The driving force behind this trend is quantifiable cost savings: buying fewer disks, lower annual support costs, and lower operating costs for storage management. Data reduction also has a welcome effect on storage performance: by keeping inactive data off expensive high-performance storage, it can greatly improve the performance of the storage system and the applications that depend on it.

According to research by the Storage Networking Industry Association (SNIA), 80% of the files on primary storage in a typical enterprise have not been accessed in the last 30 days. The same report states that inactive data is growing four times as fast as active data. Given these facts, it is not surprising that data reduction technology has started to enter the primary storage field.

However, unlike backup and archiving systems, a primary storage system cannot tolerate any impact on performance or reliability; these are its most important attributes. As a result, data reduction has evolved differently on primary storage than on backup and archiving systems. In backup and archiving systems, deduplication and compression are the main means of data reduction. Primary storage systems are more sensitive to overhead, so they favor techniques that do not affect performance the way deduplication and compression can. Six data reduction techniques are being applied to primary storage: selecting an appropriate RAID level, thin provisioning (automatic streamlined configuration), efficient cloning, automated storage tiering, deduplication, and compression.

Select an appropriate RAID level

It may seem strange to put "select an appropriate RAID level" at the top of a list of data reduction techniques. Unlike the other methods, it is simply an option available on every storage system, but it has a great impact on disk requirements, performance, and reliability. Reliability aside, RAID 0 (block-level striping across all disks, with no parity or mirroring) is the most cost-effective and highest-performing option; however, a single disk failure causes the loss of all data in the RAID group, which makes it unsuitable for the data center. At the other extreme, RAID 1 (mirroring, with no parity or striping) and RAID 10 (striping across mirrored disks) combine high performance with high reliability but require twice the disk capacity, which is the opposite of data reduction. RAID 5 (block-level striping with distributed parity) needs only one additional disk and has been the best compromise in recent years, but as disk capacities grow, rebuilds take longer and longer. The risk of losing a second disk while a RAID group rebuilds after a single disk failure has risen to an uncomfortable, even unacceptable, level. Storage vendors have therefore adopted RAID 6, which adds a second parity disk to RAID 5 so that the group can withstand two disk failures without data loss, although with a performance impact that varies by implementation. When purchasing a new storage system, both RAID 6 support and RAID 6 performance figures should be considered.
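The capacity trade-off between these RAID levels can be made concrete with a little arithmetic. The following is a minimal sketch, not a vendor formula: the function name `usable_fraction` is hypothetical, RAID 1 and RAID 10 are assumed to use two-way mirrors, and all disks in a group are assumed to be the same size.

```python
def usable_fraction(level: str, disks: int) -> float:
    """Return the fraction of raw capacity left for data in one RAID group.

    Assumes equal-sized disks and two-way mirroring for RAID 1 / RAID 10.
    """
    minimum = {"RAID0": 2, "RAID1": 2, "RAID10": 4, "RAID5": 3, "RAID6": 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum[level]:
        raise ValueError(f"{level} needs at least {minimum[level]} disks")

    if level == "RAID0":
        return 1.0                      # striping only, no redundancy
    if level in ("RAID1", "RAID10"):
        return 0.5                      # every block is stored twice
    if level == "RAID5":
        return (disks - 1) / disks      # one disk's worth of parity
    return (disks - 2) / disks          # RAID 6: two disks' worth of parity


if __name__ == "__main__":
    # Compare the levels for a hypothetical 8-disk group.
    for level in ("RAID0", "RAID10", "RAID5", "RAID6"):
        print(f"{level:6s} usable: {usable_fraction(level, 8):.1%}")
```

For an 8-disk group, moving from RAID 5 to RAID 6 costs one more disk of capacity, while RAID 10 gives up half of the raw capacity regardless of group size.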

"Unlike most of our competitors, we can use RAID 6 technology with only 5% additional overhead," said Larry Freeman, a senior storage technology expert at NetApp.

Thin provisioning

Until recently, there was no real alternative to allocating storage up front, so storage utilization was low. It was common to find hundreds of gigabytes of allocated but unused storage in a company's data center. "Before we used Conway's disk arrays with thin provisioning, we relied on users to help us estimate storage needs, and we added 20% to 100% to each user's estimate, depending on the application," said Brandon Jackson, CIO in Galston, North Carolina, describing the unscientific and wasteful process that many companies use to ensure adequate storage capacity.

Thin provisioning (automatic streamlined configuration) lets the storage system allocate physical capacity only as it is needed, which ends the waste of storage resources through over-allocation. Storage is allocated to a volume on demand. For example, a thin-provisioned volume can be presented at its full nominal size even though only 10 GB of physical storage backs it. Thin provisioning is transparent to users, who see a volume of the full nominal size. The savings can be substantial, with storage utilization exceeding 90%.
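The allocate-on-demand idea can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not how any particular array implements it: the class names `ThinPool` and `ThinVolume` are hypothetical, capacity is counted in fixed-size blocks, and a block that has never been written reads back as zeros.

```python
ZERO_BLOCK = b"\x00"  # stand-in for an unwritten, zero-filled block


class ThinPool:
    """Shared physical capacity, counted in blocks."""

    def __init__(self, physical_blocks: int) -> None:
        self.physical_blocks = physical_blocks
        self.used_blocks = 0

    def allocate(self) -> None:
        if self.used_blocks >= self.physical_blocks:
            raise RuntimeError("thin pool is out of physical capacity")
        self.used_blocks += 1


class ThinVolume:
    """Volume that advertises virtual_blocks but consumes pool space on first write."""

    def __init__(self, pool: ThinPool, virtual_blocks: int) -> None:
        self.pool = pool
        self.virtual_blocks = virtual_blocks
        self.blocks = {}  # virtual block number -> data

    def _check(self, block_no: int) -> None:
        if not 0 <= block_no < self.virtual_blocks:
            raise IndexError("block number outside the volume")

    def write(self, block_no: int, data: bytes) -> None:
        self._check(block_no)
        if block_no not in self.blocks:
            self.pool.allocate()  # physical space is claimed only now
        self.blocks[block_no] = data

    def read(self, block_no: int) -> bytes:
        self._check(block_no)
        return self.blocks.get(block_no, ZERO_BLOCK)


if __name__ == "__main__":
    pool = ThinPool(physical_blocks=10)
    volume = ThinVolume(pool, virtual_blocks=100)  # users see 100 blocks
    volume.write(5, b"data")
    print(volume.virtual_blocks, pool.used_blocks)
```

The user-visible size (`virtual_blocks`) never changes; only `used_blocks` in the pool grows as data is actually written.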

The number of vendors that support thin provisioning is growing rapidly, and it has become one of the key criteria for selecting a storage system. Remember, however, that not all implementations are the same. Some systems require a separate region to be set aside for thin provisioning, while others can use all capacity for thin provisioning without a special reservation. Systems also differ in whether they can convert a "thick" volume to a "thin" one, in how they reclaim unused storage, and in how thin provisioning is licensed. As more storage is thin provisioned, running out of physical storage becomes a common risk. Alarms, notifications, and storage analytics are therefore essential, and they matter more in a thin-provisioned environment than in a traditional one.
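A capacity alarm of the kind described here can be as simple as comparing pool utilization against thresholds. The sketch below is illustrative only: the function name `pool_status` and the 80% and 95% thresholds are assumptions, not values from any vendor or from the article.

```python
def pool_status(physical_gb: float, used_gb: float, provisioned_gb: float,
                warn_at: float = 0.80, critical_at: float = 0.95) -> dict:
    """Summarize a thin pool and pick an alert level.

    warn_at and critical_at are hypothetical thresholds on physical utilization.
    """
    if physical_gb <= 0:
        raise ValueError("physical_gb must be positive")

    utilization = used_gb / physical_gb            # how full the pool really is
    oversubscription = provisioned_gb / physical_gb  # promised vs. real capacity

    if utilization >= critical_at:
        level = "critical"
    elif utilization >= warn_at:
        level = "warning"
    else:
        level = "ok"

    return {
        "utilization": utilization,
        "oversubscription": oversubscription,
        "level": level,
    }


if __name__ == "__main__":
    # A pool with 1000 GB of disk, 850 GB written, 3000 GB promised to volumes.
    print(pool_status(physical_gb=1000, used_gb=850, provisioned_gb=3000))
```

The oversubscription ratio shows how much more capacity has been promised than exists, which is why an alert on physical utilization matters more here than in a traditionally provisioned environment.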

Efficient cloning

Cloning creates an identical copy of an existing volume. It is particularly useful in server virtualization, where it is often used to clone virtual machine operating system volumes. The most basic and most common implementation creates a full copy of the source volume, which occupies as much physical storage as the source.

The next step up is the ability to clone thin-provisioned volumes. Some storage systems convert the volume to a traditional (thick) volume during cloning, while others can create a thin clone, in which the source volume and the clone have the same amount of physical storage allocated. "Our Virtual Storage Platform (VSP) is capable of creating a thin clone volume from a thin volume," said Mike Nils, senior product marketing manager in the enterprise platform department of Hitachi Data Systems.

The most efficient form is the thin clone. A thin clone does not copy any data; it is based on the original image and stores only the differences between that image and the clone, which saves a great deal of disk space. In other words, a new clone needs almost no physical disk space, and data is written only where the clone diverges from the source image. NetApp FlexClone and the clone function of the Oracle ZFS Storage Appliance (Sun ZFS Storage 7000 series) are examples of storage features that support thin cloning.
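The difference-only behavior of a thin clone can be sketched in a few lines. This is a simplified illustration, not the mechanism used by FlexClone or ZFS: the class names `BaseImage` and `ThinClone` are hypothetical, and the image is modeled as a list of blocks held in memory.

```python
class BaseImage:
    """Read-only source image shared by all clones."""

    def __init__(self, blocks) -> None:
        self.blocks = list(blocks)


class ThinClone:
    """Clone that stores only the blocks that differ from the base image."""

    def __init__(self, base: BaseImage) -> None:
        self.base = base
        self.delta = {}  # block number -> changed data

    def _check(self, block_no: int) -> None:
        if not 0 <= block_no < len(self.base.blocks):
            raise IndexError("block number outside the image")

    def read(self, block_no: int) -> bytes:
        self._check(block_no)
        if block_no in self.delta:
            return self.delta[block_no]    # the clone's own changed block
        return self.base.blocks[block_no]  # unchanged: served from the base

    def write(self, block_no: int, data: bytes) -> None:
        self._check(block_no)
        self.delta[block_no] = data        # the base image is never modified

    def extra_blocks(self) -> int:
        """Physical blocks this clone consumes beyond the shared base."""
        return len(self.delta)


if __name__ == "__main__":
    base = BaseImage([b"os", b"lib", b"cfg"])
    clone = ThinClone(base)
    clone.write(2, b"cfg-vm1")
    print(clone.read(0), clone.read(2), clone.extra_blocks())
```

A freshly created clone consumes no extra blocks, and each write adds at most one block to the clone's delta, which is why many clones of the same operating system image can share most of their storage.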

