A brief talk on storage deduplication and compression, part three: NetApp's counterattack

Source: Internet
Author: User

Storage deduplication and compression, part three: NetApp's optimizations

Summary: The previous installment reviewed Hitachi's hard-drive compression and EMC's improved design on its old architecture. This installment focuses on how the long-struggling NetApp reworked its own deduplication and compression.

Thank you for your attention and support. Reprints are welcome; please credit the source.

You are welcome to follow "new_storage".



History of NetApp deduplication and compression

NetApp adopted deduplication very early on; its NAS devices have long been able to deduplicate and compress. At that time the global market was focused on HDD storage, and NetApp's choice made sense: a lot of warm and cold data was stored on NetApp systems, so deduplication was a value-added feature. Unfortunately, the feature did not win NetApp much market share. Instead, it was the unified storage architecture that became popular worldwide between 2010 and 2014 and came to be seen as the hallmark of midrange storage, which was quite surprising.

Implementation before NetApp ONTAP 8.3

NetApp's old compression works on compression groups of 32KB. A group is compressed, and if the space saving is less than 25%, that is, the compressed result is still larger than 24KB, compression is abandoned. At best a group can be compressed down to 4KB. Blocks that could not be compressed inline are attempted again by background compression. I/Os smaller than 32KB are skipped by inline compression.
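The decision rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: zlib stands in for whatever codec NetApp actually uses, the rounding of the compressed result up to whole 4KB blocks is assumed, and the names compress_group, GROUP_SIZE and MIN_SAVING are hypothetical.

```python
import os
import zlib

GROUP_SIZE = 32 * 1024   # compression group size given in the text
BLOCK_SIZE = 4 * 1024    # smallest size a group can shrink to
MIN_SAVING = 0.25        # give up when less than 25% is saved


def compress_group(data: bytes):
    """Return (payload, is_compressed, blocks_used) for one inline write.

    zlib is a stand-in codec; the real algorithm is not named in the text.
    """
    if len(data) > GROUP_SIZE:
        raise ValueError("expected at most one 32KB compression group")
    raw_blocks = -(-len(data) // BLOCK_SIZE)  # ceiling division
    if len(data) < GROUP_SIZE:
        # Inline compression skips I/Os smaller than 32KB.
        return data, False, raw_blocks
    packed = zlib.compress(data)
    # Assumption: space is allocated in whole 4KB blocks.
    packed_blocks = -(-len(packed) // BLOCK_SIZE)
    if packed_blocks * BLOCK_SIZE > GROUP_SIZE * (1 - MIN_SAVING):
        # Still larger than 24KB after compression: store uncompressed.
        return data, False, raw_blocks
    return packed, True, packed_blocks


if __name__ == "__main__":
    for label, data in [("zeros", bytes(GROUP_SIZE)),
                        ("random", os.urandom(GROUP_SIZE)),
                        ("short I/O", bytes(8 * 1024))]:
        _, compressed, blocks = compress_group(data)
        print(label, "compressed" if compressed else "raw", blocks, "blocks")
```

Highly compressible data ends up in a single 4KB block, while random data fails the 25% check and keeps all eight blocks.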

At the same time, NetApp supports post-process compression.

As for deduplication, the old NetApp version supports only post-process deduplication. The fingerprint, however, is computed in real time as the data is written to disk, and is then appended to a log file that records fingerprints.

When background deduplication is triggered, the deduplication process starts. It then performs the same steps as traditional deduplication: fingerprint lookup, byte-by-byte comparison of the candidate duplicates, and finally removal of the duplicate data.
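The flow described above (fingerprint logged on the write path, duplicates removed later in the background) can be sketched as follows. This is a toy model, not NetApp's implementation: the Volume class, its fields, the use of SHA-256 as the fingerprint, and the in-memory dictionaries standing in for on-disk structures are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096


class Volume:
    """Toy volume with inline fingerprint logging and background dedup."""

    def __init__(self):
        self.physical = {}         # physical id -> block bytes
        self.refcount = {}         # physical id -> number of logical references
        self.block_map = {}        # logical block number -> physical id
        self.fingerprint_log = []  # (fingerprint, lbn, physical id), appended inline
        self.fingerprint_db = {}   # fingerprint -> physical id of the kept copy
        self._next_id = 0

    def write(self, lbn: int, data: bytes) -> None:
        # Write path: store the block as-is, only compute and log the fingerprint.
        pid = self._next_id
        self._next_id += 1
        self.physical[pid] = data
        self.refcount[pid] = 1
        old = self.block_map.get(lbn)
        self.block_map[lbn] = pid
        if old is not None:
            self._release(old)
        self.fingerprint_log.append((hashlib.sha256(data).digest(), lbn, pid))

    def read(self, lbn: int) -> bytes:
        return self.physical[self.block_map[lbn]]

    def _release(self, pid: int) -> None:
        self.refcount[pid] -= 1
        if self.refcount[pid] == 0:
            del self.refcount[pid]
            del self.physical[pid]

    def run_dedup(self) -> int:
        """Background pass; returns the number of physical blocks freed."""
        freed = 0
        for fp, lbn, pid in self.fingerprint_log:
            if self.block_map.get(lbn) != pid:
                continue  # block was overwritten after it was logged
            keep = self.fingerprint_db.get(fp)
            if keep is None or keep not in self.physical:
                self.fingerprint_db[fp] = pid  # first (or only surviving) copy
                continue
            # Fingerprint match: confirm with a byte-by-byte comparison.
            if keep != pid and self.physical[keep] == self.physical[pid]:
                self.block_map[lbn] = keep
                self.refcount[keep] += 1
                self._release(pid)
                freed += 1
        self.fingerprint_log.clear()
        return freed


if __name__ == "__main__":
    vol = Volume()
    vol.write(0, b"A" * BLOCK_SIZE)
    vol.write(1, b"B" * BLOCK_SIZE)
    vol.write(2, b"A" * BLOCK_SIZE)
    print("freed blocks:", vol.run_dedup())
```

Note that every duplicate is first written in full and only reclaimed later, which is why the text goes on to question this approach for SSDs.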

Older versions of NetApp are quite restrictive:

1, compression must be used together with deduplication; compression cannot be used on its own

2, inline compression can be turned on only after compression and deduplication are in use

3, within the same storage system, at most 8 volumes can have deduplication or compression turned on at the same time

As these three limitations show, NetApp was very conservative about its compression and deduplication capabilities. This is also noted in the white paper.

NetApp points out that when deduplication runs on 1 LUN, the performance impact is about 15%, but when 8 LUNs have deduplication turned on simultaneously the impact is 15%~50%.

For inline compression, if the system load is below 50%, turning on compression raises CPU utilization by about 20%. For systems with more than 50% business load, NetApp does not recommend using inline compression.

Whether the post-process compression and post-process deduplication mechanisms suit SSD-based all-flash storage remains to be seen; it is unclear whether the extra writes they generate would wear out SSDs quickly.

Block-level optimization in the new version

The main optimization in NetApp's new version is in compression. It discards the original compression architecture, because the original compression was designed mainly for the file system: the compression domain is relatively large and requires multiple data blocks to be merged for compression. Merged compression means that each block has to wait for several other blocks to be merged with it before it can be compressed.

NetApp's SAN storage block size is 4KB, so the new version compresses in units of 4KB, shrinking each block into a small chunk of 1KB~4KB. There is still a pre-check on the compression result: a block is stored compressed only if the space saving reaches 50%; otherwise it is left uncompressed. This change makes compression more efficient, but the compression ratio is correspondingly lower.

To improve the compression ratio, NetApp added a new feature in this version, called compaction. It packs multiple small chunks into one 4KB block for storage. This operation is very effective: since the metadata of each block stores only an address and an offset, compaction adds no metadata overhead; only the existing metadata needs to be updated.
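A minimal sketch of the packing idea follows, assuming a simple first-fit placement. The function names compact and read_chunk are hypothetical, and the per-chunk metadata here carries a length in addition to the address and offset mentioned above so that the toy example can read chunks back; how NetApp actually lays out chunks is not described in the text.

```python
BLOCK_SIZE = 4096


def compact(chunks):
    """Pack small chunks into 4KB physical blocks using first fit.

    Returns (blocks, locations), where locations[i] is
    (block_index, offset, length) for chunks[i].
    """
    blocks = []      # each entry is a bytearray of at most BLOCK_SIZE bytes
    locations = []
    for chunk in chunks:
        if len(chunk) > BLOCK_SIZE:
            raise ValueError("chunk larger than one physical block")
        # Look for the first physical block with enough free space.
        for index, block in enumerate(blocks):
            if len(block) + len(chunk) <= BLOCK_SIZE:
                break
        else:
            blocks.append(bytearray())
            index = len(blocks) - 1
        block = blocks[index]
        locations.append((index, len(block), len(chunk)))
        block.extend(chunk)
    return blocks, locations


def read_chunk(blocks, location):
    """Fetch one chunk back using only its (block, offset, length) metadata."""
    index, offset, length = location
    return bytes(blocks[index][offset:offset + length])


if __name__ == "__main__":
    chunks = [b"a" * 1000, b"b" * 3000, b"c" * 2000, b"d" * 96]
    blocks, locations = compact(chunks)
    print(len(chunks), "chunks stored in", len(blocks), "physical blocks")
    print(locations)
```

Without compaction each of the four chunks in the usage example would occupy its own 4KB block; with it they share two.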

Such a simple operation eliminates a lot of wasted space and also improves resource utilization. However, because the compression domain is so small, NetApp's compression mechanism itself achieves a lower compression ratio; in effect NetApp gave up compression ratio in exchange for performance. What is worth borrowing is the compaction step. Compaction also handles blocks that were not compressed before, so it carries a certain performance cost, which is around 5% according to NetApp.

On the deduplication side, NetApp's major changes are mainly to the storage mechanism for fingerprint metadata, and to supporting both inline and post-process modes.

The deduplication granularity is still 4KB, as before, but the storage mechanism has changed.

1, All fingerprint data is stored only in the cache and is not persisted, so there is no need to consider mirroring; during restart and upgrade the fingerprints are discarded. The reasoning is that it is better to sacrifice deduplication ratio than to sacrifice performance. This drastically reduces the overhead of moving fingerprints between memory and disk, of fingerprint persistence, of mirroring, and so on. So for domestic vendors that want to implement deduplication, my suggestion is: fingerprints do not need to be persisted and do not need to be mirrored. For an upgrade or restart it is advisable to save a copy to disk, but there is no need to write fingerprints to disk continuously; saving them is needed only for operations such as power-down and restart. I do not think it is a good idea that NetApp leaves fingerprints unprotected across restarts, upgrades and power-downs.

2, Fingerprints are evicted according to how hot they are: once the fingerprint space is full, the simplest LRU algorithm is used for eviction. This keeps fingerprint lookup efficient and guarantees that all fingerprints stay in memory. This is also worth learning from.
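The two points above (memory-only fingerprints, LRU eviction) map naturally onto a small data structure. The sketch below is an illustration only; the FingerprintCache class and its method names are hypothetical, and Python's OrderedDict is used simply because it makes LRU ordering easy to express.

```python
from collections import OrderedDict


class FingerprintCache:
    """In-memory fingerprint table with LRU eviction and no persistence."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._entries = OrderedDict()  # fingerprint -> block address

    def lookup(self, fingerprint):
        """Return the block address for a fingerprint, or None if unknown."""
        address = self._entries.get(fingerprint)
        if address is not None:
            # A hit makes the fingerprint the most recently used entry.
            self._entries.move_to_end(fingerprint)
        return address

    def insert(self, fingerprint, address) -> None:
        self._entries[fingerprint] = address
        self._entries.move_to_end(fingerprint)
        if len(self._entries) > self.capacity:
            # Table is full: evict the least recently used fingerprint.
            self._entries.popitem(last=False)

    def reset(self) -> None:
        """Model a restart or upgrade: all fingerprints are simply dropped."""
        self._entries.clear()


if __name__ == "__main__":
    cache = FingerprintCache(capacity=2)
    cache.insert(b"fp1", 10)
    cache.insert(b"fp2", 20)
    cache.lookup(b"fp1")        # touch fp1 so that fp2 becomes the coldest
    cache.insert(b"fp3", 30)    # evicts fp2
    print(cache.lookup(b"fp2"), cache.lookup(b"fp1"), cache.lookup(b"fp3"))
```

Losing an entry, whether through eviction or a restart, only means that a later duplicate of that block is stored again; it never affects correctness, which is why the table can live without persistence or mirroring.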

Of course, both of these fingerprint changes require some decoupling to be done in advance. Many vendors store the reference count and the fingerprint together in their deduplication implementation; the two need to be decoupled before this approach can be implemented well.

This also prompts us to rethink future data structure design: data should be separated according to its persistence level rather than merged according to how closely it is related. Keep your architecture flexible at all times.

Of course, NetApp could actually do a little better. For example, it could cache the hottest part of the fingerprint table in an L1 cache and change that data structure to a hash, so the impact on performance would be smaller; the rest of the fingerprints would be placed in an L2 cache and stored in a B-tree to save space. This would make deduplication more flexible: when business load is heavy, lookups are done only in the L1 cache.
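The two-tier idea suggested above could look roughly like the sketch below. It is a thought experiment rather than anything NetApp ships: the class and parameter names are hypothetical, and a sorted array searched with bisect is swapped in for the B-tree, since both give ordered, logarithmic-time lookups and Python has no B-tree in its standard library.

```python
import bisect


class TwoTierFingerprintIndex:
    """Hot fingerprints in a hash table (L1), the rest in a sorted array (L2).

    The sorted array is a stand-in for the B-tree proposed in the text.
    """

    def __init__(self, hot: dict, cold: dict):
        self.l1 = dict(hot)
        items = sorted(cold.items())
        self._l2_keys = [key for key, _ in items]
        self._l2_values = [value for _, value in items]

    def lookup(self, fingerprint, busy: bool = False):
        """Return the block address, or None if the fingerprint is not found.

        When busy is True only L1 is searched, trading deduplication
        ratio for latency under heavy business load.
        """
        address = self.l1.get(fingerprint)
        if address is not None or busy:
            return address
        i = bisect.bisect_left(self._l2_keys, fingerprint)
        if i < len(self._l2_keys) and self._l2_keys[i] == fingerprint:
            return self._l2_values[i]
        return None


if __name__ == "__main__":
    index = TwoTierFingerprintIndex(hot={b"hot": 1}, cold={b"cold": 2})
    print(index.lookup(b"cold"), index.lookup(b"cold", busy=True))
```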

As an aside, customers currently have a lot of concerns about deduplication, and many people believe it brings a risk of data loss. So byte-by-byte comparison is unavoidable, and in that case NetApp sets a good example by using a weak hash, which saves fingerprint table space and increases the number of fingerprints that can be held. This is also worth borrowing. Vendors that rely on a strong hash and skip the comparison will not be accepted by the market at all in the future.
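The combination described above, a cheap weak fingerprint plus a mandatory byte comparison, can be sketched as follows. CRC32 from zlib is used here purely as an example of a short, weak fingerprint; the text does not say which hash NetApp uses, and the VerifiedDedupStore class is hypothetical.

```python
import zlib


class VerifiedDedupStore:
    """Dedup store that never trusts a fingerprint match without comparing bytes."""

    def __init__(self, fingerprint=zlib.crc32):
        self.fingerprint = fingerprint  # weak hash: small, fast, may collide
        self.blocks = []                # physical blocks
        self.index = {}                 # fingerprint -> indices of candidate blocks

    def write(self, data: bytes) -> int:
        """Store data and return the index of the physical block holding it."""
        fp = self.fingerprint(data)
        for candidate in self.index.get(fp, []):
            # A fingerprint match is only a hint; confirm byte by byte.
            if self.blocks[candidate] == data:
                return candidate
        self.blocks.append(data)
        new_index = len(self.blocks) - 1
        self.index.setdefault(fp, []).append(new_index)
        return new_index


if __name__ == "__main__":
    store = VerifiedDedupStore()
    first = store.write(b"x" * 4096)
    second = store.write(b"x" * 4096)
    third = store.write(b"y" * 4096)
    print(first == second, first == third, len(store.blocks))
```

Because every match is verified, a collision costs only one extra comparison and can never merge two different blocks, so a 4-byte fingerprint is as safe as a 32-byte one while taking far less table space.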

NetApp's all-flash comeback

NetApp used to be a laggard in the SAN domain and a complete loser in the all-flash market. But through continuous optimization in 2016 and 2017 it unexpectedly turned things around, which is quite incredible.

In fact, it simply did one thing right: it implemented deduplication and compression while keeping the performance drop after they are turned on within a controllable range (5%~20%). In this way, NetApp is now growing against the trend in the all-flash market.

So vendors that do not do deduplication and compression, or do not do them well, will sink into the mire in the future.

Deduplication and compression are icing on the cake, definitely not help in an hour of need. In the IT market, marketing copy is the natural enemy of technology. In addition, bringing a technology to market requires a long gestation; at least in China, it will take a long time for customers to accept that deduplication and compression let them buy less capacity.

