Thoughts on NoSQL: why should we optimize storage write performance?

Last Update:2018-12-03 Source: Internet

Author: User

Benchmarks of many NoSQL products show a large gain in write performance over a traditional RDBMS, while read performance improves little or even declines. Cassandra and MongoDB, two prominent NoSQL systems, are examples. The usual explanation is that the UGC (user-generated content) model is increasingly popular, so writes make up a much larger share of traffic and the read/write ratio has fallen.

But I don't think this is the real reason.

1. Caching makes the raw read efficiency of the storage layer less important

The real reason is that reads have already been optimized enough. We use memcached, tokyotyrant/tokyocabinet and other cache stores for data, and squid and nginx proxy_cache for page and file caching, which together give a very good read-cache effect. If the data does not have to be strictly real-time, or the cache is designed sensibly (both reads and writes go through the cache), the hit rate will be high enough that we do not need to optimize the raw read efficiency of the underlying storage very much.

Imagine the cache layer has a hit rate above 99%. Hundreds of millions of read requests then shrink to a few million requests against the raw storage, and thousands of concurrent reads shrink to a few dozen. Of course, this requires the cache layer to be reliable. For example, nginx proxy_cache can be deployed on several servers, so that the failure of one server does not let all read requests fall through to the underlying storage. Cache purging and related operations are outside the scope of this article.
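
The arithmetic behind this can be checked with a small sketch. The function names, the request volume and the ten-node cache tier below are assumptions chosen for illustration, not figures from any real deployment. The second function also assumes that traffic is spread evenly over the cache nodes and that every request routed to a failed node falls through to storage.

```python
def backend_load(total_requests: int, hit_rate: float) -> int:
    """Requests that miss the cache and reach the underlying storage."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return round(total_requests * (1.0 - hit_rate))


def backend_load_one_node_down(total_requests: int, hit_rate: float,
                               cache_nodes: int) -> int:
    """Storage load when one of `cache_nodes` cache servers is down.

    Assumption: requests are spread evenly, and the failed node's share
    bypasses the cache entirely.
    """
    if cache_nodes < 1:
        raise ValueError("cache_nodes must be at least 1")
    failed_share = total_requests / cache_nodes
    healthy_share = total_requests - failed_share
    return round(failed_share + healthy_share * (1.0 - hit_rate))


reads = 300_000_000  # hypothetical daily read volume
print(backend_load(reads, 0.99))
print(backend_load_one_node_down(reads, 0.99, cache_nodes=10))
```

Under these assumptions, losing one cache node out of ten raises the storage read load roughly tenfold, which is why the reliability of the cache layer matters as much as its hit rate.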

In summary, raw read efficiency does not need much improvement, because the cache layer has largely taken over that demand.

2. Raw writes cannot be replaced

The cache reduces the raw read workload, so it is natural to ask whether anything can reduce the raw write workload in the same way. The answer is no. (If you think otherwise, leave a comment and we can discuss it.) Since the raw write workload cannot be offloaded, we can improve write performance in two ways.

3.1 Sharding

By partitioning the data, we store it across multiple nodes, so each node receives only part of the raw write requests. This is like keeping each employee's productivity unchanged and hiring more people. However, as the number of nodes grows, the probability that some node fails also grows considerably, so we have to add replication to provide high availability (HA).
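
A minimal sketch of this idea, assuming simple hash-based partitioning: the article does not name a partitioning scheme, so `shard_for`, `replicas_for`, the key format and the shard count are all hypothetical. Each key is hashed to a primary shard, and a copy also goes to the next shard(s) so that one node failure does not lose the data.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def replicas_for(key: str, num_shards: int, replication_factor: int = 2) -> list:
    """Primary shard plus the following shards, which hold copies for HA."""
    if not 1 <= replication_factor <= num_shards:
        raise ValueError("replication_factor must be between 1 and num_shards")
    primary = shard_for(key, num_shards)
    return [(primary + i) % num_shards for i in range(replication_factor)]


# Hypothetical workload: 10,000 writes spread over 4 shards.
writes = [f"user:{i}" for i in range(10_000)]
per_shard = Counter(shard_for(key, 4) for key in writes)
for shard, count in sorted(per_shard.items()):
    print(f"shard {shard}: {count} primary writes")
print("replicas for user:42 ->", replicas_for("user:42", 4))
```

Each shard handles roughly a quarter of the primary writes, but with a replication factor of 2 every write is applied on two nodes, so the cluster as a whole does twice the raw write work. That is the price of the HA mentioned above.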

3.2 Improving raw write efficiency

On each individual node, the only option is to improve raw write efficiency, which in turn improves overall read and write efficiency (cache layer included). The usual method is to collect random writes in memory, order them, and flush them to disk sequentially once a certain amount has accumulated. This is what we mean by treating memory as the hard disk and the hard disk as tape. (See my earlier article: NoSQL theory: memory is the new hard disk and the hard disk is the new tape.) This is why many NoSQL products have optimized writes without significantly improving reads, and some even accept slower reads as the price of faster writes.
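
The buffering technique described above can be sketched as follows. This is an illustrative toy, not the implementation used by Cassandra, MongoDB or any other product: the `WriteBuffer` class, its JSON-lines segment format and the threshold value are assumptions. Writes land in an in-memory dict and are written out in key order to a new file once enough have accumulated.

```python
import json
import os
import tempfile


class WriteBuffer:
    """Collect random writes in memory, then flush them sequentially to disk."""

    def __init__(self, directory: str, flush_threshold: int = 1000):
        self.directory = directory
        self.flush_threshold = flush_threshold
        self.pending = {}    # key -> latest value, not yet on disk
        self.segments = []   # paths of flushed files, oldest first

    def put(self, key: str, value) -> None:
        self.pending[key] = value  # memory only: no disk seek per write
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        if not self.pending:
            return
        name = f"segment-{len(self.segments):06d}.jsonl"
        path = os.path.join(self.directory, name)
        with open(path, "w", encoding="utf-8") as f:
            for key in sorted(self.pending):  # one sequential pass, in key order
                f.write(json.dumps([key, self.pending[key]]) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.segments.append(path)
        self.pending = {}

    def get(self, key: str):
        """Reads pay the cost: check memory, then scan segments newest first."""
        if key in self.pending:
            return self.pending[key]
        for path in reversed(self.segments):
            with open(path, encoding="utf-8") as f:
                for line in f:
                    stored_key, value = json.loads(line)
                    if stored_key == key:
                        return value
        return None


with tempfile.TemporaryDirectory() as directory:
    buf = WriteBuffer(directory, flush_threshold=3)
    for key, value in [("c", 1), ("a", 2), ("b", 3), ("a", 4)]:
        buf.put(key, value)
    print(buf.get("a"), buf.get("b"), len(buf.segments))
```

Note that a read may have to look through several segments, which illustrates the trade of slower reads for faster writes. The sketch also has no write-ahead log, so anything still in `pending` is lost on a crash; a real store would need one, and would typically merge old segments as well.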

4. Summary

Read performance can be handled by a sensible cache policy that reduces the number of raw reads. Write optimization is therefore needed not only when the read/write ratio is small: even when the read/write ratio is large, it is still write performance rather than read performance that needs optimizing.

Address: http://news.cnblogs.com/n/77216/

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
