With the rapid growth of the Weibo business, the demands on its infrastructure have risen steadily. Sina was among the earliest adopters of Redis and is its largest user, and over time it has optimized and improved the way it uses Redis.
Counts are an important kind of data in Weibo, and the counting service accounts for a growing share of the business. The accuracy of the counts directly affects the user experience, and errors easily lead to user complaints. In the course of continuously optimizing the counting service, we have gone through three main stages:
Initial Stage
We started in 2010 with Redis 2.0. At first, while the volume of business data was relatively small, performance was quite good. But as data volume and request volume kept growing, several problems gradually surfaced.
Master-slave synchronization issues
The first problem we ran into was master-slave synchronization. The mechanism works as follows: when the master receives a synchronization request from a slave, it forks a child process that dumps the in-memory data into an RDB file and transfers it to the slave, which then loads the file into memory. After that, the master forwards each modification command to the slave as soon as it has finished executing it. If the network misbehaves, for example through a brief disconnection, the slave's data has to be transferred again in full. For a single port with a small amount of data the impact is minor, but with a large amount of data the retransmission causes a burst of network traffic, and the slave cannot respond to any requests while it is loading the RDB file.
Persistence issues
Most of the counting service uses Redis as its storage, so AOF is turned on and configured to fsync once per second to flush writes to disk. As the AOF grows, it has to be rewritten periodically. The rewrite mechanism is similar to RDB generation: it is carried out by a forked child process. The child's sustained disk writes block the parent process's fsync operations, causing a large number of requests to time out.
Version upgrade issues
Because early Redis versions needed many bug fixes, releases were frequent, and each upgrade required shutting down the Redis process and reloading the AOF. With so many Weibo services relying on Redis, the cost of such upgrades became increasingly unbearable.
Memory usage issues
Redis 2.0 was fairly wasteful with memory. A simple key-value pair such as a count occupied more than 100 bytes, which left considerable room for optimization.
Development Stage
We addressed these early problems one by one. For master-slave replication we borrowed from MySQL's replication model and combined RDB with AOF, which solved the retransmission caused by brief network interruptions. We also limited the disk writes of the child process doing the background dump and suspended the main process's fsync operations for its duration, which resolved the slow requests.
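The idea of resuming replication from a position, instead of resending everything, can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Master` and `Slave` classes, the in-memory `log` list standing in for the AOF, and the use of a list index as the replication offset are all hypothetical and are not taken from the Sina implementation.

```python
class Master:
    """Toy master that keeps an append-only command log (stand-in for the AOF)."""

    def __init__(self):
        self.data = {}
        self.log = []  # every write command in order; the list index is the offset

    def incr(self, key, delta=1):
        self.data[key] = self.data.get(key, 0) + delta
        self.log.append(("incr", key, delta))

    def snapshot(self):
        # Full sync (stand-in for the RDB): a copy of the data plus the
        # log offset that the copy corresponds to.
        return dict(self.data), len(self.log)

    def commands_since(self, offset):
        # Incremental sync: only the commands the slave has not applied yet.
        return self.log[offset:]


class Slave:
    def __init__(self):
        self.data = {}
        self.offset = 0

    def full_sync(self, master):
        self.data, self.offset = master.snapshot()

    def resume(self, master):
        # After a brief disconnection, replay from the remembered offset
        # instead of requesting a new full snapshot.
        for _op, key, delta in master.commands_since(self.offset):
            self.data[key] = self.data.get(key, 0) + delta
            self.offset += 1


master, slave = Master(), Slave()
master.incr("post:1:comments")
slave.full_sync(master)           # initial transfer
master.incr("post:1:comments")    # writes that arrive while the link is down
master.incr("post:1:reposts", 3)
slave.resume(master)              # catch up without a second full transfer
```

The point of the sketch is that the slave only needs to remember one number, its offset, for a short outage to cost a few replayed commands rather than a full dump and reload.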
For the counting service we developed a dedicated version, Redisscounter, in which a single key-value pair occupies only the length of the key plus 4 bytes for the value, reducing memory usage to less than 1/4 of the original. Pre-allocated memory arrays and double hashing eliminate the large pointer overhead of the Redis hash table.
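A rough Python sketch of a pointer-free counter table follows. It is not the Redisscounter code; it assumes a fixed key width of 8 bytes (`KEY_LEN`), a prime table capacity, and `zlib.crc32` / `zlib.adler32` as the two hash functions, all chosen purely for illustration. Each slot is the key followed by a 4-byte unsigned value, stored inline in one pre-allocated buffer.

```python
import struct
import zlib


class CounterTable:
    KEY_LEN = 8                  # assumed fixed key width in bytes
    SLOT = KEY_LEN + 4           # key plus a 4-byte unsigned value
    EMPTY = b"\x00" * KEY_LEN    # all-zero key marks an unused slot

    def __init__(self, capacity):
        # capacity should be prime so the second hash step visits every slot
        self.capacity = capacity
        self.buf = bytearray(capacity * self.SLOT)  # allocated once, up front

    def _slot_for(self, key):
        padded = key.ljust(self.KEY_LEN, b"\x00")
        if len(padded) > self.KEY_LEN or padded == self.EMPTY:
            raise ValueError("key must be 1..KEY_LEN bytes and not all zeros")
        h1 = zlib.crc32(padded) % self.capacity
        h2 = 1 + zlib.adler32(padded) % (self.capacity - 1)  # never zero
        for i in range(self.capacity):
            off = ((h1 + i * h2) % self.capacity) * self.SLOT
            stored = bytes(self.buf[off:off + self.KEY_LEN])
            if stored == padded or stored == self.EMPTY:
                return off, padded
        raise MemoryError("table is full")

    def incr(self, key, delta=1):
        off, padded = self._slot_for(key)
        (value,) = struct.unpack_from("<I", self.buf, off + self.KEY_LEN)
        value += delta
        self.buf[off:off + self.KEY_LEN] = padded
        struct.pack_into("<I", self.buf, off + self.KEY_LEN, value)
        return value

    def get(self, key):
        off, _ = self._slot_for(key)
        return struct.unpack_from("<I", self.buf, off + self.KEY_LEN)[0]


table = CounterTable(101)
table.incr(b"1001:cmt")
table.incr(b"1001:cmt", 4)
```

With this layout every entry costs exactly 12 bytes and no per-entry pointers, which is the kind of saving the paragraph above describes. Deletion is left out of the sketch because it needs tombstones under open addressing.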
For version upgrades, we packaged the core processing logic of Redis into a dynamic library. The in-memory data is held in global variables, and the external program reads and writes it by calling the corresponding functions in the dynamic library. An upgrade only requires replacing the dynamic library file; the data does not have to be reloaded. As a result, a single command upgrades the code within milliseconds, with no impact on client requests.
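The separation between long-lived data and replaceable logic can be imitated in Python as an analogy. The sketch below is an assumption-laden stand-in, not how the dynamic library works: `load_logic` plays the role of loading a shared library, the two source strings play the role of two library versions, and `STORE` plays the role of the global in-memory data.

```python
import types

STORE = {}  # lives in the host process and survives every upgrade


def load_logic(source, name):
    """Stand-in for loading a dynamic library: build a fresh module from source."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


LOGIC_V1 = '''
VERSION = 1
def incr(store, key):
    store[key] = store.get(key, 0) + 1
    return store[key]
'''

LOGIC_V2 = '''
VERSION = 2
def incr(store, key, delta=1):
    store[key] = store.get(key, 0) + delta
    return store[key]
'''

logic = load_logic(LOGIC_V1, "counter_logic_v1")
logic.incr(STORE, "post:1:reposts")

# Upgrade: swap the logic reference. STORE is untouched and needs no reload.
logic = load_logic(LOGIC_V2, "counter_logic_v2")
logic.incr(STORE, "post:1:reposts", 5)
```

Because the data never moves, the cost of an upgrade is the cost of swapping one reference, which is why the reload of a large AOF can be avoided entirely.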
With these improvements the new version came into wide use, and for most services it could replace the previous MySQL+memcached combination as the sole storage. For comment counts and repost counts, however, the growing number of posts made it impossible to keep the full data set in memory, so we used a MySQL+Redisscounter combination: MySQL holds the full data set, two sets of Redisscounter hold the hot data from the last few months, and cold data is cleaned up by periodically rolling the data across the two sets.
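One way to picture the two-set rolling scheme is the Python sketch below. The details are guesses made for illustration: the class name `RollingCounters`, the plain dictionaries standing in for MySQL and for the two Redisscounter sets, and the rule that a write always lands in the newer set are all assumptions rather than the documented behavior.

```python
class RollingCounters:
    def __init__(self, full_store):
        self.full = full_store  # stand-in for MySQL: holds every count
        self.current = {}       # newer hot set
        self.previous = {}      # older hot set, dropped at the next roll

    def get(self, key):
        # Check the hot sets first, then fall back to the full store.
        for hot in (self.current, self.previous):
            if key in hot:
                return hot[key]
        return self.full.get(key, 0)

    def incr(self, key, delta=1):
        value = self.get(key) + delta
        self.full[key] = value
        self.current[key] = value  # recently written keys stay hot
        return value

    def roll(self):
        # Cold data cleanup: discard the older set wholesale and start
        # a new empty set for the next period.
        self.previous, self.current = self.current, {}
```

A key that is not written for two consecutive rolls drops out of memory and is served from the full store, which is the "hot data of the last few months" behavior in miniature.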
Advanced Stage
With the growth of Weibo, the number of counts kept for a single post has increased: beyond the original comment and repost counts, like (attitude) counts were added, and in 2013 read counts went live as well. Redisscounter does not handle this kind of extension well, and the MySQL+Redisscounter rolling scheme described above is too complex, with the periodic rolling operation prone to problems. To address this, we made another round of improvements. The key was changed from the original string to the Weibo ID. Looking at the comment and repost counts of each post, we found that most counts fit in 10 to 15 bits, so several counts can be packed into one 4-byte value, with counts that grow too large stored in a separate area of memory. A single get command then returns all the counts for a post. In addition, because older posts are accessed less often, posts are kept in memory in multiple arrays, each covering a different range of posts. When memory runs short, the array holding the oldest posts is dumped to SSD, and this automatic internal rolling keeps the hot posts in memory. Accesses to old data that lands on SSD are served by asynchronous I/O threads. With this improvement the original MySQL storage was removed, reducing both development and operational costs.
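Packing several small counts into one 4-byte value, with a separate area for the rare large ones, might look like the following Python sketch. The field names and widths in `FIELDS` (11 + 11 + 10 bits), the use of the all-ones value as a "look in overflow" sentinel, and the dictionaries used for storage are hypothetical choices for illustration; the source only states that most counts fit in 10 to 15 bits. Counts are assumed to stay non-negative.

```python
FIELDS = (("reposts", 11), ("comments", 11), ("likes", 10))  # 32 bits in total


class PackedCounters:
    def __init__(self):
        self.packed = {}    # weibo_id -> one 32-bit word holding all fields
        self.overflow = {}  # (weibo_id, field) -> full count once a field saturates

    @staticmethod
    def _layout(field):
        shift = 0
        for name, bits in FIELDS:
            if name == field:
                return shift, (1 << bits) - 1
            shift += bits
        raise KeyError(field)

    def incr(self, weibo_id, field, delta=1):
        shift, mask = self._layout(field)
        word = self.packed.get(weibo_id, 0)
        current = self.overflow.get((weibo_id, field), (word >> shift) & mask)
        new = current + delta
        if new >= mask:
            # Too large for the field: keep the real value elsewhere and
            # store the all-ones sentinel in the packed word.
            self.overflow[(weibo_id, field)] = new
            stored = mask
        else:
            self.overflow.pop((weibo_id, field), None)
            stored = new
        self.packed[weibo_id] = (word & ~(mask << shift)) | (stored << shift)
        return new

    def get_all(self, weibo_id):
        # One lookup returns every count for the post.
        word = self.packed.get(weibo_id, 0)
        result, shift = {}, 0
        for name, bits in FIELDS:
            mask = (1 << bits) - 1
            value = (word >> shift) & mask
            if value == mask:
                value = self.overflow[(weibo_id, name)]
            result[name] = value
            shift += bits
        return result
```

For example, after `pc = PackedCounters()`, `pc.incr(1, "comments")` and `pc.incr(1, "likes", 2000)`, the call `pc.get_all(1)` reads the small comment count from the packed word and the large like count from the overflow area, while the packed word itself still fits in 4 bytes.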
The evolution of Redis in the counting service shows that technical progress is driven by business needs. As the business grows, more new challenges will arise. We hope these improvements are helpful to readers in their own use of Redis.
Redis count on Sina Weibo