As the world's largest Redis user, Sina has accumulated a great deal of development and operations experience. The author, from Sina, hopes to share some of that experience here so that readers can avoid a few detours.
Purpose of Use
We began experimenting with Redis in the first half of 2010, mainly for the following reasons.
- Better performance than MySQL. As the business grows, so does the demand for performance.
- Rich data types. In the Internet age, speed wins the market, and rapid development is a constant requirement.
- Cache downtime is painful. Redis offers both semi-persistent and fully persistent modes, which mitigate this problem to some extent and reduce the avalanche effect when a cache goes down.
- In some business scenarios, the MySQL + memcached combination has consistency problems; replacing it with Redis reduces the overall complexity of the architecture.
The Refinement Process
When we first adopted Redis, the deployment was small, the data volume was tiny, and there were few problems. As the data grew, many issues surfaced. The bottom line: once the data is large, things that were never problems become problems.
Master/slave Synchronization Issues
The first issue we hit was master/slave synchronization. The mechanism works like this: the slave runs SLAVEOF and sends a SYNC command to the master; the master dumps its in-memory data into an RDB file and transfers it to the slave; the slave loads the file into memory; from then on, the master streams new write commands to the slave.
When the network hiccups, even momentarily, the master retransmits the entire dataset. For a single port with a small dataset the impact is minor, but with a large dataset it causes a burst of network traffic, and the slave cannot serve reads while resynchronizing. We modified Redis to introduce the concept of a position (offset), so that after a network problem the full dataset is not retransmitted, only the data written after the disconnection point.
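The position-based partial resynchronization described above can be sketched as follows. This is a simplified in-memory model, not Sina's actual patch; the `ReplBacklog` name and sizes are invented for illustration:

```python
class ReplBacklog:
    """Simplified replication backlog: the master keeps a sliding
    window of the recent replication stream plus a running offset."""

    def __init__(self, capacity=1024):
        self.capacity = capacity   # bytes of history to keep
        self.buffer = b""          # most recent stream bytes
        self.end_offset = 0        # offset just past the last byte fed

    def feed(self, data: bytes):
        # Append to the stream, keeping only the last `capacity` bytes.
        self.buffer = (self.buffer + data)[-self.capacity:]
        self.end_offset += len(data)

    def partial_resync(self, slave_offset: int):
        """Return only the bytes the slave missed, or None if the
        requested range has fallen out of the backlog (in which case
        the master must fall back to a full RDB resync)."""
        start = self.end_offset - len(self.buffer)
        if slave_offset < start or slave_offset > self.end_offset:
            return None
        return self.buffer[slave_offset - start:]

backlog = ReplBacklog()
backlog.feed(b"SET a 1\n")          # 8 bytes, offsets 0..8
backlog.feed(b"SET b 2\n")          # 8 bytes, offsets 8..16
# A slave that disconnected after the first command reconnects
# with offset 8 and receives only the second command:
delta = backlog.partial_resync(8)   # -> b"SET b 2\n"
```

Redis itself later gained an equivalent mechanism (PSYNC with a replication backlog), which follows the same basic idea.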
Periodic AOF Archiving
By default, compacting the AOF file that Redis generates requires manually running BGREWRITEAOF, and the lock taken during this operation noticeably affects writes. So at first we ran it in the early hours of the morning, during the business trough. As data volume grew, the lock time became unacceptable to the business. We modified the source so that Redis performs the rewrite itself at a time set in the configuration file: the operation runs automatically and no longer causes the write-blocking lock problem.
At the same time, we redesigned the AOF along the lines of MySQL's binlog: each AOF file has a configured size, and when it reaches that threshold a new AOF file is started automatically.
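The binlog-style rotation can be sketched with a toy in-memory model. The `RotatingAof` class and thresholds are invented for illustration; real Redis appends to a file on disk:

```python
class RotatingAof:
    """Toy model of size-based AOF rotation, in the spirit of MySQL's
    binlog: once the active segment would exceed max_bytes, a new
    numbered segment is started."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.segments = [bytearray()]  # segments[-1] is the active AOF

    def append(self, record: bytes):
        active = self.segments[-1]
        # Roll over when the active segment would grow past the limit
        # (but never leave an empty segment behind).
        if active and len(active) + len(record) > self.max_bytes:
            self.segments.append(bytearray())
        self.segments[-1].extend(record)

aof = RotatingAof(max_bytes=20)
for i in range(5):
    aof.append(b"SET k%d v\n" % i)   # 9 bytes per record
# Two records fit per 20-byte segment, so 5 records span 3 segments.
```

Bounding each AOF file this way keeps individual archives small enough to ship and replay independently, just as with binlog files.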
Design of Mytrigger and MytriggerQ
Some businesses write data by user dimension and need the number of records per user (number of followers, number of fans, and so on), which would otherwise require a count(*) against the database. That is relatively slow in InnoDB, and adding a cache layer cannot meet the business's real-time requirements. So we built a component called Mytrigger that reads MySQL's binlog and, applying business logic, converts the changes into writes to Redis.
For example, MySQL stores each individual record, while Redis stores the per-user record totals. The application then reads the records from MySQL and the counts from Redis: pressure on MySQL drops considerably, and count-read performance improves dramatically.
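The core of the Mytrigger idea can be sketched like this. A dict stands in for Redis (a real component would issue INCRBY/DECRBY), and the event shape (`op`, `table`, `user_id`) is invented for illustration; the article does not document Mytrigger's actual interfaces:

```python
from collections import defaultdict

# Stand-in for Redis: user_id -> follower count.
counters = defaultdict(int)

def apply_binlog_event(event: dict):
    """Translate a row-level binlog event into a counter update,
    the way a Mytrigger-style component would: every insert into
    the followers table bumps that user's count, every delete
    decrements it."""
    if event["table"] != "followers":
        return                       # not a table we aggregate
    if event["op"] == "insert":
        counters[event["user_id"]] += 1
    elif event["op"] == "delete":
        counters[event["user_id"]] -= 1

events = [
    {"op": "insert", "table": "followers", "user_id": 42},
    {"op": "insert", "table": "followers", "user_id": 42},
    {"op": "delete", "table": "followers", "user_id": 42},
]
for e in events:
    apply_binlog_event(e)
# counters[42] is now 1: the count(*) is answered from Redis, not InnoDB.
```

Because the counts are derived from the binlog rather than written by the application, they stay consistent with MySQL without a second write path in the application code.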
Sometimes the application that writes the data needs both to store it in the database and to notify other applications of the additions or changes, so that those applications can pick them up and run their own business logic.
At first we had the application write to the database and then write a second copy to MemcacheQ. We later replaced that with MytriggerQ, which reads MySQL's binlog and places the changes into a queue. Businesses that need to know about data changes simply read from the MytriggerQ service. The application now writes only once, simplifying the architecture.
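The fan-out pattern can be sketched with a plain in-process queue standing in for the MytriggerQ service (function names here are invented for illustration):

```python
import queue

change_q = queue.Queue()   # stand-in for the MytriggerQ queue service

def publish_binlog_event(event: dict):
    """The binlog tailer publishes every row change exactly once;
    the writing application no longer double-writes to MemcacheQ."""
    change_q.put(event)

def consume_changes():
    """A downstream business process drains the queue and reacts
    to each insert/update (here it just collects them, preserving
    commit order)."""
    seen = []
    while not change_q.empty():
        seen.append(change_q.get())
    return seen

publish_binlog_event({"op": "insert", "table": "status", "id": 1})
publish_binlog_event({"op": "update", "table": "status", "id": 1})
changes = consume_changes()   # two events, in commit order
```

The design choice worth noting is that the queue is fed from the binlog, so any consumer sees exactly the changes that actually committed, in order, with no extra code in the writer.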
Capacity design
We evaluate each business before provisioning Redis. A form covering expected capacity and performance requirements lets us calculate how much memory Redis will occupy, and we ensure the data on a single port stays below one third of the machine's RAM.
We currently use machines with 96GB of RAM, capping each port at 30GB. When a business needs more capacity than one machine's memory provides, we hash-split the data across multiple ports. Benchmarks tell us the peak performance of 2, 4, or 8 instances deployed on one machine (capacity permitting); we reserve 20% of capacity for growth and compute the required resources from the business's metrics.
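The capacity arithmetic and hash-splitting above amount to something like the following. This is a sketch under stated assumptions: the article does not specify the hash function (crc32 is an arbitrary stand-in) and `ports_needed`/`port_for_key` are invented names:

```python
import math
import zlib

MACHINE_RAM_GB = 96
PORT_CAP_GB = 30          # keep each port under ~1/3 of machine RAM
GROWTH_HEADROOM = 0.20    # reserve 20% of each port for growth

def ports_needed(expected_gb: float) -> int:
    """How many Redis ports (instances) a business needs, leaving
    20% headroom per port for growth."""
    usable = PORT_CAP_GB * (1 - GROWTH_HEADROOM)   # 24GB usable
    return max(1, math.ceil(expected_gb / usable))

def port_for_key(key: str, ports: list):
    """Hash-split: map a key deterministically to one of the ports."""
    return ports[zlib.crc32(key.encode()) % len(ports)]

n = ports_needed(100)   # a 100GB business -> ceil(100/24) = 5 ports
```

The same calculation extends naturally to machine counts: with 2, 4, or 8 instances per 96GB machine, the port count from `ports_needed` translates directly into hardware to order.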
After enabling Redis's built-in expiration policy, we found its timing hard to rely on: Redis may expire keys to free memory even while a large amount of memory remains unused, or keys may still be unexpired when memory is already running low.
For expired data we therefore use two approaches of our own: in-place cleanup and rolling. In-place cleanup tends to fragment memory. Rolling sets up two groups of ports, odd and even, and writes to both. For example, to keep three months of data, each group retains up to six months: both groups are written simultaneously while reads are served from the odd group; in the fourth month, reads and writes switch to the even group while the odd group's data is cleaned out. This works, but the maintenance cost is high.
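The rolling schedule can be expressed as a small function. This is a sketch of the scheme as described (three-month retention, odd group first); the `active_set` name and month numbering are assumptions:

```python
RETENTION_MONTHS = 3              # business needs 3 months online
# Each group accumulates up to 2 * RETENTION_MONTHS = 6 months
# before it is wiped.

def active_set(month: int) -> str:
    """Which port group ('odd' or 'even') serves reads in a given
    month (months numbered from 1). Writes always go to BOTH groups;
    every RETENTION_MONTHS months the reads switch over and the
    now-idle group is cleaned, so no group ever holds more than
    six months of data."""
    phase = ((month - 1) // RETENTION_MONTHS) % 2
    return "odd" if phase == 0 else "even"

# Months 1-3 read from the odd group; in month 4 reads move to the
# even group while the odd group's data is cleaned up; month 7
# switches back to the (now empty) odd group, and so on.
```

The maintenance cost the article mentions is visible here: every switch requires wiping a whole port group and verifying both groups stayed in sync during the double-write window.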
Application Scenarios
Should Redis be a cache or a store? This is a question we keep coming back to. Redis offers semi-persistent and persistent modes, but either way all of its data lives in memory, and with large data volumes the benefits of its rich data types become less pronounced.
When the data volume is small, little deliberation is needed: nothing is a problem yet, and the rich data types speed up development and launch. When the data volume grows but its growth remains controllable, and the data is fine-grained, Redis can serve as a store; for example, per-user-dimension counts are stored in Redis. For object-dimension data such as microblog posts, we use Redis as a cache.
Some businesses grew much faster than our forecasts, and with all their data in memory and no hot/cold separation (tiered storage being the best way to reduce storage cost), we migrate such no-longer-suitable workloads off Redis to new solutions, for example MySQL + memcached. Because every rolling switch carries high operational and hardware costs, HandlerSocket can be used instead: keep, say, the most recent six months of data in Redis and the older data in MySQL, reducing both the switching and the operational cost.
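The hot/cold routing just described reduces to a simple age check. A minimal sketch, assuming a roughly six-month hot window; the `storage_for` name and the 183-day constant are illustrative, not Sina's actual code:

```python
from datetime import date, timedelta

HOT_WINDOW_DAYS = 183   # roughly the "most recent six months"

def storage_for(record_date: date, today: date) -> str:
    """Route a read: recent (hot) data lives in Redis; older (cold)
    data lives in MySQL, accessed through HandlerSocket."""
    if today - record_date <= timedelta(days=HOT_WINDOW_DAYS):
        return "redis"
    return "mysql+handlersocket"

today = date(2012, 7, 1)
# A record from last month is served from Redis; one from last year
# is fetched from MySQL via HandlerSocket.
```

Tiering by age like this shrinks the Redis footprint to the working set while avoiding the wholesale data migrations that made rolling switches expensive.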
Plans for the future
As the fleet grows, the demands on availability and automation keep rising. We are currently designing automatic Redis failover with ZooKeeper, while improving automated Redis maintenance. We are also developing a high-speed data-access framework and management system that encapsulates failover, data-splitting logic, and automated data migration, turning all of this into a product. We hope these experiences are helpful to you in your own use of Redis.
The author, Yang Haichi, is Sina's chief DBA, with extensive experience managing large-scale, high-concurrency, high-volume systems, and a passion for overall architecture, database design, performance optimization, distributed deployment, and high-availability research.