The application of Redis in Sina Weibo
About Redis
1. Support for five data structures
Redis supports strings, hashes, lists, sets, and sorted sets.
Strings are a good fit for counters; sets work well for building indexes;
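As a rough illustration of those two uses, here is a minimal sketch with the redis-py client (host, port, and key names are illustrative assumptions, not Weibo's actual schema):

    import redis

    # Minimal sketch: a string used as a counter and a set used as an index
    # (key names are illustrative).
    r = redis.Redis(host="localhost", port=6379)

    r.incr("user:42:follower_count")          # string counter
    r.sadd("index:city:beijing", "user:42")   # set as an index of users by city
    print(r.smembers("index:city:beijing"))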
2. K-V storage vs. K-V cache
About 98% of Sina Weibo's Redis applications use it as persistent storage and only 2% use it as a cache, running on 600+ servers.
There is no dramatic performance gap between persistent and non-persistent use of Redis:
non-persistent use reaches roughly 80,000-90,000 TPS, while persistent use is around 70,000-80,000 TPS;
When persistence is enabled, you have to balance persistence against write performance, that is, consider the ratio between the memory Redis uses and the write throughput of the disk;
3. Community Activity
Redis currently has a bit over 30,000 lines of code. The code is lean, with many clever implementations, and the author is fastidious about code quality.
The Redis community is very active, which is an important indicator of the quality of open-source software. Early-stage open-source software has no commercial support behind it; without an active community, there is nowhere to turn once problems appear;
Redis Fundamentals
Redis persistence (AOF, append-only file):
Commands are appended to a log (the AOF); once it grows past a certain point it is rewritten and merged against the in-memory data. Because the writes are append-only and sequential on disk, the performance impact is very small.
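A minimal sketch of the AOF settings involved (values are illustrative; whether they can be changed at runtime with CONFIG SET depends on the Redis version, otherwise they go in redis.conf):

    import redis

    # Enable the append-only file and fsync it once per second:
    # appends are sequential disk writes, so the performance impact is small.
    r = redis.Redis(host="localhost", port=6379)
    r.config_set("appendonly", "yes")
    r.config_set("appendfsync", "everysec")
    print(r.config_get("appendonly"))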
1. Single instance, single process
Redis runs as a single process, so one instance only uses one CPU core;
To maximize CPU utilization, run as many instances as there are cores, one port per instance (e.g. an 8-core CPU runs 8 instances on 8 ports) to increase concurrency:
In single-machine tests with records of about 200 bytes, the result was roughly 80,000-90,000 TPS;
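A minimal sketch of that layout (ports and the key-hashing scheme are illustrative assumptions, not Weibo's actual routing):

    import zlib
    import redis

    # One instance per CPU core, each on its own port; spread keys across
    # them with a stable hash.
    PORTS = [6379, 6380, 6381, 6382, 6383, 6384, 6385, 6386]   # 8 cores -> 8 ports
    instances = [redis.Redis(host="localhost", port=p) for p in PORTS]

    def instance_for(key):
        return instances[zlib.crc32(key.encode()) % len(instances)]

    instance_for("user:42:follower_count").incr("user:42:follower_count")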
2. Replication
Process: data is written to the master --> the master dumps an RDB and ships it to the slave --> the slave loads the RDB into memory.
Save point: if the network is interrupted, the transfer continues from where it left off after reconnecting.
The first master-slave synchronization is a full transfer; after that, synchronization is incremental;
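A minimal sketch of setting up such a master/slave pair from a client (hosts are illustrative; the same can be done with the slaveof directive in redis.conf):

    import redis

    master = redis.Redis(host="10.0.0.1", port=6379)
    slave = redis.Redis(host="10.0.0.2", port=6379)

    slave.slaveof("10.0.0.1", 6379)    # first sync is a full RDB transfer,
                                       # later updates are incremental
    master.set("user:42:name", "alice")
    print(slave.info("replication"))   # check master_link_status and lag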
3. Data consistency
After running for a long time, there is a possibility of inconsistency between nodes;
Two tool programs were developed:
1. For large data sets, run periodic full-volume checks;
2. Check incremental data in real time for consistency;
Inconsistency caused by the slave not keeping up with the master in time is called the delay (lag) problem;
For scenarios where the consistency requirement is not strict, it is enough to guarantee eventual consistency;
For the delay problem, strategies have to be added at the application level, according to the characteristics of the business scenario.
For example:
1. Queries about newly registered users must go to the master first;
2. After successful registration, the page waits 3 seconds before redirecting, while the backend synchronizes the data in the meantime.
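A minimal sketch of that application-level strategy (hosts, key names, and the 3-second window are illustrative assumptions):

    import time
    import redis

    # Reads for just-registered users go to the master until the slave has
    # had time to catch up; everyone else reads from the slave.
    master = redis.Redis(host="10.0.0.1", port=6379)
    slave = redis.Redis(host="10.0.0.2", port=6379)
    FRESH_WINDOW = 3.0            # seconds to keep reading from the master
    recently_registered = {}      # uid -> registration timestamp (per process)

    def register_user(uid, name):
        master.hset("user:%s" % uid, "name", name)
        recently_registered[uid] = time.time()

    def get_user(uid):
        fresh = time.time() - recently_registered.get(uid, 0) < FRESH_WINDOW
        node = master if fresh else slave
        return node.hgetall("user:%s" % uid)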
Sina Redis Usage History
In 2009, Sina used Memcache (for non-persistent content) and MemcacheDB (for persistence + counting).
MemcacheDB was built at Sina on top of Memcache, using BerkeleyDB as the persistent storage engine;
1. Problems faced
- More and more data structures were needed that Memcache does not provide, which hurt development efficiency
- Performance requirements: as read volume grew, it had to be dealt with; the progression was:
database read/write splitting (M/S) --> multiple database slaves --> adding a cache layer (Memcache) --> moving to Redis
- Solving the write problem:
horizontal splitting of tables: some users live in this table, other users are placed in another table;
- Reliability requirements
The cache "avalanche" problem (a flood of misses hitting the database at once) is a thorny one
Caches also face the challenge of recovering quickly after a failure
- Development cost requirements
The cost of keeping cache and DB consistent keeps climbing (update the DB first, then clear the cache? No good, too slow!)
Development has to keep up with the influx of product requirements
The most expensive hardware is the database-tier machines, basically several times the cost of front-end machines; the workload is IO-intensive and very hard on hardware;
- Maintenance complexity
The cost of maintaining consistency keeps rising;
BerkeleyDB uses B-trees and keeps appending new records without reorganizing the file, so files keep growing; when they get large they have to be archived, and archiving has to be done regularly;
which in turn requires a certain amount of downtime;
Based on the above considerations, Redis was chosen.
2. How to find open-source software, and criteria for judging it
- For open-source software, first look at what it can do, but pay more attention to what it cannot do and what problems it may bring
- Once it grows to a certain scale, what problems may appear, and are they acceptable?
- Look for material on Google Code and foreign forums (the domestic technical level lags behind by about 5 years)
- Look at the author's personal coding ability
Redis Application Scenarios
1. How the business uses it
- Hashes: following lists, follower lists, mutual-follow lists (key -> value (field), sortable)
- Strings (counters): post counts, follower counts, ... (avoids SELECT COUNT(*) FROM ...)
- Sorted sets (automatically sorted): TopN, trending posts, etc.
- Lists (queues): push/sub notifications, ...
Of the four above, in terms of fine-grained control, hashes and string counters are recommended; sorted sets and lists (queues) are not
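As a rough illustration of these four types, a minimal redis-py sketch (key names and scores are illustrative assumptions, not Weibo's actual schema; zadd uses the redis-py 3.x mapping signature):

    import redis

    r = redis.Redis(host="localhost", port=6379)

    # Hash: a user's following list, field = followed uid, value = follow time
    r.hset("following:42", "1001", "2013-01-01")

    # String counter: follower count, no SELECT COUNT(*) needed
    r.incr("user:42:follower_count")

    # Sorted set: trending posts ranked by score
    r.zadd("trending", {"post:9001": 1500, "post:9002": 320})
    print(r.zrevrange("trending", 0, 9, withscores=True))   # TopN

    # List as a queue: push a notification, a worker pops it later
    r.lpush("notify:42", "you have a new follower")
    print(r.rpop("notify:42"))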
Redis can also be slimmed down through in-house modification. For example, switching from character storage to integer storage let 1.6 billion records fit in only 16 GB of memory (about 10 bytes per entry).
Keep the number of storage types in use to 3 or fewer;
Replacing Memcache + MySQL with Redis:
Redis serves as the store and answers queries directly; MySQL is no longer used behind it, which removes the consistency problem between multiple copies of the data;
2. Storing large data tables
(e.g., storing 140-character Weibo posts)
One store holds only the unique ID and the 140 characters of text;
Another store holds the ID plus user name, post time, click count, and other fields used for computation and sorting; once the final list of posts to display has been computed, their text is fetched from the first store by ID;
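A minimal sketch of this two-store split (hosts and key names are illustrative assumptions): rank on the metadata store, then fetch the text by ID.

    import redis

    content = redis.Redis(host="10.0.0.10", port=6379)   # id -> 140-char text
    meta = redis.Redis(host="10.0.0.11", port=6379)      # metadata for ranking

    def publish(post_id, uid, text, ts):
        content.set("post:%s:text" % post_id, text)
        meta.hset("post:%s" % post_id, "uid", uid)
        meta.zadd("timeline:%s" % uid, {post_id: ts})

    def latest_posts(uid, n=10):
        ids = meta.zrevrange("timeline:%s" % uid, 0, n - 1)
        return content.mget(["post:%s:text" % i.decode() for i in ids])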
Three steps of improvement:
1) Identify the problems of the existing system;
2) Discover something new that looks good from every angle, and switch to it wholesale;
3) Rational retreat: work out what suits the new thing and what does not, and move what doesn't fit back to the old system.
3. Some tips
- Many applications can tolerate database connection failures but cannot tolerate slow responses
- One piece of data, multiple indexes (for different query scenarios)
- The only way to solve an IO bottleneck: use memory
- When the data volume changes little, Redis is the first choice
Problems encountered and their solutions
(Note: all of these appear only at very large scale; at small scale almost anything works)
1. Problem: after replication is interrupted, the full resend causes a burst of network traffic
Solution: rewrite the replication code, RDB + AOF (rolling)
2. Problem: capacity issues
Solution: capacity planning plus M/S sharding (share nothing; the abstracted data objects have very little association between them)
Add configuration to split the traffic, e.g. for IDs 1, 2, 3, 4: machine 1 handles id % 2 == 1, machine 2 handles id % 2 == 0.
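A minimal sketch of that modulo split (hosts are illustrative assumptions):

    import redis

    machine1 = redis.Redis(host="10.0.0.21", port=6379)   # handles id % 2 == 1
    machine2 = redis.Redis(host="10.0.0.22", port=6379)   # handles id % 2 == 0

    def node_for(uid):
        return machine1 if uid % 2 == 1 else machine2

    node_for(3).incr("user:3:follower_count")   # goes to machine 1
    node_for(4).incr("user:4:follower_count")   # goes to machine 2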
Keep memory usage below one half, otherwise scale out (the recommendation is that the data in a Redis instance not exceed 80% of memory);
On our production servers with 96 GB/128 GB of memory, we do not recommend a single instance larger than 20 GB/30 GB.
The largest single table in the Weibo application holds 2 TB of data, but at that size the application is already struggling.
Keep each port under 20 GB; test how long a save to disk takes, i.e. how long it takes to write everything out; the more memory in use, the longer the write takes;
With a large single-instance memory footprint, the immediate problem is that failure recovery, or rebuilding a slave, takes longer. At the load speed of an ordinary hard disk, our experience is that Redis loads roughly 1 GB per minute (loading speed depends on data size and structural complexity);
Rewriting the AOF and saving the RDB put heavy, long-lasting pressure on the system and consume extra memory, which can easily exhaust system memory and cause serious performance-affecting failures in production.
Rebalance: redistribute the existing data according to the configuration above.
Later, a middle layer is used to handle routing and HA;
Note: the Redis project is officially working on this as well (Redis Cluster) to solve the HA problem;
3. Problem: freezes caused by BGSAVE or BGREWRITEAOF
Solution: plan disk performance and throttle the write speed; for example, write to disk at 200 MB/s even if a large amount of data arrives. Note, however, that the write speed must satisfy two constraints:
match the disk's speed;
meet the time limit (guarantee it finishes before the traffic peak arrives).
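A back-of-the-envelope check of those two constraints (numbers are illustrative):

    # Will a throttled dump finish before the traffic peak?
    instance_gb = 20          # data held by one instance/port
    write_mb_per_s = 200      # throttled disk write speed
    dump_seconds = instance_gb * 1024 / write_mb_per_s
    print("dump takes ~%d seconds" % dump_seconds)   # ~102 s for 20 GB at 200 MB/s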
4. Problem: operations and maintenance issues
1) Inner crontab: move cron-style jobs into Redis to reduce pressure during migration
On one machine, keeping multiple ports from running them at the same time - achievable
For one business spread across multiple ports on multiple machines, keeping them from running at the same time - not achievable
2) Dynamic upgrade: first load the .so file, then manage configuration and switch over to the new code (CONFIG SET command)
The Redis improvements are packaged into a lib.so file so they can be upgraded dynamically
When making your own changes, keep the community's upgrades in mind: when the community releases a new version with very useful features, it should be easy to merge it with our improved version;
Prerequisite for upgrading: modularity, so upgrades can be done module by module
Load time depends on two things: data size and structural complexity. Typically 40 GB of data takes about 40 minutes
The two core problems of a distributed system: a) routing, b) HA
3) Handling dangerous commands: for example, FLUSHALL deletes all data and must be controlled
Operations cannot just talk about data backup; the time required to restore the data also has to be considered;
Add permission checks (administrators only), e.g. FLUSHALL requires authentication and can only be run with the password;
Of course, high-throughput data access generally does not authenticate every request; the usual strategy is to authenticate once up front, with no re-authentication afterwards;
Control the hashing policy (without the key you cannot find the value; without knowing the hashing policy you cannot derive the key)
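A minimal sketch of locking this down (password and host are illustrative; requirepass and rename-command are standard redis.conf directives):

    import redis

    # In redis.conf:
    #   requirepass <password>          # clients must AUTH before running commands
    #   rename-command FLUSHALL ""      # disable FLUSHALL (or rename it to an
    #                                   #   obscure name reserved for admins)
    admin = redis.Redis(host="localhost", port=6379, password="s3cret")
    admin.ping()   # redis-py authenticates once when the connection is created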
4) Config dump:
Configuration items modified dynamically in memory are written back to disk according to a certain policy (supported by Redis)
5) BGSAVE makes AOF writes very slow:
While BGSAVE is doing its fdatasync, skip the fsync of the AOF (there is some risk of data discrepancy)
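For reference, stock Redis exposes this trade-off as the no-appendfsync-on-rewrite option; a minimal sketch (host is illustrative, and Sina's patched behavior may differ):

    import redis

    # Skip AOF fsync while a background save/rewrite is running, at the cost
    # of possibly losing a few seconds of writes if the server crashes.
    r = redis.Redis(host="localhost", port=6379)
    r.config_set("no-appendfsync-on-rewrite", "yes")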
6) Cost problem: (22 TB of memory in total, 10 TB of it used for counting)
Redisscounter (1.6 billion records in 16 GB of memory): everything switched to integer storage, everything else (strings, etc.) dropped
Redis + SSD (the counterService counting service)
IDs increase sequentially and tables are written in order; once 10 tables are full they are automatically flushed down to SSD
Tiered storage and memory allocation: writing 10 KB and 100 KB objects into the same area causes fragmentation. Sina has optimized this so that waste stays within 5% (which is already quite good!)
5. Problem: distribution issues
1. Config server: namespaces; very large, high-speed access is not well suited to a proxy, because the proxy adds latency, but Sina uses one anyway (single port, Redis Cluster, Sentinel)
The config server is kept on ZooKeeper
At the front is the naming service, then a stateless twemproxy layer (from Twitter, written in C), and then Redis behind it;
2. twemproxy
The application does not need to care about connection failures; the proxy is responsible for reconnecting
The hashing algorithm is placed in the proxy
Whatever is upgraded behind the proxy, the front end does not care, which solves the HA problem
It is stateless, so running more than one proxy is not a problem
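A minimal sketch of how an application talks to Redis through twemproxy (host and port are illustrative): the proxy speaks the Redis protocol, so the client connects to it as if it were a single instance.

    import redis

    proxy = redis.Redis(host="twemproxy.local", port=22121)
    proxy.incr("user:42:follower_count")
    print(proxy.get("user:42:follower_count"))
    # Note: some multi-key and admin commands are not supported through the proxy.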
3. AS --> proxy --> Redis
4. Sina's Redis deployments are standalone; Redis Cluster's interactions are too complex, so it is not used
To do HA you must pair it with monitoring: if a node goes down, what is the follow-up procedure?
The goal is not single-machine performance but cluster throughput, which can support virtually unlimited scale-out;
Experience Summary
- Plan data volume ahead of time to reduce sharding (Internet companies generally plan in units of years)
- Store only refined, essential data (memory is precious!)
- Store data along the user dimension
Object-dimension data must have a life cycle
This is especially necessary when the data volume is particularly large;
- The common progression for exposing a service: IP --> load balancing --> domain name --> naming service (a table of name --> resource (IP + port))
- In terms of hardware consumption, Redis is bound by CPU more than by IO or network; complex data types inevitably cost CPU.
- Sina Weibo's response timeout is currently set to 5 s; keys that return very slowly are recorded for later analysis (slow log; see the sketch after this list);
- Backup data should be exercised regularly to check that the backups are valid;
- Hanging many slaves off one master will definitely affect the master; Sina Weibo currently runs M/S as one master with one slave, mainly for disaster recovery;
When synchronizing, a separate process is forked to sync with the slave, so it does not tie up the query-serving process;
- Upgrade to Linux kernel 2.6.30 or later;
Kernels after 2.6.30 handle soft interrupts much better, and the performance improvement is obvious, roughly a 15% to 30% difference;
- Redis does not need read/write splitting: every request is handled by the same single thread, so there is nothing to gain from separating reads and writes.
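A minimal sketch of the slow-log setup mentioned above (threshold and host are illustrative):

    import redis

    r = redis.Redis(host="localhost", port=6379)
    r.config_set("slowlog-log-slower-than", 10000)   # log commands slower than 10 ms
    r.config_set("slowlog-max-len", 128)
    for entry in r.slowlog_get(10):                  # ten most recent slow commands
        print(entry)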