Redis cluster technology and Codis practices
Preface
This article focuses on Redis cluster technologies and their recent developments; Redis operations and maintenance (O&M) and related topics will be discussed in a separate article later.
In particular, it recommends Codis, the open-source distributed Redis middleware from Wandoujia (the project went open source on GitHub four months ago and has already passed 2,100 stars). Compared with Twemproxy, it offers many exciting new features and supports seamless migration from Twemproxy to Codis.
Okay, let's get started.
1. Common Redis cluster technologies
For a long time, Redis itself has only supported single-instance deployment, with a practical memory ceiling of roughly 10–20 GB per instance. This cannot satisfy the needs of large online business systems, and it also leads to poor resource utilization; after all, servers nowadays commonly have 100–200 GB of memory.
To overcome the capacity limit of a single host, major Internet companies have each rolled their own cluster mechanisms. In these unofficial cluster solutions, data is "sharded" and stored physically across multiple Redis instances; in general, each "shard" is one Redis instance.
There are three main implementation mechanisms, including the officially released Redis Cluster. They are described below, in the hope that this helps with your technology selection.
1.1 Client sharding
This approach puts the sharding logic in the business application itself: the application code accesses multiple Redis instances directly, according to pre-configured routing rules. The advantage is that it does not depend on any third-party distributed middleware; the implementation and code are fully under the team's own control and can be adjusted at any time, so there is no need to worry about stepping into someone else's pitfalls.
This is essentially static sharding: to add or remove Redis instances, the sharding code must be adjusted by hand. Open-source products based on this mechanism are still rare.
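To make the idea concrete, here is a minimal sketch of client-side static sharding in Python. It assumes the redis-py client library and made-up instance addresses; a real system would usually wrap this logic in its own data-access layer.

```python
import hashlib

import redis  # assumes the redis-py client library is installed

# Hypothetical static shard map; in practice this comes from the
# application's own configuration.
SHARDS = [
    redis.Redis(host="10.0.0.1", port=6379),
    redis.Redis(host="10.0.0.2", port=6379),
    redis.Redis(host="10.0.0.3", port=6379),
]

def shard_for(key: str) -> redis.Redis:
    """Pick one of the fixed Redis instances by hashing the key."""
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return SHARDS[digest % len(SHARDS)]

# The business code routes every command itself; adding or removing an
# instance means editing SHARDS and redistributing data by hand.
shard_for("user:42").set("user:42", "alice")
```

The routing function and the shard list live entirely inside the application, which is exactly the strength and the weakness described above.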
This mechanism also performs better than the proxy approach, since there is one less intermediate forwarding hop. The drawbacks are that upgrades are troublesome and the solution depends heavily on individual developers: it needs strong in-house programming capability behind it, and if the original author leaves, the new maintainer may well choose to rewrite it.
As a result, operability is poor: when a fault occurs, both developers and operations staff are needed to locate and fix it, which lengthens the time to recovery.
This approach is hard to operate in a standardized way and is not well suited to small and medium-sized companies (unless they have sufficient DevOps capability).
1.2 Proxy sharding
In this approach, the sharding work is handed over to a dedicated proxy program. The proxy receives data requests from the business application, forwards them to the correct Redis instances according to the routing rules, and returns the results to the application.
Under this mechanism, a third-party proxy (rather than a self-developed one) is generally used. Because there are multiple Redis instances behind it, such a program is also called distributed middleware.
The advantage is that the business application does not need to care about the backend Redis instances at all, which makes operation and maintenance easy. This does introduce some performance loss, but for an in-memory read/write workload such as Redis it is relatively tolerable.
This is the cluster approach we recommend. Twemproxy, an open-source product built on this mechanism, is a representative example and is widely used.
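Because the proxy speaks the Redis protocol, the application connects to it exactly as it would connect to a single Redis instance. A minimal sketch with redis-py, using a hypothetical proxy address and port:

```python
import redis  # assumes the redis-py client library is installed

# The application only knows about the proxy endpoint (hypothetical address);
# the proxy decides which backend Redis instance actually holds each key.
r = redis.Redis(host="redis-proxy.internal", port=22121)

r.set("session:1001", "active")
print(r.get("session:1001"))

# Note that a proxy such as Twemproxy supports only a subset of Redis
# commands (for example, MULTI/EXEC transactions are not available).
```

From the application's point of view nothing changes when the backend instances are reorganized, which is precisely why this approach is easy to operate.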
1.3 Redis Cluster
Under this mechanism there is no central node, which is an important difference from the proxy approach, and everything good and bad about it flows from that fact.
Redis Cluster maps all keys into 16,384 slots, and each Redis instance in the cluster is responsible for a portion of those slots. The business application accesses the cluster through a cluster-aware client; the client may send a request to any instance, and if the requested data does not live there, that instance automatically redirects the client to the instance that does hold it.
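As an illustration of the slot mapping, here is a small sketch that computes a key's slot the way Redis Cluster does: CRC16 of the key (the XMODEM variant) modulo 16384. For simplicity it ignores hash tags ({...}), which the real implementation also honors so that related keys can share a slot.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 (XMODEM variant), the checksum Redis Cluster uses for slot hashing."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    # Ignores hash tags for brevity; a real cluster client hashes only the
    # part between "{" and "}" when such a tag is present.
    return crc16_xmodem(key.encode()) % 16384

print(key_slot("user:42"))  # some slot number between 0 and 16383
```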
Cluster membership information (node name, IP address, port, status, role, and so on) is exchanged and kept up to date through regular pairwise, gossip-style communication between the nodes.
Clearly this is a rather "heavy" solution; it is no longer the simple, reliable single-instance Redis. That may be one reason why it was only released recently, after years of delay.
This recalls a bit of history: because Memcache does not support persistence, someone wrote Membase, later renamed Couchbase, which claims to support auto-rebalancing; several years on, not many companies actually use it.
It is a solution that makes one uneasy. Even Oracle RAC still relies on a small piece of shared storage to settle cluster-management issues such as arbitration, whereas Redis Cluster is completely decentralized...
For now we do not recommend this solution; as far as we know, real production use in online businesses is still rare.
2. Twemproxy and its shortcomings
Twemproxy is a proxy-sharding solution open-sourced by Twitter. Acting as a proxy, it accepts connections from multiple client programs and forwards their requests to the backend Redis servers according to routing rules.
This neatly solves the capacity limit of a single Redis instance. Of course, Twemproxy itself is a single point of failure, so something like Keepalived has to be used to make it highly available.
I think many people should be grateful to Twemproxy: over the past few years it has probably been the most widely deployed, most stable, and most battle-tested piece of distributed Redis middleware. Nevertheless, it has quite a few inconveniences.
The biggest pain point of Twemproxy is that it cannot be scaled up or down smoothly.
This makes life painful for operations staff: when business volume surges, Redis servers need to be added; when it shrinks, they need to be removed. With Twemproxy, either operation is basically impractical (a piercing, maddening kind of pain...).
In other words, Twemproxy amounts to static sharding on the server side. Sometimes, to sidestep a resize forced by a sudden surge in business volume, we are even driven to spin up an entirely new Twemproxy-based Redis cluster.
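A rough way to see why resizing hurts: with naive modulo sharding, adding a single node remaps most keys, and even consistent hashing relocates roughly 1/N of them, yet Twemproxy offers no mechanism to migrate the data that moved. A back-of-the-envelope sketch:

```python
import hashlib

def shard(key: str, n: int) -> int:
    """Naive modulo sharding: hash the key and pick one of n instances."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if shard(k, 4) != shard(k, 5))
print(f"{moved / len(keys):.0%} of keys change shards when going from 4 to 5 nodes")
# Prints roughly 80%. Consistent hashing (e.g. ketama) reduces this to about
# 1/N of the keys, but the data behind those keys still has to be moved by
# hand, which Twemproxy does not help with.
```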
Another pain point of Twemproxy is that it is unfriendly to operate; it does not even have a control panel.
Codis addresses exactly these two pain points of Twemproxy, and it offers many other exciting features besides.