Redis cluster explanation and Codis practice analysis


This article mainly discusses Redis cluster-related technologies and recent developments.

This article focuses on recommending Codis, the open-source Redis distributed middleware from Wandoujia (Pea Pod). The project was open-sourced on GitHub four months ago and currently has more than 2,100 stars. Compared with Twemproxy, it offers many exciting new features and supports seamless migration from Twemproxy to Codis.

The main contents of this article are as follows. Readers already familiar with Redis can skip the first two parts and go straight to the Codis-related content.

1. Common Redis cluster technologies
   1.1 Client-side sharding
   1.2 Proxy sharding
   1.3 Redis Cluster
2. Twemproxy and its deficiencies
3. Codis Practice
   3.1 Architecture
   3.2 Performance comparison test
   3.3 Usage tips and precautions

Okay, we're officially starting.

1. Common Redis cluster technologies

For a long time, Redis itself only supported single instances, and the memory used was generally at most 10-20 GB. This cannot meet the needs of large online business systems, and it also makes resource utilization too low; after all, servers commonly have 100-200 GB of memory.

To solve the problem of insufficient single-machine capacity, the major Internet companies have all stepped in and built "self-service" cluster mechanisms. In these unofficial cluster solutions, data is "sharded" (sharding) and physically stored across multiple Redis instances; in general, each shard is one Redis instance.

The cluster mechanisms fall into three implementation approaches, including the recently released official Redis Cluster. They are described below, in the hope of helping you choose.


1.1 Client-side sharding

This scheme puts the sharding work on the business-program side: the program code distributes access across multiple Redis instances directly, according to predetermined routing rules. The advantage is that it does not rely on third-party distributed middleware; the implementation and code are in your own hands, can be adjusted at any time, and you do not have to worry about stepping into someone else's pitfalls.

This is essentially a static sharding technique: when Redis instances are added or removed, the sharding code has to be adjusted manually, as the sketch below illustrates. Open-source products based on this sharding mechanism are still rare.
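To make the idea concrete, here is a minimal client-side sharding sketch in Java using the Jedis client, assuming a fixed instance list and a simple hash-modulo routing rule; the hosts, ports, and class name are placeholders, not part of any product described here.

```java
import redis.clients.jedis.Jedis;
import java.util.Arrays;
import java.util.List;

// Minimal client-side sharding sketch: pick a Redis instance by hashing the
// key and taking it modulo the instance count. Hosts and ports are placeholders.
public class ClientSideSharding {
    private final List<Jedis> shards = Arrays.asList(
            new Jedis("10.0.0.1", 6379),
            new Jedis("10.0.0.2", 6379),
            new Jedis("10.0.0.3", 6379));

    // Route a key to one shard; changing the shard count remaps most keys,
    // which is why this scheme amounts to static sharding.
    private Jedis shardFor(String key) {
        int h = key.hashCode() & 0x7fffffff; // keep the hash non-negative
        return shards.get(h % shards.size());
    }

    public void set(String key, String value) { shardFor(key).set(key, value); }

    public String get(String key) { return shardFor(key).get(key); }
}
```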

The performance of this sharding mechanism is better than proxy-based sharding (one intermediate forwarding hop is removed). The disadvantages are that upgrades are troublesome and the approach depends heavily on individual developers: you need strong in-house development capability to back it up. If the lead programmer leaves, the new owner may well choose to rewrite it from scratch.

As a result, operability is poor: when a failure occurs, locating and resolving it requires development and operations to work together, so outages last longer.

This scheme is hard to standardize in operations and is not well suited to small and medium-sized companies (unless they have sufficient DevOps capability).


1.2 Proxy sharding

In this scheme, the sharding work is delegated to a dedicated proxy program. The proxy receives data requests from the business programs, distributes them to the correct Redis instances according to routing rules, and returns the results to the business programs.


With this mechanism, a third-party proxy is usually chosen rather than developed in-house. Because the backend consists of multiple Redis instances, this kind of program is also called distributed middleware.

The advantage is that the business program does not need to care about the backend Redis instances, and operations are convenient as well. Although this introduces some performance loss, for an in-memory read/write application like Redis it is relatively tolerable.

This is the cluster implementation approach we recommend. Twemproxy, an open-source product based on this mechanism, is among the most widely used.


1.3 Redis Cluster

In this mechanism there is no central node, which is an important difference from the proxy mode. All the good and bad things about it stem from that.

Redis Cluster maps every key to one of 16,384 slots, and each Redis instance in the cluster is responsible for a portion of those slots. The business program operates through an integrated Redis Cluster client. The client can send a request to any instance; if the required data is not on that instance, the instance redirects the client, which then automatically reads or writes the data on the instance that holds it.
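For reference, the key-to-slot mapping Redis Cluster uses is CRC16(key) mod 16384. A small Java sketch of that calculation follows (it ignores hash tags for brevity; the class and key names are placeholders):

```java
import java.nio.charset.StandardCharsets;

// Sketch of Redis Cluster's key -> slot mapping: CRC16(key) mod 16384.
// Hash tags ({...} sections of a key) are ignored here for brevity.
public final class ClusterSlot {
    // CRC16-CCITT (XMODEM), the variant Redis Cluster specifies.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    public static int slotFor(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        System.out.println(slotFor("user:1001")); // which of the 16,384 slots this key lands in
    }
}
```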

Redis Cluster membership information (node name, IP, port, status, role, and so on) is exchanged and updated periodically through pairwise communication between nodes.

Thus, this is a rather "heavy" solution; it is no longer the simple Redis of the single-instance days. Perhaps that is one of the reasons it was only recently released after years of delay.

This is reminiscent of a bit of history: because Memcached does not support persistence, someone wrote Membase, later renamed Couchbase, which was said to support auto-rebalancing. Several years on, not many companies are using it.

This makes it a worrying solution. To handle cluster-management issues such as quorum arbitration, even Oracle RAC relies on shared storage, whereas Redis Cluster is a completely decentralized design...

This solution is not currently recommended, and judging from the situation so far, there are not many cases of it actually being used in online business.


2. Twemproxy and its deficiencies

Twemproxy is a proxy-sharding mechanism open-sourced by Twitter. Acting as a proxy, Twemproxy can accept access from multiple programs, forward requests to the various backend Redis servers according to routing rules, and then return the results along the original path.

This solution nicely addresses the carrying-capacity problem of a single Redis instance. Of course, Twemproxy itself is a single point of failure, so Keepalived is needed to make it highly available.
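Because the proxy speaks the Redis protocol, a business program connects to it exactly as it would to a single Redis instance. A minimal Jedis sketch, where the hostname and port are placeholders for your own proxy address:

```java
import redis.clients.jedis.Jedis;

// Minimal sketch: the business program talks to Twemproxy (or any
// Redis-protocol proxy) exactly as if it were a single Redis server.
// Host and port below are placeholders.
public class ProxyClientExample {
    public static void main(String[] args) {
        try (Jedis proxy = new Jedis("twemproxy-host", 22121)) {
            proxy.set("user:1001:name", "alice");
            System.out.println(proxy.get("user:1001:name"));
        }
    }
}
```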

I think a lot of people should thank Twemproxy: for all these years it has been the most widely used, most stable, and most battle-tested distributed middleware of its kind. It just has quite a few inconveniences.

Twemproxy's biggest pain point is that it cannot be scaled out or scaled in smoothly.

This makes life very painful for operations engineers: when business volume grows, Redis servers need to be added; when it shrinks, they need to be removed. With Twemproxy, though, this is basically very hard to do (a deep, tangled kind of pain...).

In other words, Twemproxy is more like server-side static sharding. Sometimes, to avoid the scaling demands that come with growing business volume, teams are even forced to stand up a brand-new Twemproxy-based Redis cluster.

Twemproxy's other pain point is that it is operations-unfriendly; it does not even have a control panel.

Codis addresses exactly these two major pain points of Twemproxy, and offers many other exciting features as well.


3. Codis Practice

Codis was open-sourced by Wandoujia (Pea Pod) in November 2014. Developed in Go and C, it is one of the better pieces of open-source software to come out of China recently. It is already widely used in Wandoujia's various Redis business scenarios (as confirmed by @Liu Qi of Wandoujia, hehe).

Judging from three months of various stress tests, its stability meets the requirements for efficient operation. Performance has also improved a lot: it was initially about 20% slower than Twemproxy, and is now nearly 100% faster (conditions: multiple instances, typical value lengths).


3.1 Architecture

Codis introduces the concept of a group: each group consists of one Redis master and at least one Redis slave. This is one of its differences from Twemproxy. The advantage is that if the current master has a problem, the operator can switch over to a slave through the dashboard in "self-service" fashion, without having to carefully modify program configuration files.

To support hot data migration (auto-rebalance), the developers modified the Redis server source code and call the result Codis Server.

Codis uses a pre-sharding mechanism: the key space is divided in advance into 1024 slots (which means there can be at most 1024 Codis Servers on the backend), and the slot-assignment information is stored in ZooKeeper.
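As an aside, Codis's key-to-slot mapping is usually described as CRC32(key) mod 1024; treat the exact hash function as an assumption here and check the Codis source for the authoritative rule. A minimal Java sketch:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Sketch of how a key is commonly described as mapping onto Codis's 1024
// pre-allocated slots: CRC32(key) mod 1024. The exact hash is an assumption
// here; the Codis source is authoritative.
public final class CodisSlot {
    public static long slotFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue() % 1024;
    }

    public static void main(String[] args) {
        System.out.println(slotFor("user:1001")); // slot index in [0, 1023]
    }
}
```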


ZooKeeper also maintains the Codis Server group information and provides services such as distributed locks.


3.2 Performance comparison test

Codis is still being improved in the pursuit of excellence. Its performance has gone from initially being about 20% slower than Twemproxy (which for an in-memory application is not very noticeable) to now far exceeding Twemproxy's performance (under certain conditions).

We tested for up to three months. The tests were based on redis-benchmark, run against Codis and Twemproxy respectively, measuring performance and stability with value lengths ranging from 16 B to 10 MB, across multiple rounds of testing.

A total of four physical servers were involved in the test: one hosted the Codis proxy and Twemproxy, while the other three hosted Codis Servers and Redis servers, forming the two clusters.

From the test results, for SET operations Codis outperforms Twemproxy when the value length is under 888 B (which covers the value-length range of typical business workloads).


For GET operations, Codis consistently outperforms Twemproxy.


3.3 Usage tips and precautions

Codis also has a lot of nice touches, and from actual use there are a few places worth paying attention to.


1) Seamless migration from Twemproxy

The developers have carefully prepared the codis-port tool. With it, you can synchronize the Redis data behind Twemproxy to your Codis cluster in real time. Once synchronization is complete, simply modify the program configuration file and change the Twemproxy address to the Codis address. Yes, that is all it takes.


2) HA support for Java programs

Codis provides a Java client called Jodis (cool name, isn't it?). With it, if a single Codis proxy goes down, Jodis automatically detects this and automatically routes around it, leaving the business unaffected (really cool!).
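A rough sketch of what using Jodis looks like, adapted from its README: a pool is created that watches the list of live proxies registered in ZooKeeper and round-robins across them. The ZooKeeper address, proxy path, and exact builder methods are assumptions that may vary between Jodis versions, so consult the Jodis README for the authoritative API.

```java
import io.codis.jodis.JedisResourcePool;
import io.codis.jodis.RoundRobinJedisPool;
import redis.clients.jedis.Jedis;

// Rough sketch of Jodis usage: the pool tracks the Codis proxies registered
// in ZooKeeper; a proxy that disappears from ZooKeeper is dropped from the
// rotation automatically. The ZooKeeper address and zkProxyDir path are
// placeholders, and builder methods may differ between Jodis versions.
public class JodisExample {
    public static void main(String[] args) {
        JedisResourcePool pool = RoundRobinJedisPool.create()
                .curatorClient("zk-host:2181", 30000)  // ZooKeeper address, session timeout (ms)
                .zkProxyDir("/jodis/codis-demo")       // where codis-proxy registers itself
                .build();
        try (Jedis jedis = pool.getResource()) {
            jedis.set("foo", "bar");
            System.out.println(jedis.get("foo"));
        }
    }
}
```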


3) Pipeline support

Pipelining lets the client issue a batch of requests and obtain all of the results in one go. This expands the room for imagination with Codis.
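As an illustration of the idea (generic Jedis pipelining, not Codis-specific code), here is what a pipelined batch looks like; the proxy host and port are placeholders:

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;
import redis.clients.jedis.Response;

// Illustration of pipelining with Jedis: queue a batch of commands, send them
// in one round trip, then read all the replies. Host and port are placeholders
// for a Codis proxy (or any Redis-protocol endpoint).
public class PipelineExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("codis-proxy-host", 19000)) {
            Pipeline p = jedis.pipelined();
            for (int i = 0; i < 100; i++) {
                p.set("key:" + i, "value-" + i);
            }
            Response<String> last = p.get("key:99");
            p.sync();                        // flush the batch and collect all replies
            System.out.println(last.get());  // "value-99"
        }
    }
}
```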

In actual tests, SET performance rises sharply when the value length is under 888 B.


The same holds for GET performance.


4) Codis is not responsible for master-slave synchronization

In other words, Codis is only responsible for maintaining the current list of Redis Servers; operators themselves must ensure consistency between master and slave data.
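In practice that means replication is set up the standard Redis way, by pointing each slave at its master with SLAVEOF. A minimal Jedis sketch, with placeholder hostnames and ports:

```java
import redis.clients.jedis.Jedis;

// Minimal sketch: since Codis does not manage replication, the operator points
// each slave at its master using the standard Redis SLAVEOF command.
// Hostnames and ports below are placeholders.
public class SetupReplication {
    public static void main(String[] args) {
        try (Jedis slave = new Jedis("codis-server-slave", 6380)) {
            slave.slaveof("codis-server-master", 6379); // begin replicating from the master
        }
    }
}
```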

This is one of the things I admire most about it. The advantage is that Codis does not become overly heavy, and it is one of the reasons we dared to put it into our online environment.


5) Follow-up expectations for Codis

Well, let me mention two. I hope Codis does not become too heavy. In addition, with pipelining enabled, performance falls below Twemproxy's when the value length gets larger; I hope that will be improved (we have results from many rounds of stress testing).
