"Best practices for efficient operation and maintenance" is a boutique column launched by Infoq in 2015, written by Touch technology operations Director Shida, INFOQ Editor-in-chief Ben Linders.
Preface
As the opening article of this column noted, efficient operations covers both management specialization and technical specialization. The previous two articles dealt mainly with the management side; this one turns to the technical side. I hope readers can adjust to the switch, thank you.
The Internet entered the Web 2.0 era several years ago, and the demands on back-end support systems have grown by tens or even hundreds of times. In this evolution, caching systems play a pivotal role.
Operations has evolved to the point where there is no need to reinvent the wheel. For framework optimization and automated operations we can pick the best open source products instead of building everything from scratch (unless you happen to be a dedicated technical geek).
This article mainly discusses Redis clustering technologies and their new developments, together with related Redis operations topics.
In particular, it recommends Codis, the Redis distributed middleware open-sourced by Pea Pod (Wandoujia); the project went open source on GitHub four months ago and currently has more than 2100 stars. Compared with Twemproxy it has many exciting new features, and it supports seamless migration from Twemproxy to Codis.
The main contents are as follows; readers already familiar with Redis can skip the first two parts and go straight to the Codis content.
1. Redis Common Clustering Technology
1.1 Client-side sharding
1.2 Proxy-based sharding
1.3 Redis Cluster
2. Twemproxy and its deficiencies
3. Codis Practice
3.1 Architecture
3.2 Performance Comparison test
3.3 Usage tips and precautions
All right, let's get started.
1. Redis Common Clustering Technology
For a long time, Redis itself supported only single-instance deployment, with memory typically capped at 10~20GB. That cannot meet the needs of large online business systems, and it also leaves resource utilization far too low; after all, servers now commonly carry 100~200GB of memory.
To solve the problem of insufficient single-machine capacity, the major Internet companies took matters into their own hands and built "self-help" clustering mechanisms. In these unofficial cluster solutions, data is "sharded" and stored physically across multiple Redis instances; in general, each shard is one Redis instance.
Counting the officially released Redis Cluster, there are three common clustering mechanisms, described below; I hope this helps with your selection.
1.1 Client-side sharding
This scheme puts the sharding work on the business application side: the program code accesses multiple Redis instances directly according to pre-set routing rules. The advantage is that you do not depend on third-party distributed middleware; the implementation and the code are fully under your own control and can be adjusted at any time, with no fear of stepping into someone else's pitfalls.
This is essentially static sharding: whenever Redis instances are added or removed, the sharding code has to be adjusted by hand. Open source products based on this mechanism are still rare.
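For illustration only, here is a minimal Python sketch of client-side sharding with redis-py; the hosts, ports, and modulo-hash routing rule are assumptions, not any specific product's implementation:

```python
import zlib
import redis

# Client-side sharding sketch: the routing rule lives in application code,
# so adding or removing instances means changing (and redeploying) this code.
# Addresses below are illustrative.
SHARDS = [
    redis.Redis(host="10.0.0.1", port=6379),
    redis.Redis(host="10.0.0.2", port=6379),
    redis.Redis(host="10.0.0.3", port=6379),
]

def shard_for(key: str) -> redis.Redis:
    # Static rule: stable hash of the key, modulo the shard count.
    return SHARDS[zlib.crc32(key.encode()) % len(SHARDS)]

shard_for("user:1001").set("user:1001", "alice")
print(shard_for("user:1001").get("user:1001"))
```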
The performance of this sharding mechanism is better than going through a proxy (one fewer forwarding hop). The downsides are that upgrades are troublesome and there is a strong dependence on individual developers, since solid in-house development capability is needed to support it. If the lead programmer leaves, the new owner may well choose to rewrite the whole thing.
As a result, operability is poor. When a failure occurs, locating and resolving it requires coordination between development and operations, which stretches out the time to recovery.
This scheme makes standardized operations difficult and is less suitable for small and medium-sized companies (unless they have enough DevOps capability).
1.2 Proxy-based sharding
In this scheme, the sharding work is handed over to a dedicated proxy. The proxy receives data requests from the business application, forwards each request to the correct Redis instance according to the routing rules, and returns the result to the application.
With this mechanism a third-party proxy is generally chosen rather than developed in-house; because the backend consists of multiple Redis instances, this kind of program is also called distributed middleware.
The advantage is that the business application does not need to care about the back-end Redis instances at all, which makes operations convenient. The proxy does introduce some performance loss, but for memory-based Redis read/write workloads this is relatively tolerable.
This is the recommended clustering approach. Twemproxy, an open source product based on this mechanism, is among the most widely used.
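To make the contrast with client-side sharding concrete, here is a minimal sketch (addresses illustrative): the application connects to the proxy exactly as it would to a single Redis, and the proxy does all the routing.

```python
import redis

# The application only knows the proxy address; the proxy (Twemproxy, a Codis
# proxy, etc.) forwards each key to the correct back-end Redis instance.
r = redis.Redis(host="redis-proxy.internal", port=22121)  # illustrative address
r.set("session:42", "active")
print(r.get("session:42"))
```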
1.3 Redis Cluster
In this mechanism there is no central node (an important difference from the proxy mode), and all the joys and sorrows that follow stem from that fact.
Redis Cluster maps all keys into 16,384 slots, and each slot is owned by one of the Redis instances in the cluster. The business application operates through a cluster-aware Redis client: it can send a request to any instance, and if the required data is not on that instance, the instance automatically redirects the client to read and write on the instance that does hold it.
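The slot for a key is CRC16(key) mod 16384, and when a hash tag like {user:1001} is present only the text inside the braces is hashed. A minimal Python sketch of just that slot calculation (not a full cluster client):

```python
def crc16_xmodem(data: bytes) -> int:
    # CRC-16/XMODEM (poly 0x1021, init 0), the variant Redis Cluster uses.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: bytes) -> int:
    # If the key has a non-empty {...} hash tag, only that part is hashed.
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key) % 16384

print(key_slot(b"user:1001"))          # some slot in 0..16383
print(key_slot(b"{user:1001}.cart"))   # same slot as any key sharing the tag
```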
Redis Cluster's membership information (node name, IP, port, status, role) is exchanged and kept up to date through regular pairwise (gossip-style) communication between nodes.
You can see that this is a rather "heavy" scheme, no longer the "simple and dependable" single Redis instance. Perhaps that is one reason it was only released recently, after years of delay.
It reminds me of a bit of history: because Memcached does not support persistence, someone wrote Membase, later renamed Couchbase, which was said to support auto-rebalance. Several years on, not many companies are actually using it.
It is a worrying solution. To handle cluster-management issues such as quorum, even Oracle RAC relies on a dedicated area of shared storage, whereas Redis Cluster is completely decentralized...
This scheme is not recommended for now; as far as I can tell, not many real online businesses are using it.
2. Twemproxy and its deficiencies
Twemproxy is a proxy-sharding solution open-sourced by Twitter. Acting as a proxy, it accepts requests from multiple applications, forwards them according to routing rules to the various back-end Redis servers, and then returns the results along the original path.
This neatly solves the capacity limit of a single Redis instance. Of course, Twemproxy itself is a single point, so it needs to be paired with Keepalived for high availability.
I think many people should be grateful to Twemproxy: over the years it has arguably been the most widely used, most stable, and most battle-tested distributed middleware of its kind. It just has quite a few inconveniences.
Twemproxy's biggest pain point is that it cannot scale out or shrink smoothly.
This makes life very painful for the operations folks: when business volume surges you need to add Redis servers, and when business shrinks you need to remove them, but with Twemproxy both are basically very hard to do (a piercing, tangled kind of pain...).
Put another way, Twemproxy is more like server-side static sharding. Sometimes, to sidestep the expansion pressure brought by a sudden surge in business volume, people are even forced to stand up a brand-new Twemproxy-based Redis cluster.
Twemproxy's other pain point is that it is operations-unfriendly; there is not even a control panel.
Codis hits exactly these two big pain points of Twemproxy, and offers many other exciting features besides.
3. Codis Practice
Codis was open-sourced by Pea Pod (Wandoujia) in November 2014. Written in Go and C, it is one of the outstanding open source projects developed in China to emerge recently, and it is now widely used across Pea Pod's various Redis business scenarios (as confirmed by colleagues at Pea Pod, hehe).
Judging from three months of varied stress tests, its stability meets the requirements of efficient operations. Its performance has improved greatly: initially it was about 20% slower than Twemproxy, but it is now nearly 100% faster (conditions: multiple instances, typical value lengths).
3.1 Architecture
Codis introduces the concept of a group: each group consists of one Redis master and at least one Redis slave, which is one of its differences from Twemproxy. The advantage is that if the current master has a problem, the ops person can switch to a slave "self-service" through the dashboard, without having to carefully edit application configuration files.
To support live data migration (auto-rebalance), the authors modified the redis-server source code and call the result codis-server.
Codis uses a pre-sharding mechanism: the key space is divided in advance into 1024 slots (which also means at most 1024 codis-server instances on the back end), and this routing information is stored in ZooKeeper.
ZooKeeper also maintains the codis-server group information and provides services such as distributed locks.
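As a rough illustration of pre-sharding, the sketch below maps a key to one of the 1024 slots (Codis uses a CRC32-based hash for this) and looks up the owning group in a plain dict; in the real system that slot-to-group table lives in ZooKeeper, and the group count used here is an arbitrary assumption.

```python
import zlib

NUM_SLOTS = 1024  # Codis pre-shards the key space into 1024 slots

# Stand-in for the routing table Codis keeps in ZooKeeper:
# slot id -> group id (each group = 1 master + at least 1 slave).
slot_to_group = {slot: slot % 4 for slot in range(NUM_SLOTS)}  # 4 groups, illustrative

def group_for(key: bytes) -> int:
    slot = zlib.crc32(key) % NUM_SLOTS
    return slot_to_group[slot]

print(group_for(b"user:1001"))  # which group should serve this key
```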
3.2 Performance Comparison test
Codis is still being improved. Its performance has gone from initially about 20% slower than Twemproxy (not very noticeable for memory-based applications) to now far exceeding Twemproxy's, under certain conditions.
We tested it over roughly three months. The tests used redis-benchmark against Codis and Twemproxy respectively, checking performance and stability with value lengths from 16B to 10MB across multiple rounds.
Four physical servers took part in total: one hosted the Codis proxy and Twemproxy, and the other three hosted the codis-server and redis-server instances making up the two clusters.
From the test results, for SET operations Codis outperforms Twemproxy when the value length is below 888B (which covers the value-length range of typical services).
For GET operations, Codis consistently outperforms Twemproxy.
3.3 Usage tips and precautions
Codis also has a lot of fun features, and actual use has turned up a few points worth noting.
1) Seamless migration from Twemproxy
The authors thoughtfully provide the codis-port tool, which synchronizes the Redis data behind Twemproxy to your Codis cluster in real time. Once synchronization is complete, just modify the application's configuration file, replacing the Twemproxy address with the Codis address. Yes, that is all there is to it.
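In application code the change really is that small; a hypothetical before/after (addresses and ports are illustrative):

```python
import redis

# Before migration the application pointed at Twemproxy:
# r = redis.Redis(host="twemproxy.internal", port=22121)

# After codis-port has finished synchronizing, the only change is the address,
# which now points at a Codis proxy; the rest of the code stays untouched.
r = redis.Redis(host="codis-proxy.internal", port=19000)
r.set("k", "v")
```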
2) HA support for Java programs
Codis provides a Java client called Jodis (cool name, right?). With it, if a single Codis proxy goes down, Jodis automatically detects this and routes around it, leaving the business unaffected (really cool!).
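Jodis itself is Java, but the idea is easy to sketch in Python: Codis proxies register themselves in ZooKeeper, and the client only ever talks to proxies that are currently registered. The ZooKeeper path and address format below are assumptions for illustration, not Jodis's actual layout.

```python
import random
import redis
from kazoo.client import KazooClient

PROXY_PATH = "/codis/proxies"  # hypothetical path; the real one depends on the product name

zk = KazooClient(hosts="zk1.internal:2181")  # illustrative address
zk.start()

def live_proxies():
    # Each child znode is assumed to hold "host:port" of a live Codis proxy.
    addrs = []
    for child in zk.get_children(PROXY_PATH):
        data, _ = zk.get(f"{PROXY_PATH}/{child}")
        host, port = data.decode().split(":")
        addrs.append((host, int(port)))
    return addrs

# Pick any live proxy; a crashed proxy disappears from ZooKeeper and is skipped.
host, port = random.choice(live_proxies())
redis.Redis(host=host, port=port).set("k", "v")
```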
3) Support Pipeline
Pipelining lets the client issue a batch of requests and fetch all the results in one go, which greatly expands what you can imagine doing with Codis.
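With redis-py, for example, pipelining against a Codis proxy looks like this (address illustrative; transaction=False because this is plain batching, not MULTI/EXEC):

```python
import redis

r = redis.Redis(host="codis-proxy.internal", port=19000)  # illustrative address

pipe = r.pipeline(transaction=False)
for i in range(1000):
    pipe.set(f"key:{i}", i)
results = pipe.execute()  # 1000 commands sent in one batch, replies collected together
```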
In our actual tests, SET performance climbs rapidly when the value length is below 888B.
GET performance behaves the same way.
4) Codis is not responsible for master-slave synchronization
In other words, Codis is only responsible for maintaining the current list of Redis servers; it is up to the ops personnel to keep master and slave data consistent.
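Concretely, replication is set up on the Redis/codis-server instances themselves by the ops team, for example (addresses illustrative):

```python
import redis

# Attach a slave to its group's master; Codis only records who is who,
# it does not drive this replication itself.
slave = redis.Redis(host="10.0.0.12", port=6380)
slave.slaveof("10.0.0.11", 6379)  # equivalent to running SLAVEOF on the slave
print(slave.info("replication")["role"])  # reports "slave" once attached
```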
This is one of the things I admire most: it keeps Codis from becoming too heavy, and it is one of the reasons we dared to put it into our production environment.
5) Follow-up expectations for Codis?
All right, let me mention two. First, I hope Codis does not become too heavy. Second, with pipelining, when values are large, performance is actually lower than Twemproxy's; I hope this can be improved (our multiple rounds of stress tests consistently showed this).