Twemproxy, a Redis proxy from Twitter

Source: Internet
Author: User
Tags: redis, cluster

Many users deploy Redis at scale, across many nodes, and yet from the point of view of the project itself Redis is essentially a single-instance affair.

I have a clear idea about how this project should approach distribution, and under that idea I see no need to invest in a multi-threaded version of Redis: from this point of view, a core is like a computer, so scaling across multiple cores is equivalent to clustering between computers. Multiple instances form a shared-nothing architecture; if we find a sane way to shard, then everything makes sense. :-)

This is why clustering will be the focus of Redis in 2013: now that Redis 2.6 is finally out and shows good stability and maturity, it is time to focus on Redis Cluster, Redis Sentinel, and a few other long-awaited improvements.

However, the reality is that Redis Cluster is not released yet, and a production-ready version is still months of work away. Meanwhile our users already need to shard data across different instances, in order to balance load and, more importantly, to get more memory for their data than a single box can provide.

The de facto solution today is client-side sharding. Client-side sharding has real advantages: there is no middle layer between clients and nodes, which makes it a very scalable setup (scaling is mostly linear). However, a robust implementation requires some care: a way to keep client configurations in sync, and client libraries that support consistent hashing or some other partitioning algorithm.

Then some big news arrived from Twitter, which runs one of the largest Redis deployments in the world, serving user timelines. So it is no surprise that the project discussed in this article comes from Twitter's open source efforts.

Twemproxy
---

Twemproxy is a fast single-threaded proxy that supports the memcached ASCII protocol and, more recently, the Redis protocol:


It is written entirely in C and is licensed under the Apache 2.0 License.
The project works on Linux but, as far as I know, cannot be compiled on OS X because it relies on the epoll API.
My test environment was Ubuntu 12.04 desktop.


OK, enough chatter. What does twemproxy do? (Note: I will focus on the Redis side, but the project can do the same things for memcached.)


1) Acts as a proxy between clients and many Redis instances.
2) Automatically shards data among the configured Redis instances.
3) Supports consistent hashing, with different strategies and hashing functions.
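To make point 3 concrete, here is a minimal sketch of a ketama-style consistent hashing ring, written in Python purely as an illustration (twemproxy's actual implementation is in C, and the virtual-node count and MD5-based point placement below are illustrative assumptions, not its exact parameters). Each server is mapped to many points on a ring, and a key is routed to the first server point at or after its hash:

```python
import bisect
import hashlib

class Ring:
    """Toy ketama-style consistent hashing ring (illustrative only)."""

    def __init__(self, servers, points_per_server=160):
        self._points = []  # sorted list of (hash, server) ring points
        for server in servers:
            for i in range(points_per_server):
                self._points.append((self._hash(f"{server}-{i}"), server))
        self._points.sort()

    @staticmethod
    def _hash(value):
        # 8 bytes of MD5, interpreted as an integer position on the ring
        return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

    def node_for(self, key):
        # First ring point at or after the key's hash, wrapping around
        i = bisect.bisect(self._points, (self._hash(key),))
        if i == len(self._points):
            i = 0
        return self._points[i][1]

servers = ["127.0.0.1:6379", "127.0.0.1:6380", "127.0.0.1:6381"]
ring = Ring(servers)
placement = {f"key:{n}": ring.node_for(f"key:{n}") for n in range(100)}

# The consistent-hashing property: removing one server only remaps the
# keys that lived on it; every other key keeps its server.
smaller = Ring(servers[:-1])
moved = [k for k, s in placement.items()
         if s != servers[-1] and smaller.node_for(k) != s]
```

The point of the ring structure is exactly what `moved` demonstrates: when a server disappears, only the keys it owned get reassigned, instead of nearly every key changing owner as with naive modulo hashing.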

A great feature of Twemproxy is that it can be configured to eject a node when it fails, retrying it (and reconnecting) after some time, or alternatively to stick strictly to the key-to-server mapping written in the configuration file. This means Twemproxy works both with Redis as a data store (where node ejection cannot be tolerated, since keys would be remapped) and with Redis as a cache, where enabling node ejection buys you simple (which means: limited, not low quality) high availability.

The bottom line is this: if you can tolerate misses, then when a node fails your data may end up on other nodes, so consistency will be weak. On the other hand, if you cannot tolerate misses, you need a solution where every instance is itself highly available, for example Redis monitored by something providing automatic failover.

Installation
---


Before diving deeper into the project's features, here is some good news: it is very easy to build on Linux. OK, not quite as trivial as Redis, but... you just need to follow these steps:
apt-get install automake
apt-get install libtool
git clone git://github.com/twitter/twemproxy.git
cd twemproxy
autoreconf -fvi
./configure --enable-debug=log
make
src/nutcracker -h


Configuration is also very simple: the documentation on the project's GitHub page is enough to get a smooth first experience. I used the following configuration:
redis1:
  listen: 0.0.0.0:9999
  redis: true
  hash: fnv1a_64
  distribution: ketama
  auto_eject_hosts: true
  timeout: 400
  server_retry_timeout: 2000
  server_failure_limit: 1
  servers:
   - 127.0.0.1:6379:1
   - 127.0.0.1:6380:1
   - 127.0.0.1:6381:1
   - 127.0.0.1:6382:1


redis2:
  listen: 0.0.0.0:10000
  redis: true
  hash: fnv1a_64
  distribution: ketama
  auto_eject_hosts: false
  timeout: 400
  servers:
   - 127.0.0.1:6379:1
   - 127.0.0.1:6380:1
   - 127.0.0.1:6381:1
   - 127.0.0.1:6382:1


The first pool is configured to automatically eject nodes on failure, while the second uses a static mapping across all the instances.
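The fnv1a_64 hash named in both pools is simple enough to sketch in a few lines. This is the standard FNV-1a 64-bit algorithm (offset basis and prime are the published FNV constants), shown here in Python purely to illustrate what gets computed for each key before the ring lookup:

```python
FNV64_OFFSET = 0xcbf29ce484222325  # FNV-1a 64-bit offset basis
FNV64_PRIME = 0x100000001b3        # FNV-1a 64-bit prime

def fnv1a_64(data: bytes) -> int:
    """Standard FNV-1a 64-bit: XOR each byte in, then multiply by the prime."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF  # keep 64 bits
    return h

# Nearby keys still produce very different hashes, which is what
# spreads them across the ring.
h1 = fnv1a_64(b"user:1000")
h2 = fnv1a_64(b"user:1001")
```

The resulting 64-bit value is what the `distribution` strategy (here, ketama) maps onto a server.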


Interestingly, you can run multiple setups at the same time against the same set of servers. However, in a production environment it is more appropriate to use multiple instances in order to exploit multiple cores.

Single point of failure?
---


There is another interesting point: this deployment does not have to mean a single point of failure. By running multiple twemproxy instances, you can have your clients connect to the first one that is available.
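That client-side behavior can be sketched as follows (a hypothetical helper, not part of any client library): try each proxy address in order and use the first one that responds. The probe is injected as a callable so the logic is shown without real servers; a real client would attempt a TCP connect with a short timeout.

```python
def first_available(proxies, probe):
    """Return the first proxy address for which probe(addr) succeeds.

    `probe` is any callable returning True when the proxy is reachable,
    e.g. a TCP connect with a short timeout in a real client.
    """
    for addr in proxies:
        if probe(addr):
            return addr
    raise ConnectionError("no twemproxy instance is reachable")

proxies = ["10.0.0.1:9999", "10.0.0.2:9999", "10.0.0.3:9999"]
# Simulate the first proxy being down:
up = {"10.0.0.2:9999", "10.0.0.3:9999"}
chosen = first_available(proxies, probe=lambda addr: addr in up)
```

Since every twemproxy instance shards identically from the same configuration, it does not matter which one the client ends up talking to.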


By using twemproxy you essentially separate the sharding logic from the client: any basic client will do, and sharding is handled entirely by the proxy.
This is a straightforward and safe approach, in my opinion.
Since Redis Cluster is still immature, twemproxy is a good way for most users who want a Redis cluster today to get one. But don't get too excited; first, look at the limitations of this approach ;)

Shortcomings
---
In my view, Twemproxy got it right by not supporting multi-key commands and transactions. As far as I know it is currently even stricter than Redis Cluster, which will instead allow multi-key operations involving the same key.
But IMHO this is the right way: distributed clusters should deliver distributed efficiency, and should impose this discipline on early adopters. Aggregating data from a large number of instances only to extract a small "useful" result is expensive, and you would soon run into serious load problems because too much time is spent simply moving data around.

However, a few multi-key commands are still supported: MGET and DEL are handled very well. Interestingly, an MGET is split into requests to the different servers and the replies are merged into a single one. That is pretty cool, even if for now I don't get optimal performance out of it (see later).
In any case, the fact that multi-key commands and transactions are only partially supported means that Twemproxy is not for everyone, exactly like Redis Cluster itself. In particular, it apparently does not support EVAL (and I think they should support it! EVAL is general, and it is designed to work behind a proxy because key names are explicit in the call).
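The MGET fan-out described above can be sketched like this (Python, with plain dicts standing in for the Redis instances): split the requested keys by the shard that owns them, query each shard, then reassemble the replies in the original key order, which is exactly the contract MGET must preserve.

```python
def proxied_mget(keys, shards, node_for):
    """Fan an MGET out to shards and reassemble replies in request order.

    `shards` maps a node name to its key/value store (a dict here);
    `node_for` is the sharding function (e.g. the consistent hash).
    """
    # 1) Group the requested keys by the node that owns them.
    by_node = {}
    for key in keys:
        by_node.setdefault(node_for(key), []).append(key)

    # 2) Issue one lookup batch per node (plain dict reads here).
    replies = {}
    for node, node_keys in by_node.items():
        store = shards[node]
        for key in node_keys:
            replies[key] = store.get(key)  # None for a miss, like Redis nil

    # 3) Reassemble in the order the client asked for.
    return [replies[key] for key in keys]

shards = {"A": {"x": "1", "z": "3"}, "B": {"y": "2"}}
node_for = lambda key: "B" if key == "y" else "A"
result = proxied_mget(["x", "y", "z", "missing"], shards, node_for)
```

Step 2 is where a real proxy can win or lose: the per-node batches can be issued concurrently, and the client only waits for the slowest shard.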

Things to improve
---
Error reporting is not always reliable: sending a command that is not supported causes the connection to be closed. For example, sending just a 'GET' without arguments from redis-cli does not return an error about the wrong number of arguments; it simply leaves the connection hanging.
Other errors returned by the server, however, are passed through to the client correctly:

redis metal:10000> get list
(error) WRONGTYPE Operation against a key holding the wrong kind of value
Another feature I would like to see is support for automatic failover. There are several alternatives:
1) twemproxy already monitors per-instance errors and can eject a node once enough failures are detected. Unfortunately, twemproxy cannot promote a slave as the replacement, for example by sending it a SLAVEOF NO ONE command to turn it into a master, instead of simply ejecting the failing node. That would turn twemproxy into a real high-availability solution.
2) Alternatively, twemproxy could cooperate with Redis Sentinel, checking the Sentinel configuration regularly and updating its server table when a failover happens.
3) Another alternative is to provide a way to hot-reconfigure twemproxy, so that once a node fails, Redis Sentinel can switch the proxy configuration ASAP.
There are many alternatives; in general, even low-level support for HA (high availability) would be great.

Performance
---
Twemproxy is fast, really fast: nearly as fast as talking to Redis directly. I would bet you lose at most 20% of the performance.
My only performance complaint is that IMHO there could be a more efficient way to distribute MGET commands among the instances.
If twemproxy's latency to all the Redis instances is similar (very likely), and the MGETs are issued at the same time, then twemproxy will most likely receive the replies from all the nodes at about the same time. So what I would expect is that the MGET throughput I measure against all instances is about the same as against a single instance; in reality I only get about 50% of the MGETs per second when running against all instances. Maybe it is time to refactor twemproxy's reply-handling module.

Conclusions
---
It is a great project. Since Redis Cluster is not out yet, I strongly suggest that Redis users with sharding needs give Twemproxy a try.
I am planning to link to it from the Redis project website, because I think the Twitter folks made a real contribution to Redis with this project, so...
Twitter deserves the credit!
