Codis was open sourced recently, and the response on GitHub has been very good: it collected more than 1000 stars in 3 days, which surprised me. It also shows, indirectly, that distributed caching really is a problem everyone runs into. So in this and the next few blog posts I'm going to explain the design of Codis and some of the considerations behind it in detail, along with some thoughts on distributed storage (especially caching) systems.
Why proxy?
Codis adopts a proxy-based architecture instead of following the official Redis Cluster road. The official Cluster implementation is a peer-to-peer model: it relies on the gossip protocol for message synchronization, divides the data into a number of slots as the unit of management, and requires the client to be modified. The benefits of this model are:
- Truly decentralized, with no central node
- From the client's point of view, request performance does not suffer much
But the drawbacks are equally obvious:
- The cluster state is hard to pin down; it is difficult to know exactly what state the cluster is in at any given moment
- For Redis specifically, it is hard to upgrade, because the distributed logic is bound to the storage engine
- It relies on a smart client (sketched below)
These drawbacks exist in almost any distributed system built on the peer model, and because of the first one, development and debugging are also very painful (it took the official Cluster almost three years to become reasonably stable).
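To make the smart-client point concrete, here is a minimal sketch (not the official client code) of the routing work Redis Cluster pushes onto every client: each key maps to one of 16384 hash slots via CRC16 (the `{...}` hash-tag rule is omitted for brevity), and the client has to compute this itself and keep a slot-to-node map in sync with the cluster.

```go
package main

import "fmt"

// crc16 is the CRC16-CCITT (XMODEM) checksum Redis Cluster uses for key hashing.
func crc16(data []byte) uint16 {
	var crc uint16
	for _, b := range data {
		crc ^= uint16(b) << 8
		for i := 0; i < 8; i++ {
			if crc&0x8000 != 0 {
				crc = crc<<1 ^ 0x1021
			} else {
				crc <<= 1
			}
		}
	}
	return crc
}

// slotOf returns the hash slot (0..16383) a key belongs to; a smart client
// must also track which node currently owns each slot to route the request.
func slotOf(key string) int {
	return int(crc16([]byte(key))) % 16384
}

func main() {
	fmt.Println(slotOf("user:42"))
}
```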
The benefits of the proxy-based approach are more obvious:
- Low development costs
- Low switching cost for the business side
- The distributed logic in the proxy is isolated from the storage logic
So before Codis, Twemproxy was the best choice in this category. It is very widely used and many large Internet companies rely on it, but Twemproxy has its own problems, the biggest being that Twemproxy really is just a proxy: it offers no cluster functionality at all. And it looks like Twitter is no longer going to maintain it.
Twitter's latest talk, which covers some of the practices and ideas inside Twitter about scaling Redis, speaks highly of the proxy approach (and also mentions that they no longer use Twemproxy...). The reasons are fairly clear, and those who are interested can go read it. Similarly, Facebook's earlier paper on scaling memcached mentions a similar scheme (mcrouter). The most important idea behind this approach, I think, is the separation of storage from the distributed logic. As for the performance lost to forwarding requests, it can be made up in other ways, for example by scaling the proxies horizontally. The benefit is that the state of the whole system is very clear, almost every component can be deployed and upgraded independently, and the code is also easier to write, which is why Codis firmly took the proxy path from the very beginning.
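To illustrate the separation, here is a minimal sketch of a forwarding proxy, with placeholder addresses and a deliberately trivial pick() policy (a real proxy like Codis parses each command and hashes its key): clients speak the plain Redis protocol to the proxy, the backends are unmodified Redis instances, and all the distributed logic lives in the proxy layer, which holds no data and can therefore be replicated horizontally.

```go
package main

import (
	"io"
	"log"
	"net"
)

// Placeholder backend shards; an unmodified Redis instance listens on each.
var backends = []string{"127.0.0.1:6380", "127.0.0.1:6381"}

// pick chooses a backend for a connection. A real proxy would parse each
// command and hash its key; this placeholder just uses the first shard.
func pick(net.Conn) string { return backends[0] }

func serve(client net.Conn) {
	defer client.Close()
	backend, err := net.Dial("tcp", pick(client))
	if err != nil {
		log.Println("dial backend:", err)
		return
	}
	defer backend.Close()
	go io.Copy(backend, client) // requests: client -> backend
	io.Copy(client, backend)    // replies:  backend -> client
}

func main() {
	ln, err := net.Listen("tcp", ":19000") // clients connect here as if it were Redis
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			continue
		}
		go serve(conn)
	}
}
```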
Compared with Twemproxy, however, Codis makes some improvements. First, it integrates the cluster functionality, using pre-sharding to spread the data across slots. All cluster state is synchronized through ZooKeeper, and every proxy is stateless, which makes it possible to scale out to multiple proxies.
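A minimal sketch of that pre-sharded routing, with illustrative group addresses and hash function (Codis pre-shards into 1024 slots; the CRC32 used below just stands in for the key hash): a slot-to-server-group table, kept in ZooKeeper and watched by every stateless proxy, decides which Redis group serves each slot.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

const numSlots = 1024 // Codis pre-shards the key space into 1024 slots

// slotTable maps slot id -> server group id. In Codis this mapping lives in
// ZooKeeper and every proxy watches it, so all stateless proxies share one view.
var slotTable [numSlots]int

// groupAddr maps a server group id to its master address (placeholder data).
var groupAddr = map[int]string{
	1: "10.0.0.1:6379",
	2: "10.0.0.2:6379",
}

// slotOf hashes a key into a slot; CRC32 is used here for illustration.
func slotOf(key string) int {
	return int(crc32.ChecksumIEEE([]byte(key))) % numSlots
}

// routeKey resolves the Redis address that currently owns the key's slot.
func routeKey(key string) string {
	return groupAddr[slotTable[slotOf(key)]]
}

func main() {
	// Pretend the first half of the slots belong to group 1, the rest to group 2.
	for i := range slotTable {
		slotTable[i] = 1
		if i >= numSlots/2 {
			slotTable[i] = 2
		}
	}
	fmt.Printf("key %q -> slot %d -> %s\n", "user:42", slotOf("user:42"), routeKey("user:42"))
}
```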
Another important decision was to use Go as the main development language. Faith issues aside (@goroutine and I are Go fanboys), Go is almost tailor-made for this kind of backend network program, and it brought great efficiency to the development work: from the first line of code to the first usable version took less than a month.
Of course, using multiple proxies raises some data-consistency issues during data migration; the next blog post will describe how Codis solves them.