Open-Source Service Discovery: Zookeeper, Doozer, and Etcd
This article is based on Jason Wilder's blog post about common service discovery projects (Zookeeper, Doozer, Etcd). The original post is here: Open-Source Service Discovery.
Service discovery is a core component of most distributed systems and service-oriented architectures (SOA). The problem can be stated simply: when a service exists on multiple host nodes, how does a client determine the correct IP address and port to reach it?
Traditionally, when a service runs on multiple hosts, its locations are registered through static configuration. But as a large system deploys more and more services, this becomes much harder to manage. In a live system, service locations may change frequently due to automatic or manual scaling, deployment of new services, and hosts going down or being replaced.
In such a scenario, dynamic service registration and discovery become essential to avoid unnecessary service interruptions.
Service discovery has been discussed many times and is indeed still evolving. Here I introduce some open-source and frequently discussed solutions in this space and try to understand how they actually work. In particular, we will focus on each solution's consistency algorithm (strong or weak consistency), its runtime dependencies, its options for client integration, and finally the trade-offs among these features.
This article starts with several strongly consistent projects, Zookeeper, Doozer, and Etcd, which are used mainly for coordination between services and also work as service registries.
It then looks at some interesting projects built specifically for service registration and discovery: Airbnb's SmartStack, Netflix's Eureka, Bitly's NSQ, Serf, Spotify and DNS, and finally SkyDNS.
Problem description
When locating a service, there are really two problems: service registration and service discovery.
- Service Registration - the process by which a service registers its location with a central registry node. A service usually registers its host IP address and port number, and sometimes also authentication credentials, protocol, version number, and environment details.
- Service Discovery - the process by which a client application instance queries the central registry to learn the location of a service.
Service registration and discovery for any service also raise a number of development and operational questions:
- Monitoring - what happens when a registered service fails? In some cases it is deregistered by another process at the central registry after a configured timeout. In that case, services usually need a heartbeat mechanism to prove their own liveness, and clients must be able to handle failed services reliably.
- Load Balancing - if multiple equivalent instances of a service are registered, how is the request load from all clients balanced across them? If there is a master node, can clients correctly determine which node to send requests to? (A minimal sketch of client-side balancing and failover follows this list.)
- Integration - does the registry only support certain language bindings, for example only Java? Does integration require embedding registration and discovery code directly into your application, or can a companion helper process be used instead?
- Runtime Dependencies - does it require the JVM, Ruby, or some other runtime that does not fit your environment?
- Availability Concerns - does the system keep working if one node is lost? Can it be updated or upgraded live without taking the whole system down? Since the registry is a central piece of the architecture, is it a single point of failure?
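To illustrate the load-balancing and failure-handling burden that often falls on the client, here is a minimal, purely hypothetical sketch in Go: it picks registered instances in random order and fails over to the next one when a connection attempt fails. The registry lookup is stubbed out, and every name and address here is made up.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"net/http"
)

// lookup stands in for a registry query; in a real system this list would
// come from Zookeeper, Doozer, Etcd, or another registry.
func lookup(service string) []string {
	return []string{"10.0.1.5:8080", "10.0.1.6:8080", "10.0.1.7:8080"}
}

// call balances load by trying instances in a random order and fails over
// to the remaining ones when a connection attempt errors out.
func call(service, path string) (*http.Response, error) {
	instances := lookup(service)
	for _, i := range rand.Perm(len(instances)) {
		resp, err := http.Get("http://" + instances[i] + path)
		if err == nil {
			return resp, nil
		}
		fmt.Println("instance failed, trying next:", instances[i])
	}
	return nil, errors.New("all instances of " + service + " failed")
}

func main() {
	resp, err := call("web", "/health")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```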
Registries
The first three registries introduced below all use strongly consistent protocols; in effect, each is a consistent data store applied to a common purpose. Although we treat them as service registries here, they can also be used to coordinate services, for example for leader election or for centralized locking across a distributed set of clients.
Zookeeper
Zookeeper is a centralized service for maintaining configuration information and naming, and for providing distributed synchronization and group services. Zookeeper is written in Java, is strongly consistent (CP), and uses the Zab protocol to coordinate changes across the ensemble (cluster).
Zookeeper is typically run with three, five, or seven members in the ensemble. Clients use a language-specific binding to access the ensemble, and that access is typically embedded into the client applications and services themselves.
Service registration is implemented with ephemeral nodes under a namespace. An ephemeral node exists only as long as the client that created it stays connected. When a service starts on a node, a background process registers the instance's location under the namespace. If the service fails or loses its connection, the ephemeral node disappears from the tree.
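A minimal registration sketch, assuming the community Go client (github.com/go-zookeeper/zk); the namespace layout, addresses, and payload are made up, and the parent path is assumed to already exist:

```go
package main

import (
	"fmt"
	"time"

	"github.com/go-zookeeper/zk"
)

func main() {
	// Connect to the ensemble; the session timeout controls how quickly
	// ephemeral nodes vanish after a client dies.
	conn, _, err := zk.Connect([]string{"zk1:2181", "zk2:2181", "zk3:2181"}, 5*time.Second)
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Register this instance as an ephemeral (and sequential) node under a
	// hypothetical service namespace. Zookeeper deletes the node
	// automatically when the session expires.
	path, err := conn.Create(
		"/services/web/instance-", // parent /services/web must exist
		[]byte("10.0.1.5:8080"),   // host:port payload
		zk.FlagEphemeral|zk.FlagSequence,
		zk.WorldACL(zk.PermAll),
	)
	if err != nil {
		panic(err)
	}
	fmt.Println("registered at", path)
}
```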
Service discovery is accomplished by listing and watching the namespace of a particular service. The client receives all of the currently registered services and is notified when a service becomes unavailable or a new one is added. Note that the client must handle all load balancing and failover work itself.
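The discovery side, with the same assumed Go client and hypothetical path. Zookeeper watches fire only once, so the loop re-arms the watch after each notification:

```go
package main

import (
	"fmt"
	"time"

	"github.com/go-zookeeper/zk"
)

func main() {
	conn, _, err := zk.Connect([]string{"zk1:2181"}, 5*time.Second)
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	for {
		// List the current children and set a one-shot watch on the namespace.
		children, _, events, err := conn.ChildrenW("/services/web")
		if err != nil {
			panic(err)
		}
		fmt.Println("live instances:", children) // the client balances load across these itself
		<-events                                 // blocks until the set of children changes
	}
}
```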
Zookeeper's API can be awkward to use, and the language bindings may introduce subtle differences. If you use a JVM-based language, the Curator Service Discovery Extension may be helpful.
Because Zookeeper is a strongly consistent CP system, when a network partition occurs, part of your system may become unable to register, or unable to find existing registrations, even if the services themselves keep working normally for the duration of the partition. Specifically, on any non-quorum side of the partition, any read or write operation returns an error.
Doozer
Doozer is a consistent distributed data store, written in Go, that uses the Paxos algorithm to maintain consensus. The project has been around for several years but has stagnated for some time, and it now has close to 160 forks. Unfortunately, this makes it hard to know the project's actual state and whether it is suitable for production use.
Doozer runs with 3, 5, or 7 nodes in a cluster. As with Zookeeper, clients use language-specific bindings in their own applications and services to access the cluster.
Doozer's service registration is not as direct as Zookeeper's, because Doozer has no concept of ephemeral nodes. A service can register itself under a path, but if it becomes unavailable it will not be removed automatically.
There are several ways to address this. One option is to add a timestamp and heartbeat mechanism to the registration process, and then handle expired paths (i.e., stale registration entries) during service discovery; alternatively, a separate cleanup process can do it.
Service discovery is similar to Zookeeper: Doozer can list all the entries under a given path and then wait for any changes to that path. If a timestamp and heartbeat are used during registration, expired entries can be ignored or deleted during discovery.
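A sketch of the timestamp-plus-heartbeat pattern in Go. This is written against a hypothetical minimal KV interface, not the actual Doozer client API (whose signatures differ); only the pattern itself is the point.

```go
package doozerpattern

import (
	"strconv"
	"strings"
	"time"
)

// KV is a hypothetical stand-in for a Doozer client binding.
type KV interface {
	Set(path string, value []byte) error
	Getdir(dir string) ([]string, error)
	Get(path string) ([]byte, error)
}

const ttl = 15 * time.Second

// register keeps rewriting "addr|unix-timestamp" as a heartbeat, because
// Doozer has no ephemeral nodes that would expire the entry for us.
func register(kv KV, path, addr string) {
	for {
		ts := strconv.FormatInt(time.Now().Unix(), 10)
		kv.Set(path, []byte(addr+"|"+ts))
		time.Sleep(ttl / 3) // refresh well inside the TTL window
	}
}

// discover lists entries under a service directory and filters out any
// whose heartbeat timestamp is older than the TTL (a stale registration).
func discover(kv KV, dir string) []string {
	names, _ := kv.Getdir(dir)
	var live []string
	for _, name := range names {
		data, err := kv.Get(dir + "/" + name)
		if err != nil {
			continue
		}
		parts := strings.SplitN(string(data), "|", 2)
		if len(parts) != 2 {
			continue
		}
		ts, err := strconv.ParseInt(parts[1], 10, 64)
		if err != nil || time.Since(time.Unix(ts, 0)) > ttl {
			continue // expired entry: ignore it (or delete it here)
		}
		live = append(live, parts[0])
	}
	return live
}
```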
Like Zookeeper, Doozer is a strongly consistent CP system, with the same consequences when a network partition occurs.
Etcd
Etcd is a highly available key-value store, used mainly for shared configuration and service discovery. Etcd was inspired by Zookeeper and Doozer. It is written in Go, uses the Raft algorithm for consensus, and exposes an HTTP+JSON API.
Etcd, like Doozer and Zookeeper, is usually run with 3, 5, or 7 nodes in a cluster. Clients can use a language-specific binding or a plain HTTP client.
Service registration relies mainly on attaching a TTL to a key and tying that TTL to the service's heartbeat: the key stays available only as long as the heartbeat keeps refreshing it. If a service fails to refresh its key's TTL, Etcd expires the key and the service drops out of the registry. If a service becomes unavailable, clients need to handle the connection failure themselves and try another service instance.
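A minimal registration sketch against the etcd v2 HTTP API (`/v2/keys`), using only Go's standard library. The etcd address, service path, and instance address are assumptions for illustration; the loop refreshes the TTL as a heartbeat.

```go
package main

import (
	"net/http"
	"net/url"
	"strings"
	"time"
)

func main() {
	// Hypothetical registration key on a local etcd node.
	key := "http://127.0.0.1:4001/v2/keys/services/web/inst1"

	for {
		// PUT the key with a TTL; if the heartbeat stops, etcd expires
		// the key and the instance drops out of discovery.
		form := url.Values{"value": {"10.0.1.5:8080"}, "ttl": {"30"}}
		req, err := http.NewRequest("PUT", key, strings.NewReader(form.Encode()))
		if err != nil {
			panic(err)
		}
		req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
		resp, err := http.DefaultClient.Do(req)
		if err == nil {
			resp.Body.Close()
		}
		time.Sleep(10 * time.Second) // refresh well inside the 30s TTL
	}
}
```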
For service discovery, a client lists all the keys under a directory and then waits for changes to that directory. Since the API is HTTP-based, the client application keeps a long-polling connection open to the Etcd cluster.
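The discovery side against the same v2 API: a plain GET lists the directory, and `?wait=true` turns the request into a long poll that returns when something under the directory changes. Same assumed addresses and paths as above.

```go
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
)

func main() {
	base := "http://127.0.0.1:4001/v2/keys/services/web"

	for {
		// List current registrations under the (hypothetical) service directory.
		resp, err := http.Get(base + "?recursive=true")
		if err != nil {
			panic(err)
		}
		body, _ := ioutil.ReadAll(resp.Body)
		resp.Body.Close()
		fmt.Println("current entries:", string(body)) // JSON: instance addresses live in node.nodes

		// Long-poll until something under the directory changes, then re-list.
		resp, err = http.Get(base + "?wait=true&recursive=true")
		if err != nil {
			panic(err)
		}
		resp.Body.Close()
	}
}
```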
Because Etcd uses the Raft consensus protocol, it should be a strongly consistent system. Raft requires a leader to be elected, and all client requests are handled by the leader. However, Etcd also seems to support reads from non-leaders, via an undocumented consistency parameter that would improve read availability. During a network partition, writes would still have to be handled by the leader, and so could fail.
Please credit the source when reprinting.
This document reflects my own understanding of the original post, and there are surely flaws and mistakes in places. I hope this article is helpful to people interested in service discovery. If you have better ideas or suggestions, please contact me.
My mailbox: shlallen@zju.edu.cn
Sina Weibo: @lianzifu ruqing