Etcd: a comprehensive explanation from application scenarios to implementation principles


As projects such as CoreOS and Kubernetes grow increasingly popular in the open-source community, etcd, the highly available and consistent service-discovery and storage component they rely on, is gradually gaining attention from developers. In the cloud computing era, it matters more than ever how services can join a computing cluster quickly and transparently, and how every machine in a cluster can quickly discover shared configuration information; building a service cluster that is highly available, secure, easy to deploy, and fast to respond has become an urgent problem. Etcd was designed to solve exactly these problems. This article begins with etcd's application scenarios and then explains etcd's implementation in depth, so that developers can fully enjoy the convenience etcd brings.

Typical application scenarios

What is etcd? Many people's first answer may be "a key-value store", overlooking the second half of the official definition: it is built for configuration sharing and service discovery.

A highly-available key value store for shared configuration and service discovery.

In fact, as a project inspired by ZooKeeper and Doozer, etcd not only offers similar features but also focuses on the following four points.

  • Simple: HTTP- and JSON-based APIs make it easy to use with curl.
  • Secure: optional SSL client certificate authentication.
  • Fast: each instance supports one thousand write operations per second.
  • Reliable: the Raft algorithm is used to achieve full distributed consistency.

With the continuous development of cloud computing, more and more attention is being paid to the problems of distributed systems. Inspired by the Alibaba middleware team's overview of typical ZooKeeper application scenarios, I have summarized some typical use cases of etcd based on my own understanding. Let's look at how etcd, a distributed storage repository based on the strongly consistent Raft algorithm, can help us.

It is worth noting that data in distributed systems is divided into control data and application data. Etcd's default scenario is processing control data; storing application data is recommended only when the volume is small but updates and accesses are frequent.

Scenario 1: Service Discovery

Service discovery is one of the most common problems in distributed systems: how can processes or services in the same distributed cluster locate and connect to each other? In essence, service discovery means knowing whether any process in the cluster is listening on a TCP or UDP port, and being able to look it up and connect to it by name. Solving service discovery requires three pillars.

  1. A highly consistent and highly available service storage directory. The Raft-based etcd is inherently such a directory.
  2. A mechanism for registering services and monitoring their health. A service can register itself in etcd with a TTL on its key and keep the registration alive with periodic heartbeats, which also serve as a health check.
  3. A mechanism for finding and connecting to services. Services registered under a topic specified in etcd can be found under the corresponding topic. To ensure connectivity, a proxy-mode etcd can be deployed on each service machine, so that any service able to reach the etcd cluster can connect to the others.
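The register-with-TTL-and-heartbeat pattern in pillars 1 and 2 can be sketched with a minimal in-memory registry. This is an illustrative simulation, not etcd's API: in a real deployment `register` and `heartbeat` would be etcd write requests that set or refresh a key's TTL, and `lookup` would read a directory.

```python
import time

class ServiceRegistry:
    """In-memory stand-in for an etcd-backed service directory (illustrative)."""

    def __init__(self):
        self._entries = {}  # key -> (value, expiry timestamp)

    def register(self, key, value, ttl):
        """Register a service under `key`; the entry expires after `ttl` seconds."""
        self._entries[key] = (value, time.time() + ttl)

    def heartbeat(self, key, ttl):
        """Refresh the TTL, as a service's periodic keep-alive would in etcd."""
        if key in self._entries:
            value, _ = self._entries[key]
            self._entries[key] = (value, time.time() + ttl)

    def lookup(self, prefix):
        """Return live (unexpired) services under a directory prefix."""
        now = time.time()
        return {k: v for k, (v, exp) in self._entries.items()
                if k.startswith(prefix) and exp > now}

registry = ServiceRegistry()
registry.register("/services/web/node1", "10.0.0.1:8080", ttl=30)
registry.register("/services/web/node2", "10.0.0.2:8080", ttl=30)
print(registry.lookup("/services/web/"))
```

A node that stops sending heartbeats simply drops out of `lookup` results once its TTL passes, which is exactly how etcd's TTL keys turn liveness into directory membership.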

Figure 1 service discovery

The following describes the specific scenarios of service discovery.

  • In a collaborative microservice architecture, services are added dynamically. With the popularity of Docker containers, a variety of microservices working together form a relatively powerful architecture, and the demand for transparently adding such services keeps growing. Through the service discovery mechanism, a service-name directory is registered in etcd to store the IP addresses of available service nodes; consumers simply look up the available nodes in that directory to use the service.

Figure 2 microservice collaboration

  • Transparent restart of multi-instance applications on a PaaS platform. Applications on a PaaS platform generally run multiple instances; a domain name provides transparent access to them as well as load balancing. However, an instance may be restarted at any time, and its address in the domain-name resolution (routing) information must then be updated dynamically. Etcd's service discovery function easily solves this dynamic configuration problem.

Figure 3 multi-instance Transparency on the cloud platform

Scenario 2: Message publishing and subscription

In a distributed system, message publishing and subscription is one of the most suitable ways for components to communicate. A shared configuration center is built: data providers publish messages to this center, and consumers subscribe to the topics they are interested in. Once a message is published on a topic, subscribers are notified in real time. In this way, distributed system configuration can be centrally managed and dynamically updated.

  • Configuration information used by applications is put on etcd for centralized management. The usual pattern: when an application starts, it actively fetches its configuration from etcd and, at the same time, registers a Watcher on the etcd node and waits; whenever the configuration is updated later, etcd notifies the subscriber in real time so that it obtains the latest configuration.
  • In a distributed search service, index metadata and the status of server cluster nodes are stored in etcd for clients to subscribe to. Etcd's key TTL function ensures that machine status stays up to date.
  • A distributed log collection system. The core task of such a system is to collect logs distributed across machines. The collector usually divides collection tasks by application (or topic). Thus, you can create a directory named after the application (topic) on etcd, store the IP addresses of all machines related to that application as subkeys of the directory, and set a recursive Watcher on the directory to monitor all changes under it. When a machine's IP address entries change, the collector is notified in real time and adjusts its task allocation.
  • Information that the system must expose dynamically, whether obtained automatically or modified manually. The usual approach is to expose interfaces such as JMX to retrieve runtime information. With etcd, you don't need to implement such a solution yourself: simply store the information under a designated etcd directory, which can then be read externally through etcd's HTTP interface.
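The publish-watch-notify cycle described in these bullets can be sketched in-process. In a real system the watch would be a blocking etcd request (e.g. a `?wait=true` read in the v2 API); the `ConfigCenter` class and its method names here are illustrative stand-ins.

```python
from collections import defaultdict

class ConfigCenter:
    """Toy configuration center simulating etcd's watch-and-notify behavior."""

    def __init__(self):
        self._data = {}
        self._watchers = defaultdict(list)  # key -> list of callbacks

    def watch(self, key, callback):
        """Subscribe to future changes on `key`."""
        self._watchers[key].append(callback)

    def put(self, key, value):
        """Publish a new value and notify every subscriber in real time."""
        self._data[key] = value
        for cb in self._watchers[key]:
            cb(key, value)

    def get(self, key):
        return self._data.get(key)

center = ConfigCenter()
seen = []
center.watch("/config/db_url", lambda k, v: seen.append(v))
center.put("/config/db_url", "mysql://db1:3306")  # subscriber is notified
print(seen)  # -> ['mysql://db1:3306']
```

The startup pattern from the first bullet maps onto this directly: call `get` once for the initial value, then `watch` for subsequent updates.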

Figure 4 message publishing and subscription

Scenario 3: Load balancing

Load balancing was also mentioned in scenario 1; all load balancing in this article refers to soft load balancing. In distributed systems, to ensure high availability and data consistency, data and services are usually deployed in multiple copies as peers, so that even if one copy fails, usage is not affected. The downside is lower write performance; the advantage is load balancing on data access. Because each peer node holds the complete data, user traffic can be spread across different machines.

  • Etcd itself balances access to the information it stores: after etcd is clustered, every core node can handle user requests. It is therefore a good choice to keep small but frequently accessed message data directly in etcd, such as the secondary code tables common in business systems (the database table stores a code, etcd stores the code's meaning, and the business system looks the code up in etcd whenever it needs the meaning).
  • Use etcd to maintain a load-balancing node table. Etcd can monitor the status of multiple nodes in a cluster, and incoming requests can be forwarded to the active nodes in round-robin fashion. Similar to how KafkaMQ uses ZooKeeper to balance load between producers and consumers, etcd can take ZooKeeper's place here.
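The round-robin forwarding over a node table can be sketched as follows. The `nodes` list stands in for the set of machine addresses a client would read from an etcd directory (an assumption for illustration); a real client would re-read the table whenever etcd reports a membership change.

```python
import itertools

def round_robin(nodes):
    """Yield nodes in a repeating cycle; callers rebuild the cycle
    when the etcd-maintained node table changes."""
    return itertools.cycle(nodes)

# Hypothetical node table, as it might be read from an etcd directory.
nodes = ["10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"]
picker = round_robin(nodes)
for _ in range(4):
    print(next(picker))  # cycles through the three addresses, then wraps
```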

Figure 5 Server Load balancer

Scenario 4: distributed notification and coordination

Distributed notification and coordination, as discussed here, resembles message publishing and subscription. Both rely on etcd's Watcher mechanism: through registration and asynchronous notification, different systems in a distributed environment notify and coordinate with one another, processing data changes in real time. The usual implementation: different systems register the same directory on etcd and set a Watcher to observe changes to it (recursive mode if changes to subdirectories also matter); when one system updates the etcd directory, the systems watching it receive a notification and react accordingly.

  • Use etcd for low-coupling heartbeat detection. The detection system and the detected system are associated through a directory on etcd rather than directly, which greatly reduces coupling between them.
  • Use etcd for system scheduling. A system consists of a console and a push system, with the console controlling what the push system pushes. Operations performed by administrators on the console actually modify the status of certain directory nodes on etcd; etcd sends these change notifications to the push-system clients registered as Watchers, and the push system then carries out the corresponding push tasks.
  • Use etcd for work reporting. In most task-distribution systems of this kind, a subtask, once started, registers a temporary working directory on etcd and periodically writes its progress there, so the task manager can know the task's progress in real time.
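The work-report pattern in the last bullet can be sketched with a flat key space: each worker writes its progress under a per-task directory, and the manager aggregates what it reads there. The dict stands in for etcd's key space, and all names (`/tasks/...`, `report_progress`) are illustrative, not an etcd API.

```python
store = {}  # stand-in for the etcd key space

def report_progress(task_id, worker_id, percent):
    """A subtask writes its progress under /tasks/<task_id>/<worker_id>."""
    store[f"/tasks/{task_id}/{worker_id}"] = percent

def task_progress(task_id):
    """The task manager averages the progress of all workers on a task."""
    values = [v for k, v in store.items()
              if k.startswith(f"/tasks/{task_id}/")]
    return sum(values) / len(values) if values else 0.0

report_progress("job42", "worker1", 100)
report_progress("job42", "worker2", 50)
print(task_progress("job42"))  # -> 75.0
```

With real etcd, the per-task directory would be created with a TTL so that a crashed worker's stale progress eventually disappears on its own.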

Figure 6 distributed collaboration

Scenario 5: distributed locks

Because etcd uses the Raft algorithm to maintain strong consistency, a value stored by any operation is globally consistent across the cluster, which makes distributed locks easy to implement. A lock service can be used in two ways: to maintain exclusivity, or to control ordering.

  • Exclusive lock: only one user can hold the lock at a time. For this, etcd provides an atomic compare-and-swap (CAS) API. By setting prevExist, you ensure that when multiple nodes try to create the same key simultaneously, only one creation succeeds; the node whose creation succeeds can consider itself the lock holder.
  • Ordering control: all users who want the lock will eventually be scheduled to run, but the order in which they acquire it is globally unique and determines the execution order. For this, etcd provides an API that creates ordered keys automatically: issuing a POST to a directory makes etcd generate a key with a monotonically increasing number under that directory and store the supplied value there (for example, a client identifier). The API can also list all keys in the directory in order; the key numbers reflect the order in which clients arrived, and the stored values identify the clients.

Figure 7 distributed locks

Scenario 6: Distributed Queue

The conventional use of a distributed queue is similar to the ordering-control use of distributed locks described in scenario 5: create a FIFO queue and guarantee the order.

Another interesting implementation is a queue whose elements execute together, in order, only once a certain condition is met. This can be implemented by creating a /queue/condition node alongside the tasks in the /queue directory.

  • The condition can indicate the queue size. For example, a large task may run only when a certain number of small tasks are ready. Each time a small task becomes ready, the condition counter is incremented by 1; once it reaches the number the large task requires, the small tasks in the queue are executed in order, followed by the large task.
  • The condition can indicate that a particular task is not yet in the queue. Such a task might be the first program in an ordered set of tasks, or a vertex with no dependencies in a topology. Typically, the other tasks in the queue can run only after it has run.
  • The condition can also be any other kind of start signal, specified by the controlling program. When the condition changes, the queued tasks begin to execute.
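The first variant, a size-gated queue, can be sketched as follows. The `count` attribute stands in for the /queue/condition counter key described above; the class and its names are illustrative, not an etcd API.

```python
class ConditionQueue:
    """Toy condition-gated FIFO queue mirroring the /queue + /queue/condition idea."""

    def __init__(self, required):
        self.required = required  # value the /queue/condition counter must reach
        self.tasks = []           # ordered small tasks (FIFO)
        self.count = 0            # stands in for the /queue/condition key

    def add_small_task(self, task):
        """Enqueue a ready small task and bump the condition counter by 1."""
        self.tasks.append(task)
        self.count += 1

    def ready(self):
        return self.count >= self.required

    def run(self, large_task):
        """Run the small tasks in FIFO order, then the large task, but only
        once the condition is met; otherwise do nothing."""
        if not self.ready():
            return None
        results = [t() for t in self.tasks]
        results.append(large_task())
        return results

q = ConditionQueue(required=2)
q.add_small_task(lambda: "small-1")
print(q.run(lambda: "big"))   # -> None: condition not yet met
q.add_small_task(lambda: "small-2")
print(q.run(lambda: "big"))   # -> ['small-1', 'small-2', 'big']
```

With real etcd, a Watcher on /queue/condition would replace the polling `ready()` check, waking the executor the moment the counter reaches the required value.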

Figure 8 Distributed Queue

