etcd Study (II): Cluster Construction (Clustering)

Source: Internet
Author: User
Tags: etcd






1. A single etcd node (for test and development)






Previously, for test and development I had only used a single etcd node. I would start it by running the etcd command directly (I had added the bin directory of the etcd installation to the PATH environment variable), and the startup log showed that the etcd server listens on the default client port 4001 and the peer server listens on the default port 7001.






Or specify the data path and node name explicitly:

./etcd -data-dir /usr/local/etcddata/machine0 -name machine0
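For example, once the single node is up, you can write the key that is used later in this article and read it back through the v2 HTTP API (the value here is just a placeholder I chose):

curl -L http://127.0.0.1:4001/v2/keys/configA -XPUT -d value="someValue"
curl -L http://127.0.0.1:4001/v2/keys/configA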












2. A cluster (clustering) of three etcd nodes






Today I wanted to test the clustering functionality, so I simply followed the tutorial on GitHub:






Reference: https://github.com/coreos/etcd/blob/master/Documentation/clustering.md






Let's start by creating 3 new etcd instances.






We use the -peer-addr flag to specify the server (peer) port, the -addr flag to specify the client port, and the -data-dir flag to specify the directory that stores the log and info of the machine in the cluster:




./etcd -peer-addr 127.0.0.1:7001 -addr 127.0.0.1:4001 -data-dir machines/machine1 -name machine1





Note: if you want to run etcd on an external IP address and still have access locally, you'll need to add -bind-addr 0.0.0.0 so that it listens on both the external and localhost addresses. A similar argument, -peer-bind-addr, is used to set up the listening address for the server (peer) port.
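For example, a sketch of a start command that also listens externally (192.168.1.10 is only a placeholder address, not from the original article):

./etcd -peer-addr 192.168.1.10:7001 -addr 192.168.1.10:4001 -bind-addr 0.0.0.0 -peer-bind-addr 0.0.0.0 -data-dir machines/machine1 -name machine1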






Let's join the other machines to the cluster using the -peers argument. A single connection to any peer will allow a new machine to join, but multiple peers can be specified for greater resiliency.




./etcd -peer-addr 127.0.0.1:7002 -addr 127.0.0.1:4002 -peers 127.0.0.1:7001,127.0.0.1:7003 -data-dir machines/machine2 -name machine2

./etcd -peer-addr 127.0.0.1:7003 -addr 127.0.0.1:4003 -peers 127.0.0.1:7001,127.0.0.1:7002 -data-dir machines/machine3 -name machine3





Note: we can also get the current leader in the cluster:




curl -L http://127.0.0.1:4001/v2/leader
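As far as I understand this version of etcd, this returns the peer address of the current leader, something like http://127.0.0.1:7001 (which node that is will vary).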




We can retrieve a list of machines in the cluster using the HTTP API:




curl -L http://127.0.0.1:4001/v2/machines
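The response should be roughly a comma-separated list of the client addresses, something like: http://127.0.0.1:4001, http://127.0.0.1:4002, http://127.0.0.1:4003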




I opened three terminals and executed the above three commands, one per terminal, in order.






Then I performed a get operation to read the key that I had added earlier on my previous single node:




curl -L http://127.0.0.1:4002/v2/keys/configA




The result was a "key not found" error. Does forming a cluster by adding two nodes on top of the original node cause the previously written data to be lost?






After studying these commands and noticing that a data storage path is specified, I made two guesses:






(1) As long as the <IP, port> pairs of the etcd processes running at the same time do not conflict, you can start multiple etcd nodes simultaneously.






(2) Even when started at different times on the same <IP, port>, as long as the data paths specified are different, they are not the same etcd node (a quick check of this follows below).
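A quick way to check guess (2), assuming the default client port 4001 and using placeholder paths and a placeholder key of my own:

./etcd -data-dir /tmp/etcddata-a -name nodeA
curl -L http://127.0.0.1:4001/v2/keys/testKey -XPUT -d value="1"

Stop that process, then start another instance on the same address but with a different data path:

./etcd -data-dir /tmp/etcddata-b -name nodeB
curl -L http://127.0.0.1:4001/v2/keys/testKey

The second get should report that the key is not found, because the two instances use different data paths and are therefore different etcd nodes.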












So I closed the three terminals I had just opened, ran my original etcd command again (I had not specified a data path, so it used the default), and then performed a get operation to read the key that I had added on my previous single node:




curl -L http://127.0.0.1:4002/v2/keys/configA




The key was still there. It seems that the three etcd nodes I started later did not form a cluster with the etcd node I had started originally, because they did not use the same data path.












3. Data persistence of a three-node etcd cluster (clustering)






We just shut down the three nodes of the etcd cluster; now restart the three nodes. The keys and values written before are still there, which shows that persistence works fine.






Then I found the machines directory under /home, deleted the three subdirectories inside it (machine1, machine2, machine3), started the cluster again with the above three commands, and looked up the previous key again: it no longer existed. This shows that the cluster's data is stored under the data path it was given.






Note: so, if you want to start your etcd server completely fresh, i.e. clear all the previous data, just delete its data directory.
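For example, for the cluster started above from the current working directory, after stopping the three processes the following removes all of their data (paths as used above):

rm -rf machines/machine1 machines/machine2 machines/machine3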












4. Which node of a three-node etcd cluster (clustering) should be accessed (when making a request)?






(1) For read operations, any of the three nodes will do, even if it is not the leader.






(2) For write operations, it seems that writes only succeed when connecting to the leader.
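For what it's worth, with curl a write sent to a follower still appears to succeed, because the -L flag follows etcd's redirect of the write to the leader (see the "Writing to etcd" section translated below); whether a client library succeeds likely depends on whether it follows that redirect. The key and value here are just placeholders:

curl -L http://127.0.0.1:4002/v2/keys/testKey -XPUT -d value="1"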






I have a three-node cluster (127.0.0.1:4001, 127.0.0.1:4002, and 127.0.0.1:4003) and a program that connects to the cluster and registers a service on a timer (which actually means periodically creating a node with a TTL), as follows:




string sysflag = "Cbip";

// get a client for the (custom) registry center that wraps etcd
IRegistryCenterClient rCenter = RegistryCenterClientFactory.GetRegistryCenterClient();

// register a service, i.e. periodically create a node with a TTL
ServiceInfo sInfo1 = new ServiceInfo();
sInfo1.ServiceName = "HelloService";
sInfo1.ServiceIP = "127.0.0.111";
sInfo1.ServicePort = 1888;
rCenter.RegisterService(sInfo1);

// keep reading two config items from the cluster
while (true)
{
    Console.WriteLine(rCenter.GetConfigItem(sysflag, "ConfigA"));
    Console.WriteLine(rCenter.GetConfigItem(sysflag, "ConfigB"));
    Thread.Sleep(200);
}




I connected to the 127.0.0.1:4001 node of the cluster. At first the leader was 127.0.0.1:4001, but the leader changes over time and may become 127.0.0.1:4002 or 127.0.0.1:4003. What I found was: as long as the leader is 127.0.0.1:4001, the service registers successfully (the write to the cluster succeeds); whenever the leader is not 127.0.0.1:4001, registration fails! The config-item reads in the loop, however, always work and are not affected by leader changes.






Question: why does the leader of the three-node cluster I launched by following this tutorial keep changing over time?






etcd is still relatively new and under active development; version 1.0 has not been released yet, so let's wait and see!












5. Must a cluster (clustering) consist of three nodes?






To build an etcd cluster, you need at least three nodes.






More than three nodes is possible, but once there are more than 9, the etcd cluster will only select a subset of the nodes to run the raft algorithm as the cluster, and the remaining nodes will run standalone as standbys (spares).






So 3 to 9 nodes is the most suitable range.






Moreover, as can be seen from the table below, because of write latency and reliability considerations, within the 3-9 range a cluster with an odd number of nodes is always the most efficient and optimal choice.












6. Is the effect the same if the nodes of the cluster are distributed across different machines?






The same.


















=========== Below is my translation from etcd's GitHub documentation ==============




Optimal ETCD Cluster Size



etcd's raft consensus algorithm is most efficient on smaller clusters (3-9 nodes); for clusters of more than 9 nodes, etcd will select a subset of all nodes to execute the raft algorithm in order to stay efficient.




Cluster Management



You can manage the size of the active cluster through the cluster config API, whose activeSize setting describes the number of active nodes (etcd peers) in the etcd cluster.






If the total number of etcd instances exceeds this number, the extra nodes (peers) will run in standby mode, and if an active node in the cluster dies or is removed, these standby nodes will be promoted into the active cluster.
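If I remember correctly, in the etcd version of that time this cluster config was exposed as an admin endpoint on the peer port; something like the following should show the current activeSize (treat the exact path as an assumption and check the docs for your version):

curl -L http://127.0.0.1:7001/v2/admin/config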




Internals of etcd

Writing to etcd



A write to an etcd node is always redirected to the leader of the cluster and is then distributed to all nodes in the cluster; the write only succeeds once a majority of the nodes (see the table below) confirm it.






For example, in a five-node cluster a write must, at the very fastest, wait for three nodes to have written successfully before the write succeeds. This is why the number of nodes is best kept under 9 when we care about high write performance (low latency).




Leader election



The leader election process is similar to writing a key: a majority of the nodes in the cluster must acknowledge the new leader before cluster operations can continue.




Odd Active Cluster Size



An important cluster optimization strategy is to ensure that the number of active nodes in the cluster (i.e. activeSize) is always an odd number.






For example, compare 3 nodes with 4 nodes, 5 nodes with 6 nodes, or 7 nodes with 8 nodes: the majority grows, which makes write latency higher, but the failure tolerance does not increase, i.e. the reliability (the number of nodes that are allowed to fail) does not improve.
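Putting the same point as a formula (my own summary of the table, not from the original document): majority = floor(n/2) + 1 and failure tolerance = n - majority. For n = 3 that gives a majority of 2 and a tolerance of 1; for n = 4 the majority rises to 3 but the tolerance stays at 1, so the fourth node only adds write latency.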




Active Peers    Majority    Failure Tolerance
1 peer          1 peer      None
3 peers         2 peers     1 peer
4 peers         3 peers     1 peer
5 peers         3 peers     2 peers
6 peers         4 peers     2 peers
7 peers         4 peers     3 peers
8 peers         5 peers     3 peers
9 peers         5 peers     4 peers



As you can see, adding a new node to bring the cluster up to an odd number of nodes is always worth it.






During a network partition, an odd number of active peers also guarantees that there will almost always be a majority of the cluster that can continue to operate and be the source of truth when the partition ends.





