At my previous company I maintained a name service that managed nearly 4,000 machines day to day, with roughly 4,000 clients connecting to it to look up machine information. Because it was essentially a single point of service, some of its modules were close to their bottleneck. A refactoring was planned, the detailed design was finished and part of the code was written, but the effort was eventually cancelled for various reasons.
JCM is a version I rewrote in Java in my spare time; it currently implements only the basic functionality. Because it uses a fully distributed architecture, it can in theory scale horizontally, greatly increasing the service capacity of the system.
Name Service
In a distributed system, a service is typically deployed as many instances in order to improve the overall service capacity. Here I call the set of instances providing the same service a cluster, and each instance a node. An application may use many clusters, and each time it accesses a cluster it obtains the next available node of that cluster from the name service. So a name service needs at least the following features:
- Return an available node given a cluster name
- Run health checks on every node of every managed cluster, so that the node returned is always available
Some name services only manage nodes and do not take part in the communication between applications and nodes, while others also act as a communication forwarder between them. Although the functionality of a name service is simple, building a distributed name service is considerably more complicated: once the data is distributed, synchronization and consistency issues appear.
What's JCM
JCM implements the name-service functionality described above. Its features include:
- Manage the mapping from clusters to nodes
- Distributed architecture that scales horizontally; the ability to manage around 10,000 nodes is enough for an ordinary company's backend service clusters
- Health checks for every node; a check can probe the HTTP protocol layer or just the TCP connection
- Persist cluster/node data in ZooKeeper and keep it consistent
- Provide a JSON HTTP API to manage cluster/node data (a web management UI may follow)
- Provide a client library that interacts with the server and implements various load balancing policies, so that access to a cluster's nodes is load balanced
Project address: git JCM
JCM consists of two main parts:
- jcm.server, the JCM name service; it connects to ZooKeeper to persist its data
- jcm.subscriber, the client library; it talks to jcm.server and provides load-balanced APIs for applications to use
Architecture
The overall architecture of the system based on JCM is as follows:
Clusters themselves do not need to depend on JCM. To manage clusters with JCM, you only need to register them with jcm.server through the JCM HTTP API; applications then use these clusters via jcm.subscriber.
Usage
See readme.md in the repository.
Requires JRE 1.7+.
- Start ZooKeeper
- Download jcm.server-0.1.0.jar from the repository
- Create config/application.properties in the directory containing jcm.server-0.1.0.jar; see config/application.properties in the repository for a sample
- Start jcm.server:
java -jar jcm.server-0.1.0.jar
- Register the clusters to be managed (see doc/cluster_sample.json for the cluster description format) through the HTTP API:
curl -i -X POST http://10.181.97.106:8080/c -H "Content-Type:application/json" --data-binary @./doc/cluster_sample.json
Once jcm.server is deployed and the clusters are registered, they can be used through jcm.subscriber:
```java
// Cluster names to use ("hello9", "hello") and the jcm.server addresses
// (more than one address may be passed): 127.0.0.1:8080
Subscriber subscriber = new Subscriber(
        Arrays.asList("127.0.0.1:8080"),
        Arrays.asList("hello9", "hello"));
// Use the round-robin load balancing policy
RRAllocator rr = new RRAllocator();
subscriber.addListener(rr);
subscriber.startup();
for (int i = 0; i < 2; ++i) {
    // rr.alloc returns an available node for the given cluster name
    System.out.println(rr.alloc("hello9", ProtoType.HTTP));
}
subscriber.shutdown();
```
JCM implementation
The current JCM implementation is relatively simple; see the module diagram:
- model, the data structures describing clusters and nodes; both jcm.server and jcm.subscriber depend on it
- storage, persists the data to ZooKeeper and synchronizes it between jcm.server instances
- health check, performs the health checks on each node
None of the modules above depends on Spring. Built on top of them are:
- HTTP API, implemented with Spring MVC, wraps the functionality into JSON HTTP APIs
- application, based on Spring Boot, assembles the base modules and provides standalone startup without deploying to a servlet container such as Tomcat
The implementation of jcm.subscriber is simpler. It is mainly responsible for communicating with jcm.server to keep its own copy of the model-layer data up to date, and for providing the various load balancing policy interfaces:
- subscriber, communicates with jcm.server and periodically pulls incremental data
- node allocator, receives data from the subscriber as a listener, implements the various load balancing strategies, and exposes a unified node-allocation (alloc) interface
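To make the allocator's role concrete, here is a minimal round-robin allocator sketch. It is illustrative only; the class and method names are assumptions, not necessarily those of JCM's RRAllocator.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: keep the latest healthy-node list per cluster (pushed by the subscriber)
// and hand out nodes in turn.
public class RoundRobinAllocator {
    private final Map<String, List<String>> clusters = new ConcurrentHashMap<>();
    private final Map<String, AtomicInteger> cursors = new ConcurrentHashMap<>();

    // Called by the subscriber (as a listener) whenever cluster data is refreshed.
    public void onClusterUpdate(String cluster, List<String> healthyNodes) {
        clusters.put(cluster, healthyNodes);
        cursors.putIfAbsent(cluster, new AtomicInteger(0));
    }

    // Return the next available node of the cluster, or null if none is known.
    public String alloc(String cluster) {
        List<String> nodes = clusters.get(cluster);
        if (nodes == null || nodes.isEmpty()) return null;
        int i = cursors.get(cluster).getAndIncrement();
        return nodes.get(Math.floorMod(i, nodes.size()));
    }
}
```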
Next, let's look at how the key features are implemented.
Data synchronization
Since jcm.server is distributed and every jcm.server instance supports both reads and writes, how is the data kept in sync across instances when jcm.server manages a group of instances and tens of thousands of nodes? There are two main kinds of data in jcm.server:
- The cluster data itself, including the descriptions of clusters and nodes, such as the cluster name, node IPs, and other auxiliary data
- The node health check data
For the cluster data, because a cluster manages its nodes as a two-level tree and nodes can be added and removed, we cannot let every instance actually perform writes, or data would be lost. Suppose nodes N1 and N2 are added to cluster C1 at the same time on instance A and instance B: instance A writes C1(N1), and instance B, without waiting for the data to synchronize, writes C1(N2); C1(N1) is then overwritten by C1(N2), and the newly added node N1 is lost.
Therefore the jcm.server instances are divided into a leader and followers. Only the leader performs real writes; a follower forwards any write request it receives to the leader. When writing, the leader first updates its in-memory data (under a mutex, to keep it correct) and then writes to ZooKeeper.
How do instances decide who is leader and who is follower? This is simply the standard leader election recipe implemented on ZooKeeper.
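I won't reproduce JCM's election code here, but as an illustration, leader election on ZooKeeper is commonly done with Apache Curator's LeaderLatch recipe. The connect string, znode path, and server spec below are assumptions.

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.retry.ExponentialBackoffRetry;

// Sketch of ZooKeeper-based leader election with Curator's LeaderLatch.
public class LeaderElection {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "127.0.0.1:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // The latch id is this instance's server spec so others can find the leader.
        LeaderLatch latch = new LeaderLatch(client, "/jcm/leader", "10.0.0.1:8080");
        latch.start();
        latch.await();   // blocks until this instance becomes the leader
        System.out.println("leadership: " + latch.hasLeadership());
        // A follower would instead call latch.getLeader() and forward writes to it.
    }
}
```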
Data synchronization between jcm.server instances is based on ZooKeeper's watch mechanism. This can be counted as a bottleneck of JCM: every instance registers watches, so jcm.server cannot in fact scale out without limit, and beyond a certain scale the watch efficiency may no longer meet the performance requirements (see my earlier test of ZooKeeper node counts and watch performance, done back when I was thinking about refactoring our old system).
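For reference, this is roughly what watch-based synchronization looks like with the plain ZooKeeper client. It is a simplified sketch; the znode path is an assumption and JCM's storage module may structure this differently.

```java
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Sketch: every instance watches the cluster znode and reloads its in-memory copy
// whenever another instance changes the data.
public class ClusterWatcher implements Watcher {
    private final ZooKeeper zk;
    private final String path = "/jcm/clusters/hello9";   // illustrative path

    public ClusterWatcher(ZooKeeper zk) { this.zk = zk; }

    public void load() throws Exception {
        // Reading with a watcher re-registers the one-shot watch each time.
        byte[] data = zk.getData(path, this, null);
        // ... deserialize `data` and replace the in-memory cluster model ...
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getType() == Watcher.Event.EventType.NodeDataChanged) {
            try { load(); } catch (Exception ignored) { }
        }
    }
}
```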
The node health check data in jcm.server uses the same synchronization mechanism, but the health check data is written by every instance, so next let's look at how jcm.server spreads this load through its distributed architecture.
Health Check
The other major feature of jcm.server is the health check on nodes. A jcm.server cluster may manage tens of thousands of nodes, and since it is already distributed, the obvious thing is to divide the nodes among multiple instances. I partition them by cluster, simply using a consistent hash: the consistent hash decides which instance a cluster belongs to. Each instance has a server spec, the address (IP + port) at which that instance serves, so the set of server specs seen on any instance is the same. This guarantees that every instance computes the same assignment of clusters to instances.
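A minimal sketch of this consistent-hash assignment follows; the virtual-node count and hash function are assumptions, and JCM's actual hashing details may differ.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Sketch: place every instance's server spec on a hash ring (with virtual nodes);
// the owner of a cluster is the first ring entry at or after the hash of its name.
// Because all instances see the same server specs, they all agree on the owner.
public class ClusterAssigner {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public ClusterAssigner(List<String> serverSpecs) throws Exception {
        for (String spec : serverSpecs)
            for (int i = 0; i < 100; ++i)          // 100 virtual nodes per instance
                ring.put(hash(spec + "#" + i), spec);
    }

    public String ownerOf(String clusterName) throws Exception {
        SortedMap<Long, String> tail = ring.tailMap(hash(clusterName));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private static long hash(String key) throws Exception {
        byte[] d = MessageDigest.getInstance("MD5").digest(key.getBytes(StandardCharsets.UTF_8));
        return ((long) (d[3] & 0xFF) << 24) | ((d[2] & 0xFF) << 16)
                | ((d[1] & 0xFF) << 8) | (d[0] & 0xFF);
    }
}
```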
Dividing the health checks by cluster also simplifies write conflicts: under normal circumstances each instance writes health check results for a different set of clusters.
Health checks normally run about once per second. jcm.server optimizes this: a result is not written to ZooKeeper when it is the same as the previous one. The data written contains each node's full key (IP + port + spec), which simplifies data synchronization in many places but enlarges the written data, and write size affects ZooKeeper's performance, so the data is simply compressed before writing.
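A sketch of these two write optimizations (skip unchanged results, compress the payload); the ZooKeeper call and znode path in the comment are illustrative assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

// Sketch: only write a health check result when it differs from the last one,
// and gzip-compress the serialized payload to keep znode writes small.
public class CheckResultWriter {
    private byte[] lastWritten;

    public void maybeWrite(byte[] serializedResult) throws Exception {
        if (Arrays.equals(serializedResult, lastWritten)) return;  // unchanged: skip
        byte[] compressed = gzip(serializedResult);
        // zk.setData("/jcm/checks/<server-spec>", compressed, -1);  // illustrative write
        lastWritten = serializedResult;
    }

    private static byte[] gzip(byte[] raw) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        }
        return bos.toByteArray();
    }
}
```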
Health checks could be implemented in several ways; currently only an HTTP protocol-layer check is implemented. The health check itself runs on a single thread that issues requests through an asynchronous HTTP library, and the actual requests are carried out on other threads.
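The pattern looks roughly like the sketch below. JCM targets JRE 1.7 and uses a different asynchronous HTTP library, so this example, written with Java 11's built-in HttpClient, is only an illustration of the single-thread/async approach.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.List;

// Sketch: one thread fires asynchronous HTTP requests; a node counts as healthy
// when it answers with a 2xx status before the timeout.
public class HttpChecker {
    private final HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(1)).build();

    public void checkAll(List<String> nodeUrls) {
        for (String url : nodeUrls) {
            HttpRequest req = HttpRequest.newBuilder(URI.create(url))
                    .timeout(Duration.ofSeconds(1)).GET().build();
            // sendAsync returns immediately; responses are handled on the client's threads.
            client.sendAsync(req, HttpResponse.BodyHandlers.discarding())
                  .thenAccept(resp -> report(url, resp.statusCode() / 100 == 2))
                  .exceptionally(e -> { report(url, false); return null; });
        }
    }

    private void report(String url, boolean healthy) {
        System.out.println(url + " healthy=" + healthy);
    }
}
```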
jcm.subscriber communication
jcm.subscriber communicates with jcm.server mainly to obtain the latest cluster data. The subscriber is initialized with a list of jcm.server instance addresses, which it accesses with a polling policy so that the request load is balanced across jcm.server instances. The subscriber requests data once per second; the request describes which clusters it wants, together with each cluster's version. The version (a timestamp) changes whenever the cluster is changed on the server. When the server answers the request, if the subscriber's version is already up to date, only the node statuses are returned.
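A sketch of that server-side version check (the names and types are assumptions, not JCM's actual model classes):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: the subscriber sends the versions it already holds; the server returns
// full cluster data only for clusters whose version (change timestamp) is newer,
// and just the node statuses for the rest.
public class SyncHandler {
    static class Cluster {                         // minimal stand-in for the model type
        String name;
        long version;                              // bumped whenever the cluster changes
        Map<String, Boolean> nodeStatus;           // node key -> healthy?
    }

    public Map<String, Object> sync(Map<String, Long> subscriberVersions,
                                    Map<String, Cluster> clusters) {
        Map<String, Object> reply = new HashMap<>();
        for (Map.Entry<String, Long> e : subscriberVersions.entrySet()) {
            Cluster c = clusters.get(e.getKey());
            if (c == null) continue;
            if (c.version > e.getValue()) {
                reply.put(c.name, c);              // changed: send full cluster data
            } else {
                reply.put(c.name, c.nodeStatus);   // unchanged: send node status only
            }
        }
        return reply;
    }
}
```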
The subscriber currently receives the status of all nodes; returning only the nodes in a healthy state could be considered to reduce the data size further.
Stress test
So far only the health check part has been stress tested; see test/benchmark.md for details. Testing on an A7 server showed that ZooKeeper writes and ZooKeeper watches are sufficient; the HTTP requests issued by jcm.server are the main performance hotspot. A single jcm.server instance can handle health checks for roughly 20,000 nodes.
Network bandwidth:
```
Time    bytin   bytout  pktin   pktout  pkterr  pktdrp
-       3.2M    4.1M    33.5K   34.4K   0.00    0.00
-       3.3M    4.2M    33.7K   35.9K   0.00    0.00
-       2.8M    4.1M    32.6K   41.6K   0.00    0.00
```
CPU: jstack shows that most of the CPU time is spent in the HTTP library's implementation layer and in the health check thread:
```
  PID USER   PR  NI  VIRT  RES  SHR S %CPU %MEM  TIME+  COMMAND
13301 admin  20   0 13.1g 1.1g  12m R 76.6  2.3      -  java (HttpChecker)
13300 admin  20   0 13.1g 1.1g  12m S 72.9  2.3      -  java
13275 admin  20   0 13.1g 1.1g  12m S 20.1  2.3      -  java
```
The code also exposes some status counters:
count 20, avg 542.05, avg 41.35
This means the average check time is 542 milliseconds; the figure for writing data has no reference value because of the cache.
There is certainly plenty of room for optimization in my own code, but since a single machine can already handle health checks for 20,000 nodes, this is more than enough for typical applications.
Summary
Although a name service is a basic component of a distributed system, it often plays a very important role: a data synchronization error or a momentary error in node state can affect the whole system and cause a serious failure of the business. So the robustness and availability of the name service matter a great deal. The implementation has to consider many abnormal situations, including network instability, application-layer errors, and so on. To improve availability, it is common to add several layers of data caching, such as a local cache on the subscriber side and a local cache on the server side, to ensure that application-layer services are unaffected in any situation.