Chapter 6 Network communication
The network layer of the Laxcus big data management system is built on TCP/IP and, starting with version 2.0, supports both IPv4 and IPv6 addresses. Network communication is the most basic and most important part of the Laxcus system. To get the most out of limited network resources, we designed a dedicated network communication protocol around the characteristics of big data network environments, together with several communication schemes built on that protocol. Together they form the basis of network communication in a Laxcus cluster. This chapter starts from the TCP/IP protocol and introduces the components related to network communication.
6.1 FIXP protocol
Laxcus communicates using the FIXP protocol. FIXP stands for "Free Information eXchange Protocol". It is a binary application-layer communication protocol built on TCP/IP, and its binary byte order is little endian. The protocol is platform independent and context independent, has a simple structure, and keeps data size small.
6.1.1 Protocol structure
As shown in Figure 6.1, the protocol layout consists of three parts, in order: command, messages, and data entity. There are two kinds of command, request and reply, and the command describes the basic properties of the communication. In every exchange the initiator sends a request command and the receiver returns a reply command. Messages follow the command. A single communication may carry multiple messages, which hold the various kinds of ancillary information the communication needs. Messages are packed back to back with no delimiters between them; they are told apart by the length field in each message header. The data entity comes last and holds the content to be transferred, which can be in any format: audio, images, database data, metadata of various kinds, and so on. The data entity is optional, and its presence is indicated in the messages. For example, the communication initiator usually does not need to send a data entity.
Figure 6.1 FIXP protocol structure
6.1.2 Command structure
As shown in Figure 6.2, a command is a 56-bit (7-byte) sequence. The first 8 bits are an identifier that distinguishes a request command from a reply command. The next 16 bits hold the protocol version number. The version number can change; different version numbers denote different protocol formats, which applications interpret differently. The current protocol version is 0x100. Request and reply commands differ mainly in bits 24 to 40: a request command supplies an 8-bit main command and an 8-bit sub-command that state the purpose of the operation, while a reply command returns a 16-bit response code confirming that the request was accepted or indicating that it was rejected for some other reason. The final 16 bits hold the number of messages, so in theory a single FIXP communication can carry up to 65,535 messages.
Figure 6.2 Command (request/reply) structure
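The 7-byte command layout described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the fields are packed in the order the text gives them, each in little-endian byte order. The identifier values `REQUEST_FLAG` and `REPLY_FLAG`, the function names, and the placement of the main command before the sub-command are hypothetical and are not taken from the FIXP definition.

```python
import struct

FIXP_VERSION = 0x100   # current protocol version stated in the text
REQUEST_FLAG = 0x01    # hypothetical identifier value for a request
REPLY_FLAG = 0x02      # hypothetical identifier value for a reply

# "<" selects little endian with no padding, so both layouts are 7 bytes.
REQUEST = struct.Struct("<BHBBH")  # flag, version, main, sub, message count
REPLY = struct.Struct("<BHHH")     # flag, version, response code, message count


def pack_request(main, sub, message_count, version=FIXP_VERSION):
    """Build a 7-byte request command."""
    return REQUEST.pack(REQUEST_FLAG, version, main, sub, message_count)


def pack_reply(code, message_count, version=FIXP_VERSION):
    """Build a 7-byte reply command."""
    return REPLY.pack(REPLY_FLAG, version, code, message_count)


def unpack_command(data):
    """Decode the first 7 bytes of data into a dict describing the command."""
    if len(data) < 7:
        raise ValueError("a command needs 7 bytes")
    head = bytes(data[:7])
    if head[0] == REQUEST_FLAG:
        _, version, main, sub, count = REQUEST.unpack(head)
        return {"kind": "request", "version": version,
                "main": main, "sub": sub, "messages": count}
    if head[0] == REPLY_FLAG:
        _, version, code, count = REPLY.unpack(head)
        return {"kind": "reply", "version": version,
                "code": code, "messages": count}
    raise ValueError("unknown command identifier")
```

Because the message count is a 16-bit field, a call such as `pack_request(1, 2, 65536)` raises `struct.error`, which mirrors the 65,535 limit mentioned in the text.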
6.1.3 Message structure
As shown in Figure 6.3, a message is a variable-length data structure made up of a key, a type, a parameter length, and a parameter. The key occupies 16 bits and each key has a fixed definition; in theory there are 65,536 keys, of which about 100 are currently in use. The type occupies 4 bits and describes the parameter that follows: Boolean, short integer, integer, long integer, single-precision float, double-precision float, binary array, string, compressed binary array, or compressed string. The parameter length is a 12-bit value giving the actual size of the parameter. Note that numeric parameters are stored with word-length compression. For example, the integer 0x20 occupies 4 bytes at the machine's standard word length, but its significant part is only 1 byte. Its parameter length is therefore written as 1, and the three leading zero bytes are dropped. As described earlier in this chapter, numeric parameters are stored in little-endian format.
Figure 6.3 Message structure
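A rough Python sketch of the message layout and the word-length compression described above follows. Several details are assumptions made for illustration: the type is placed in the upper 4 bits and the length in the lower 12 bits of one little-endian 16-bit word, the type codes `TYPE_INT` and `TYPE_STRING` are hypothetical, zero is encoded as a single zero byte, and only non-negative integers are handled.

```python
import struct

# Hypothetical type codes; the text lists the types but not their values.
TYPE_INT = 2
TYPE_STRING = 7

MAX_PARAM = 0xFFF  # largest size a 12-bit length field can describe


def compress_uint(value, width=4):
    """Little-endian bytes of a non-negative integer, high zero bytes dropped."""
    raw = value.to_bytes(width, "little")
    return raw.rstrip(b"\x00") or b"\x00"  # assumption: zero keeps one byte


def pack_message(key, type_code, param):
    """Key (16 bits), then type (4 bits) and length (12 bits), then parameter."""
    if not 0 <= type_code <= 0xF:
        raise ValueError("type must fit in 4 bits")
    if len(param) > MAX_PARAM:
        raise ValueError("parameter too long for a 12-bit length")
    tag = (type_code << 12) | len(param)
    return struct.pack("<HH", key, tag) + param


def unpack_messages(buf, count):
    """Walk count back-to-back messages; return them and the bytes consumed."""
    messages, offset = [], 0
    for _ in range(count):
        key, tag = struct.unpack_from("<HH", buf, offset)
        offset += 4
        length = tag & MAX_PARAM
        if offset + length > len(buf):
            raise ValueError("buffer ends inside a parameter")
        messages.append((key, tag >> 12, buf[offset:offset + length]))
        offset += length
    return messages, offset
```

Since messages carry no delimiters, the walker relies entirely on each length field to find the next header; whatever follows the last message (the optional data entity) starts at the returned offset.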
6.2 Communication solutions
We provide four communication schemes based on the FIXP protocol. Each is suited to different environmental conditions and task requirements, with the aims of saving network traffic, reducing running load, and improving computing efficiency.
6.2.1 TCP communication
TCP communication is built on the TCP layer of the TCP/IP stack and is mainly used for sustained, high-volume data, such as distributing data blocks and transferring data in Diffuse/Converge distributed computing. In a Laxcus cluster this is the bulk of the traffic. It occupies a large share of network bandwidth and, in severe cases, causes network congestion that affects the normal operation of the cluster. To avoid this, TCP communication is subject to the flow control mechanism: by reducing the amount of data sent, it frees up some bandwidth so that other communication services can keep transferring data and running stably.
6.2.2 UDP communication
UDP communication is built on the UDP layer of the TCP/IP stack and is mainly for intermittent, low-reliability, small-volume data transfer. In a Laxcus cluster, FIXP packets sent over UDP are generally between 20 and 300 bytes, smaller than the maximum transmission unit (MTU) of an IP packet. The most common of these is the heartbeat packet used to check node state. UDP is currently the most frequently used communication scheme in a Laxcus cluster.
6.2.3 KEEP UDP communication
The advantage of UDP is that it uses few computer resources; its disadvantage is that communication is unreliable and packets can be lost. TCP is the opposite: it provides a stable communication channel but places a heavy resource load on the TCP/IP stack. A Laxcus cluster has many services that need stable communication yet would also like to use UDP. The answer is "KEEP UDP (sustainable packet communication)", which combines the advantages of both while avoiding their drawbacks. KEEP UDP is an intermediate scheme between TCP and UDP designed specifically for Laxcus cluster communication; it provides reliable delivery for UDP data by simulating the TCP communication process on top of UDP. In essence, it moves the splitting and reassembly of packets, normally done in the TCP/IP stack, into worker threads controlled by Laxcus. Besides relieving pressure on the TCP/IP stack, this makes it possible to define special rules for packets as the situation requires. At present KEEP UDP is mainly used for RPC processing and for transporting network logs, both of which involve little traffic but require reliable delivery.
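The idea of splitting and reassembling packets in application code rather than in the TCP/IP stack can be sketched as below. This is not the KEEP UDP wire format, which the text does not give: the fragment header (packet id, fragment index, fragment count), the `CHUNK` size, and all names are hypothetical, and acknowledgements and retransmission timers are left out.

```python
import struct

HEADER = struct.Struct("<IHH")  # hypothetical: packet id, index, fragment count
CHUNK = 1024                    # hypothetical payload bytes per datagram


def split(packet_id, payload, chunk=CHUNK):
    """Cut a payload into numbered datagrams that each fit one UDP send."""
    parts = [payload[i:i + chunk] for i in range(0, len(payload), chunk)] or [b""]
    return [HEADER.pack(packet_id, i, len(parts)) + part
            for i, part in enumerate(parts)]


class Reassembler:
    """Collects fragments of one packet, tolerating reordering and duplicates."""

    def __init__(self, packet_id):
        self.packet_id = packet_id
        self.total = None
        self.parts = {}

    def accept(self, datagram):
        packet_id, index, total = HEADER.unpack_from(datagram)
        if packet_id != self.packet_id:
            return  # fragment of some other packet
        self.total = total
        self.parts.setdefault(index, datagram[HEADER.size:])

    def missing(self):
        """Fragment indexes still needed, or None if nothing has arrived."""
        if self.total is None:
            return None
        return [i for i in range(self.total) if i not in self.parts]

    def payload(self):
        """The reassembled payload, or None while fragments are missing."""
        if self.missing() != []:
            return None
        return b"".join(self.parts[i] for i in range(self.total))
```

A receiver built this way could send the result of `missing()` back to the sender as a retransmission request, which is one way of simulating TCP-style reliability on top of UDP.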
6.2.4 RPC communication
RPC (Remote Procedure Call) is a very good network communication scheme and is still widely used today. By hiding the communication details on both sides of the network, it makes a call between two computers on a network look like a local API call. This greatly reduces the difficulty of network programming, improves productivity, and reduces the chance of errors.
Laxcus includes an RPC implementation built on TCP and KEEP UDP communication. It performs RPC calls by embedding interfaces locally and hiding the network process from the programmer. At present, many of the complex, security-sensitive communications in a Laxcus cluster are carried out over RPC.
6.3 Communication detection
Many of the failures that occur while a cluster is running are related to the network and to network devices. According to statistics, they include damaged lines, loose sockets, electromagnetic interference, network congestion, and damaged network equipment. Some are hardware failures and some are transient network failures. An effective way to diagnose a fault is to probe the network by sending ICMP packets. A single machine can run the test; if necessary, several nodes test the same address together and their results are combined to reach a conclusion. The system then decides whether the fault is a transient network problem or an unrecoverable physical failure. If the problem is serious, it is reported to the system administrator for manual resolution. Communication detection runs on every node, and it is a necessary means of realizing the cluster's weak centralization and self-sustaining capability.
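The step of combining probe results from several nodes might look like the sketch below. The text does not state the decision rule Laxcus uses, so the rule here is purely hypothetical: a target that answers no probe from any node is treated as a suspected physical failure, and one that answers only some probes as a transient problem. Sending the ICMP packets themselves is omitted because raw sockets usually need elevated privileges.

```python
def classify(reports):
    """Summarize probe outcomes for one target address.

    reports maps a probing node's name to a list of booleans, one per
    ICMP probe (True means a reply was received). The labels and the
    thresholds are hypothetical, not taken from Laxcus.
    """
    outcomes = [ok for probes in reports.values() for ok in probes]
    if not outcomes:
        raise ValueError("no probe results to summarize")
    if all(outcomes):
        return "healthy"
    if not any(outcomes):
        return "suspected physical failure"
    return "transient network problem"
```

For example, `classify({"node-a": [False, False], "node-b": [True, False]})` yields the transient label, because at least one node still reached the target.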
6.4 Communications server
As described in section 1.3, the communications server is a working module managed by the node, and it communicates using the FIXP protocol. At startup it binds listening sockets in both TCP and UDP modes, with the socket parameters defined in the configuration file. By system rule, a work node's socket address is chosen randomly by the system at startup, whereas a management node's socket must have a fixed IP address and port. Only because the management node's address is fixed can work nodes find it on the network. The communications server never initiates communication; it only receives commands sent from outside. When a command arrives, it is handed to a subordinate task thread, which carries out the actual processing. The communications server is also responsible for communication security, ensuring that the data exchanged between the two sides is correct and trustworthy. Security management is optional: it is set in the configuration file, and the user decides whether to enable it.
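Binding listening sockets in both TCP and UDP modes can be sketched with the standard `socket` module. The function name and defaults are hypothetical. Passing port 0 asks the operating system to pick a free port, which is one plausible way to get the random address of a work node; a management node would pass its fixed address and port instead.

```python
import socket


def open_listeners(host="127.0.0.1", port=0):
    """Bind one TCP and one UDP socket and return them as (tcp, udp).

    With port=0 the operating system chooses each port, so the two
    sockets may end up on different port numbers. With a fixed port
    both sockets use it, since TCP and UDP ports are separate spaces.
    """
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        tcp.bind((host, port))
        tcp.listen()          # start accepting TCP connections
        udp.bind((host, port))
    except OSError:
        tcp.close()
        udp.close()
        raise
    return tcp, udp
```

The address actually chosen can be read back with `tcp.getsockname()` and `udp.getsockname()`, for instance to register a work node with its management node.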
6.5 Global time
During network communication, a uniform reference is needed to determine the order in which nodes process data. This reference is called the global time, also known as the master clock or the time axis. The global time is taken from the operating system time of the top-level master node in the cluster, and every other node must stay consistent with it. A node requests the global time from its parent management node when it starts and sets it on the local operating system, with an error of no more than 1 second. Global time is currently used in network logs, network computing, primary block conflict handling, and data redundancy processing.
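The text does not say how a node computes its correction when it fetches the global time. One common approach, shown here only as an illustration, is Cristian's algorithm: assume the reply spent half the round trip in transit and bound the error by that half. All names are hypothetical and times are integer milliseconds.

```python
MAX_ERROR_MS = 1000  # the 1-second tolerance mentioned in the text


def estimate_offset(sent_at, parent_time, received_at):
    """Return (offset, error_bound) in milliseconds.

    sent_at and received_at are local clock readings taken when the
    request left and when the reply arrived; parent_time is the clock
    value reported by the parent management node.
    """
    round_trip = received_at - sent_at
    if round_trip < 0:
        raise ValueError("reply cannot arrive before the request was sent")
    half = round_trip // 2
    offset = parent_time + half - received_at
    return offset, half


def within_tolerance(error_bound, limit=MAX_ERROR_MS):
    """True when the estimate is good enough to set the local clock."""
    return error_bound <= limit
```

A node whose round trip is too slow to meet the tolerance could simply repeat the request and keep the sample with the smallest error bound.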
6.6 Flow control
A large part of cluster instability is caused by excessive network traffic. If each data flow can be controlled so that it transmits at a fair and reasonable rate, cluster stability improves considerably. Laxcus uses a "wait/stop transmission mechanism" to control the network transmission rate of each job. It is a TCP/IP application-layer technique, part of the "Invoke/Produce" task scheduling model, and it can assess network traffic in real time and retransmit on error. It chooses a transmission rate suited to current network conditions; if the packet loss rate rises, indicating that the network is overloaded, it lengthens the interval between transmissions. Flow control is transparent to the upper layers and needs no management intervention. At present every data processing service in a Laxcus cluster uses this mechanism by default for its network communication. In our measurements of various kinds of traffic, the transfer rate with the mechanism enabled was 70% to 84% of the rate without it, but the network adapted better under heavy data loads. On balance, therefore, the mechanism benefits system stability.
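The behavior described above, lengthening the send interval when packet loss rises, can be illustrated with a small rate controller. The thresholds and factors below are hypothetical values chosen for the example; the text does not give the parameters Laxcus actually uses.

```python
def next_interval(current, loss_rate, *, threshold=0.05, backoff=2.0,
                  recover=0.9, floor=0.001, ceiling=1.0):
    """Return the next delay between sends, in seconds.

    When the measured loss rate exceeds the threshold the network is
    assumed to be overloaded and the interval is lengthened; otherwise
    it is shortened slowly. All default values are hypothetical.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if loss_rate > threshold:
        return min(current * backoff, ceiling)
    return max(current * recover, floor)
```

Backing off quickly and recovering slowly trades some throughput for stability, which is consistent with the lower transfer rate but better heavy-load behavior reported in the text.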