Information System Practice Notes 8: Some Notes on Communication Between Two Modules

Source: Internet
Author: User
Tags: md5, digest

Description: The Information System Practice Notes series collects problems, large and small, that the author has run into during everyday research and development. Some are simple and some are subtle, but all of them come up often. The author picks the more typical ones to describe, summarize, and share.

Abstract: This article describes the interfaces between information systems or platforms that the author has worked with, and shares that experience in as much detail as possible.


Series of essays Directory: Information System Practice notes (


Reprint instructions: Please credit the original author and include the link and source.


When developing and integrating information systems, you inevitably have to deal with communication between two modules: two modules developed in-house, or your own module and a third-party communication module.

This article tries not to go into specific technologies (readers can look those up themselves). It mainly lays out the ideas and situations involved.

1. Two levels of a docking protocol:

Two code entities (modules) that need to communicate are usually on different hosts (different PCs/IP addresses) and occasionally on the same host. TCP/IP hides the differences between physical machines; an IP address and port identify a resource, that is, a module or a functional peer entity. For two modules to communicate, the code at both ends has to exchange information (data, messages, signaling, bitstream, structures, or whatever you prefer to call it). Looking at today's TCP/IP, or at other less common protocols, this exchange basically splits into two levels:

A: The physical implementation layer (the physical layer, link layer, and so on in TCP/IP). It deals with bitstreams, binary values, and high and low voltage levels, and network cards and networking equipment support it. For example, a voltage such as 3 V or 12 V stands for 0 or 1; these are electrical characteristics. Reliability at this level depends on the operating and electrical characteristics of the hardware and on error-correction methods (which are mathematical methods).

B: The high level (everything above A is simply called "high level" here). This level gives meaning to the bits: it works with a byte stream, where a byte is generally 8 bits, and a character encoding such as ASCII or UTF-8 maps sequences of bytes to logical meaning (the 26 letters in upper and lower case, the 10 digits, special symbols, carriage return and line feed, arithmetic symbols, non-printing characters, and so on). What the network actually transmits is a byte stream, and TCP/IP mainly handles moving that byte stream from end to end. Reliability depends on the bitstream of level A below, and can be strengthened with checks such as a CRC (also a mathematical method).
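As a small illustration of level B, the sketch below turns text into the bytes that would travel on the wire and protects them with a CRC-32 check. The 4-byte trailing checksum layout and the helper names `with_crc` and `check_crc` are assumptions made for this example, not part of any protocol described here.

```python
import zlib

text = "caf\u00e9"                  # 4 characters of application-level text
data = text.encode("utf-8")         # 5 bytes: the last character needs 2 bytes in UTF-8


def with_crc(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 of the payload (hypothetical layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check_crc(frame: bytes) -> bytes:
    """Return the payload if the trailing CRC-32 matches, else raise ValueError."""
    payload, received = frame[:-4], int.from_bytes(frame[-4:], "big")
    if zlib.crc32(payload) != received:
        raise ValueError("CRC mismatch")
    return payload
```

The receiver calls `check_crc(frame).decode("utf-8")` to get the text back; a frame damaged in transit raises `ValueError` instead of silently yielding wrong characters.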

So although the ISO/OSI seven-layer network model is well designed, the TCP/IP implementations in actual use mainly divide into the two levels A and B above. A is subdivided further; B also has subdivisions, but they are used less. The main thing is to understand the concept of these two levels: reliable transmission of physical bits below, and transmission of bytes that carry logical meaning above. The rest of this article concentrates on level B. Level A is stable by now, unless quantum mechanics overturns the current computer model (the node PC) and, with it, the electrical characteristics of the physical layer that networks are built on. We will not discuss that here; it can wait another 5 to 10 years.

Here we talk about level B.

2. TCP/IP mainly provides two transport protocols: TCP and UDP

The TCP/IP protocol suite offers two main transport protocols, UDP and TCP:

(1) TCP: Connection-oriented, stateful byte-stream communication (a logical connection channel maintained on top of network level A). It guarantees ordered, correct delivery. Because it keeps state, it can retransmit on error and maintain the connection channel. However, how data is buffered and sent is decided independently by the software at each end of the logical connection (usually the operating system's network stack, the NIC driver, and so on). This leads to the sticky-packet problem and the question of how to split the stream back into messages. It is a classic TCP problem; please look it up (the introductory chapter of the User Guide on the Netty website has a fairly simple, classic description);

Scenario: when a reliable connection is needed and you want to use it directly, at the cost of some efficiency. This is the more widely used option;

(2) UDP: Connectionless. It does not guarantee that packets reach the destination in the order they were sent, or even that they arrive at all. In practice it leaves it to the modules at both ends to organize things through a higher-level protocol: sequence checking, retransmission of lost packets, state handling, and so on. That amounts to implementing a small, simple TCP;

Scenario: the network environment is fairly stable and reliable, the requirements on data correctness and ordering are not especially strict, and high efficiency is wanted, for example video playback on a LAN;

(Note: As network hardware and the PCs or servers hosting the modules get faster, you may find that TCP is more convenient and works well enough even for video. That is possible too. There is no fixed conclusion here; decide based on the specific situation, the project's needs, and how the implementation goes.)
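To make the "small, simple TCP" on top of UDP more concrete, here is a minimal sketch of the receiving side only. It assumes a hypothetical datagram layout with a 4-byte sequence number in front of the payload; `pack_datagram` and `ReorderBuffer` are names invented for this example. It delivers payloads in order, ignores duplicates, and reports gaps so the caller could ask for retransmission.

```python
import struct

HEADER = struct.Struct("!I")  # hypothetical header: 4-byte big-endian sequence number


def pack_datagram(seq: int, payload: bytes) -> bytes:
    """Prefix the payload with its sequence number."""
    return HEADER.pack(seq) + payload


class ReorderBuffer:
    """Deliver payloads in sequence order, whatever order the datagrams arrive in."""

    def __init__(self):
        self.next_seq = 0   # next sequence number the application expects
        self.pending = {}   # seq -> payload for datagrams that arrived early

    def receive(self, datagram: bytes) -> list:
        (seq,) = HEADER.unpack_from(datagram)
        payload = datagram[HEADER.size:]
        if seq < self.next_seq or seq in self.pending:
            return []       # duplicate: already delivered or already buffered
        self.pending[seq] = payload
        delivered = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered

    def missing(self) -> list:
        """Sequence numbers that are overdue and could be requested again."""
        if not self.pending:
            return []
        return [s for s in range(self.next_seq, max(self.pending))
                if s not in self.pending]
```

A real implementation would also need timers, acknowledgments, and sequence-number wraparound, which is exactly the work TCP already does for you.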

3. Several ways to use TCP

TCP is easy to use, but one of its characteristics is sticky packets. It is connection-oriented, maintains state, retransmits lost packets, and guarantees order. However, the low-level network libraries of the modules at the two ends differ in their algorithms, buffering and caching policies, efficiency, packet size (MTU), and so on. As a result, what the receiver gets is only guaranteed to match the sender's data as a whole byte stream; the receiver cannot tell how the sender divided the bytes across its individual send calls. This is the classic sticky-packet and splitting problem, and it is generally handled in one of the following ways:
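The effect can be shown without a network. The sketch below is a simulation only: `rechunk` is a hypothetical helper that cuts one byte stream into arbitrary pieces, the way a receiver might see it, and the message contents are made up.

```python
def rechunk(stream: bytes, sizes) -> list:
    """Cut a byte stream into chunks of the given sizes; the remainder is the last chunk."""
    chunks, pos = [], 0
    for size in sizes:
        chunks.append(stream[pos:pos + size])
        pos += size
    if pos < len(stream):
        chunks.append(stream[pos:])
    return chunks


sent = [b"LOGIN", b"PING", b"DATA:42"]   # three separate send calls
stream = b"".join(sent)                  # what TCP guarantees: these bytes, in this order
received = rechunk(stream, [3, 9])       # one possible way the receiver sees them
# received == [b"LOG", b"INPINGDAT", b"A:42"]
```

The bytes and their order are intact, but the three message boundaries are gone. Each of the methods below is a way to put those boundaries back.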

(a) Method 1: Fixed-length field method;

This method is not affected by sticky packets. The sender just sends; the receiver polls its receive buffer and reads a record as soon as a full fixed length is available. This assumes, of course, that TCP delivers in order and retransmits lost packets.

Advantages: simple to implement, simple logic;

Disadvantage: once a transmission error occurs (a bad network, bad luck, a protocol stack defect; something always goes wrong eventually), everything after it is offset, and the misaligned data is all wrong;

Optimization: To address this weakness, add a special marker at the head of each fixed-length record. For example, with a fixed length of 64 bytes and a start flag of \x0c, a record is decoded into logical content only when both conditions are met. If the content is valid it is used; if not, it is discarded and the receiver searches for the next valid fixed-length record (one that starts with the flag). This is already something of a prototype of TLV.
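A minimal sketch of the fixed-length method with a leading flag, using the 64-byte length and the \x0c flag from the example above. The zero padding, the `encode` helper, and the `FixedLengthDecoder` class are assumptions made for illustration.

```python
FRAME_LEN = 64   # fixed record length from the example above
FLAG = 0x0C      # start flag from the example above


def encode(payload: bytes) -> bytes:
    """Build one record: flag byte, payload, zero padding up to FRAME_LEN."""
    if len(payload) > FRAME_LEN - 1:
        raise ValueError("payload too long for one record")
    return bytes([FLAG]) + payload.ljust(FRAME_LEN - 1, b"\x00")


class FixedLengthDecoder:
    """Collect received chunks and cut them into flag-prefixed fixed-length records."""

    def __init__(self):
        self.buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        self.buffer += chunk
        records = []
        while True:
            start = self.buffer.find(bytes([FLAG]))
            if start < 0:
                self.buffer.clear()          # no flag anywhere: all of it is garbage
                break
            del self.buffer[:start]          # resynchronize on the flag
            if len(self.buffer) < FRAME_LEN:
                break                        # wait for the rest of the record
            # Padding is stripped, so payloads ending in zero bytes are not supported.
            records.append(bytes(self.buffer[1:FRAME_LEN]).rstrip(b"\x00"))
            del self.buffer[:FRAME_LEN]
        return records
```

Resynchronization here is only as good as the flag: after garbage, the search can land on a payload byte that happens to equal the flag, which is why a content check (as described above) is still worth adding.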

(b) Method 2: Delimiter method (variable-length field method);

This is the opposite of the fixed-length field method: a delimiter marks the boundary between two pieces of content. The length can then vary freely, as long as the delimiter never appears inside the content, and the sticky-packet problem is solved as well;

Advantages: simple to implement, simple logic, fairly reliable;

Disadvantages: You must make sure the delimiter never appears in the payload (the data actually being carried)! The algorithm is also less efficient, since every byte has to be checked to see whether it is a delimiter;

Optimization: if this is combined with fixed-length records, the delimiter effectively serves as a flag header.
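The delimiter method might be sketched as follows. The newline delimiter and the names `encode` and `DelimiterDecoder` are assumptions for this example; the sender refuses payloads that contain the delimiter rather than escaping them.

```python
def encode(payload: bytes, delimiter: bytes = b"\n") -> bytes:
    """Append the delimiter; reject payloads that would break the framing."""
    if delimiter in payload:
        raise ValueError("payload must not contain the delimiter")
    return payload + delimiter


class DelimiterDecoder:
    """Split an incoming byte stream into messages at each delimiter."""

    def __init__(self, delimiter: bytes = b"\n"):
        self.delimiter = delimiter
        self.buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        self.buffer += chunk
        # Everything before the last delimiter is complete; the tail is kept for later.
        *messages, rest = bytes(self.buffer).split(self.delimiter)
        self.buffer = bytearray(rest)
        return messages
```

Feeding `b"LOGIN\nPI"` and then `b"NG\n"` yields `[b"LOGIN"]` followed by `[b"PING"]`, regardless of where the stream was cut.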

(c) Method 3: TLV method (classic and efficient):

TLV (generally short for Type-Length-Value) is a classic approach and is used relatively often. The protocol usually defines a header (head) structure that includes a flag and a length. The flag is used to locate the start of a TLV header; it is a marker byte value that is not guaranteed to be unique in the stream but should occur rarely. After finding the flag, the receiver parses the header according to the protocol; if parsing fails, it goes on looking for the next flag. After parsing succeeds, the total length of head plus body is known and the body can be parsed. The body holds a set of variable-length (length, value) structures, each length followed by one value that carries payload. The length field itself has a fixed size, usually between 1 and 4 bytes depending on the protocol definition; the length field in the body and the one in the head need not have the same size. The type, as the name says, identifies the kind of data and is defined in the TLV protocol manual; it generally tells the receiver that K <length, value> groups follow and whether each group is an int, long, double, bytes[], and so on;

Advantages: values are obtained efficiently through the length field without checking every byte, and different data structures are supported;

Disadvantage: more complex to implement, and the TLV protocol manual agreed between the two sides clearly has to be more detailed than the description needed for the first two methods;

Improvement: The TLV protocol can be described in a configuration file, and the program can use reflection (in Java, for example) or a similar mechanism to generate the codec automatically from that file, which keeps the protocol definition clear and easy to maintain.
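The following is one possible TLV layout, sketched under stated assumptions rather than taken from any particular protocol manual: a 1-byte flag (0x7E) and a 2-byte body length in the head, and items of 1-byte type, 2-byte length, and value in the body. The type codes and all names are hypothetical.

```python
import struct

FLAG = 0x7E                    # hypothetical marker byte for the start of a head
HEAD = struct.Struct("!BH")    # head: flag, body length
ITEM = struct.Struct("!BH")    # item: type, value length
T_INT, T_BYTES = 1, 2          # hypothetical type codes from the "protocol manual"


def encode_frame(items) -> bytes:
    """items is a list of (type, value_bytes) pairs."""
    body = b"".join(ITEM.pack(t, len(v)) + v for t, v in items)
    return HEAD.pack(FLAG, len(body)) + body


def decode_body(body: bytes) -> list:
    items, pos = [], 0
    while pos < len(body):
        t, length = ITEM.unpack_from(body, pos)   # struct.error if the item head is cut off
        pos += ITEM.size
        if pos + length > len(body):
            raise ValueError("value runs past the end of the body")
        items.append((t, body[pos:pos + length]))
        pos += length
    return items


class TlvDecoder:
    """Find the flag, parse the head, wait for the full body, then parse the items."""

    def __init__(self):
        self.buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        self.buffer += chunk
        frames = []
        while True:
            start = self.buffer.find(bytes([FLAG]))
            if start < 0:
                self.buffer.clear()
                break
            del self.buffer[:start]
            if len(self.buffer) < HEAD.size:
                break
            _, body_len = HEAD.unpack_from(self.buffer)
            end = HEAD.size + body_len
            if len(self.buffer) < end:
                break
            try:
                frames.append(decode_body(bytes(self.buffer[HEAD.size:end])))
                del self.buffer[:end]
            except (ValueError, struct.error):
                del self.buffer[:1]   # parsing failed: skip this flag and keep searching
        return frames
```

Because each value is located through its length field, the decoder never scans payload bytes, which is where the efficiency advantage over the delimiter method comes from.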

4. Some instructions for the TCP heartbeat

Whole books have been written about the TCP protocol, and the core material is widely reproduced on the web, so only a few points are mentioned here.

If no timeout is set when the socket is initialized (check which timeouts your TCP library supports; Java IO/NIO typically supports connect timeouts, read timeouts, and so on), the only built-in liveness mechanism is TCP keepalive. Keepalive must be enabled on the socket (SO_KEEPALIVE), and on many systems its default idle time is 2 hours (120 minutes): after the connection has been idle that long, the TCP stack starts sending keepalive probes, and if the peer does not answer them the connection is torn down according to the TCP state machine. If the modules at the two ends are poorly written, for example by opening and closing connections at a high rate, large numbers of ports can end up in TIME_WAIT and similar states; the state machine cannot tear connections down and reuse them smoothly, quickly, and efficiently, which consumes resources and can even exhaust all 65,535 ports.
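For reference, enabling TCP keepalive and shortening its timers on a socket might look like the sketch below. `SO_KEEPALIVE` is widely available, while `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` are platform-specific, so the code only sets the ones the running platform exposes. The helper name `enable_keepalive` and the timer values are assumptions for this example.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive and, where the platform allows, tune its timers.

    idle:     seconds of silence before the first probe
    interval: seconds between probes
    count:    unanswered probes before the connection is dropped
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (("TCP_KEEPIDLE", idle),
              ("TCP_KEEPINTVL", interval),
              ("TCP_KEEPCNT", count))
    for name, value in tuning:
        option = getattr(socket, name, None)   # missing on some platforms
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            pass                               # defined but not supported here


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(5.0)        # applies to connect() and to each later blocking call
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

Keepalive only tells you that the peer's TCP stack is reachable; it says nothing about whether the peer application is still working.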

Besides TCP's own timeout and the connect and idle timeouts offered by the network library, the two modules can add their own timeout and heartbeat mechanism at the logical layer. For example:

(1) On receiving a ping, send a pong; on receiving a pong, send nothing. This is a peer heartbeat probe: one side initiates it to check whether the other side is still alive;

(2) The two sides agree on a heartbeat interval, for example sending a heartbeat packet every 120 seconds (such as the two characters "HB", or some particular byte), or agree that only the server sends heartbeats to the client and never the reverse;

Special note: This is a heartbeat at the high-level logic, used by the application layer itself, and how it is implemented depends on which of the three methods above you use. If you use TLV, for example, you also need to define a "heartbeat packet" in that protocol;
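The ping/pong rule and the agreed interval can be captured in a small piece of logic that is independent of the framing method. In this sketch the message contents `PING` and `PONG`, the class name `Heartbeat`, and the explicit `now` argument (which keeps the logic testable without real clocks) are all assumptions.

```python
PING, PONG = b"PING", b"PONG"   # hypothetical heartbeat message bodies


class Heartbeat:
    """Application-level heartbeat bookkeeping for one connection."""

    def __init__(self, interval: float, timeout: float, now: float):
        self.interval = interval       # how often to send a probe, in seconds
        self.timeout = timeout         # silence longer than this means the peer is dead
        self.last_sent = now
        self.last_received = now

    def on_message(self, message: bytes, now: float):
        """Record activity; answer a ping with a pong, answer nothing else."""
        self.last_received = now
        return PONG if message == PING else None

    def poll(self, now: float):
        """Return PING when the agreed interval has elapsed, else None."""
        if now - self.last_sent >= self.interval:
            self.last_sent = now
            return PING
        return None

    def peer_alive(self, now: float) -> bool:
        return now - self.last_received <= self.timeout
```

The caller would invoke `poll` from its timer loop, send whatever it returns through the chosen framing (fixed-length, delimiter, or TLV), and close the connection once `peer_alive` turns false.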

5. What kind of libraries are used?

Generally, when developing the modules at the two ends, the two sides first agree on the transport protocol, for example TLV, and then develop independently. You use a language you know, such as Java, and then an off-the-shelf framework such as Netty (an asynchronous I/O framework), so that you do not have to build everything step by step on bare Java NIO and reinvent the wheel. Netty is a very useful library. Its developer also wrote MINA; the two libraries are somewhat similar, each with its own strengths and emphasis. See the introduction and e-books in my [itis-data collection paste];

6. Security issues with docking protocols

TCP is connection-oriented, stateful, and ordered, and it retransmits lost packets, so it provides a correctness guarantee at the lower level. However, TCP by itself offers no protection against replay attacks, interception, or tampering with packets. Moreover, TCP only guarantees that the byte stream arrives correctly at the transport level; it does not guarantee that the upper-level logic (the application layer) is correct. Hence the following suggestions:

(1) Encode, compress, and encrypt your data according to your own needs;

(2) Refer to open-source implementations of key handling, encryption algorithms, authentication, and certificates, and use them in the protocol;

(3) After the steps above, Base64-encode the payload and compute an MD5 digest of it. The receiving end uses the MD5 digest to verify that the data is intact, and the digest can also serve as the key for asynchronous acknowledgments. Note that a plain MD5 digest only detects accidental corruption; anyone who can alter the data can also recompute the digest.
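Suggestion (3) might be sketched as follows. The envelope layout (Base64 text, a `|` separator, then the hex digest) and the function names are assumptions for this example; the digest is computed over the Base64 text, which is one of several reasonable choices.

```python
import base64
import hashlib


def wrap(payload: bytes):
    """Return (message, digest): Base64 text, '|', and the MD5 hex digest of that text."""
    encoded = base64.b64encode(payload)
    digest = hashlib.md5(encoded).hexdigest().encode("ascii")
    return encoded + b"|" + digest, digest


def unwrap(message: bytes):
    """Verify the digest and return (payload, digest); raise ValueError on mismatch."""
    encoded, _, digest = message.rpartition(b"|")   # '|' never occurs in Base64 text
    if hashlib.md5(encoded).hexdigest().encode("ascii") != digest:
        raise ValueError("MD5 mismatch: message corrupted in transit")
    return base64.b64decode(encoded), digest
```

Both sides end up with the same digest, so the receiver can send it back later as the key of an asynchronous acknowledgment. For protection against deliberate tampering, a keyed construction such as `hmac` from the standard library would be needed in place of the bare digest.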

In summary, the TCP protocol is rich and varied, but these are its core points. If you are familiar with a few frameworks and common techniques, plus tools such as MD5, encryption, and compression algorithms, you can build different transport protocols to meet the needs of different situations and move data between the modules at the two ends.

