Common pitfalls and traps in plain Socket (BIO) long-lived connection programming

Source: Internet
Author: User

This article is purely a summary of my personal experience. The pseudocode was written off the cuff while drafting the article, so it may contain logic errors. Corrections are welcome. Thank you.

The Internet is the result of countless machines communicating over the TCP/IP protocol suite. The suite is organized into four layers: the link layer, the network layer, the transport layer, and the application layer.

Application developers mostly work with the application layer. HTTP, for example, is used throughout web applications and does a great deal of work for us; it is enough for most communication needs. Some scenarios, however, are awkward to handle with HTTP, and then we need another application layer protocol that fits. Early in a project you should choose a protocol based on the business scenario and the runtime environment: a push service built on publish/subscribe can use MQTT or a similar protocol, and file transfer can use FTP or a similar protocol. This article, however, is not about how to choose a communication protocol but about how to implement a custom one. Enough preamble; on to the topic.

Body

Implementing a custom application layer protocol means programming directly against the transport layer, which offers two protocols: TCP and UDP. You can look up their differences and use cases elsewhere. In short, TCP delivery is reliable, while UDP does not care whether the data arrives, so TCP is the usual choice, and it is the one this article covers. As noted above, TCP/IP is a protocol, that is, a convention. How do we program against that convention? The operating system has already done the work and exposes it through a convenient interface, the Socket API, which is what we usually just call a socket. C/C++ can call the operating system API directly, while Java and other languages that run on a virtual machine use the API provided by their SDK. I suspect many readers are getting impatient, so let's start with the routines. Wherever code is needed, I use pseudocode.
Pseudocode conveys the logic and the ideas well enough, and, frankly, it is also less work for me.

Common pitfalls

1. The data received is incomplete, or more than expected (half packets and sticky packets).
2. Blocking BIO reads and writes leave threads hanging.
3. The physical link drops unexpectedly; the program cannot detect it and hangs.
4. Several threads share the same socket and the data becomes interleaved.
5. Keeping a large number of idle sockets open for a long time has drawbacks.

I have therefore summarized several routines, which are also what communication middleware ought to do.

Common routines

1. Define the message format (solves half packets and sticky packets).
2. Define the communication workflow (use queues, thread pools, and connection pools sensibly).
3. Add a heartbeat mechanism (handles connections made unusable by abnormal disconnection).
4. Add a link recycling mechanism (close links according to idle or timeout rules).
5. Handle exceptions (close the connection promptly when a non-recoverable exception occurs).
6. Resend immediately (when a connection is unusable, resend at once on another connection).

Each step of the design should take potential problems and extensibility fully into account.

1. Message format (Message)

Communication requires a protocol. People have rules for their languages, other animals have rules for theirs, and machines need "language" rules too. Defining the message format means defining those rules. Machines communicate in order to fetch data, send data, or send instructions. In each case one side initiates a request and the other processes it and responds. Machines, however, are not people. They are not that intelligent, and they cannot tell when the other side has finished talking.
So, for machine B to know that machine A has finished sending, the two have to agree on it in advance: before speaking, tell the other side how much you are going to say. Machine A first tells machine B "I am going to say one sentence (one line)" or "I am going to say 10 letters (10 bytes)", and machine B then reads exactly that much. This solves the half-packet and sticky-packet problem.

So the first thing to agree on is how much will be said this time; one of the rules is the message length. With that in place we can agree on more, such as whether this message asks for data, delivers data, executes a command, or does something else. That adds a second item to the rules: the behavior identifier. The message length and the behavior identifier are the basic attributes of a message. Are there others? Certainly; it depends on how detailed the rules need to be. More is not better, though. The basics should be as simple as possible, because extra fields make parsing slower and packets larger. By now it should be clear how a message format is drawn up. Below is a simple, basic socket message format for reference. It serves for both requests and responses.
| Sequence | Field name | Length (bytes) | Field type | Description |
|---|---|---|---|---|
| 1 | Message length | 4 (32 bit) | int | The maximum length of a socket packet is 2^31-1 bytes. This format is not meant for transferring large files. |
| 2 | Behavior ID | 1 (8 bit) | byte | Selects the processing branch. One byte can identify 256 behaviors, which is generally enough. |
| 3 | Encryption ID | 1 (8 bit) | byte | Distinguishes the encryption mode; 0 means not encrypted. |
| 4 | Timestamp | 8 (64 bit) | long | The message timestamp. It is not actually used; please ignore it. |
| 5 | Message body | message length - 10 | String | Body length is the message length minus 10 bytes (the behavior ID, encryption ID, and timestamp take 1 + 1 + 8 = 10 bytes). JSON is recommended. How the body is parsed is determined by the behavior ID field. |
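The framing rule in the table can be sketched in Java as follows. This is only an illustration under stated assumptions, not the author's implementation: `FrameCodec` and its method names are hypothetical, the body is assumed to be UTF-8, integers are assumed to be big-endian, and the length field is assumed to count the bytes that follow it (10 bytes of fixed fields plus the body). The demo glues two messages together (a sticky packet) and delivers them one byte per read (half packets); the decoder still recovers both bodies because it reads exactly as many bytes as the length field announces.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that frames messages using the layout in the table. */
class FrameCodec {
    static final int FIXED_FIELDS = 10; // behavior ID (1) + encryption ID (1) + timestamp (8)

    /** Encode one message: length, behavior ID, encryption ID, timestamp, body. */
    static byte[] encode(byte action, byte encryption, long timestamp, String body) {
        byte[] bodyBytes = body.getBytes(StandardCharsets.UTF_8); // assumption: UTF-8 body
        int length = FIXED_FIELDS + bodyBytes.length;             // bytes after the length field
        return ByteBuffer.allocate(4 + length)
                .putInt(length).put(action).put(encryption)
                .putLong(timestamp).put(bodyBytes).array();
    }

    /** Read exactly one message body from the stream, however the bytes were chunked. */
    static String decodeBody(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        int length = din.readInt(); // blocks until all 4 length bytes have arrived
        if (length < FIXED_FIELDS) {
            throw new IOException("invalid message length: " + length);
        }
        din.readByte(); // behavior ID (ignored in this sketch)
        din.readByte(); // encryption ID (ignored in this sketch)
        din.readLong(); // timestamp (ignored in this sketch)
        byte[] bodyBytes = new byte[length - FIXED_FIELDS];
        din.readFully(bodyBytes); // loops until every body byte has arrived
        return new String(bodyBytes, StandardCharsets.UTF_8);
    }

    /** A stream whose bulk reads return at most one byte, simulating half packets. */
    static InputStream oneByteAtATime(byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 1));
            }
        };
    }

    /** Two messages glued together (sticky packet), delivered one byte per read. */
    static String[] demo() throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        wire.write(encode((byte) 1, (byte) 0, 0L, "{\"a\":1}"));
        wire.write(encode((byte) 2, (byte) 0, 0L, "hello"));
        InputStream in = oneByteAtATime(wire.toByteArray());
        return new String[] { decodeBody(in), decodeBody(in) };
    }

    public static void main(String[] args) throws IOException {
        for (String body : demo()) {
            System.out.println(body);
        }
    }
}
```

The important detail is `readFully`: a single `InputStream.read(byte[])` call may return fewer bytes than requested, so a decoder that calls it once and assumes the buffer is full will misparse the stream as soon as TCP splits a message.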

 

 

2. Define the communication Workflow

Because BIO communication is not very flexible, I recommend that a socket connection be operated by only one thread at a time, and that a given connection be used either only for active requests or only for passive reception. This avoids the confusion network conditions can cause; otherwise the interleaved data can turn into unreadable garbage. It is possible to split messages into sub-packets or to lock the socket object so that several threads can share it, but I did not bother; it is not necessary (all right, I was just being lazy).

The workflow is as follows:

1. After sending data, the active end enters the read state and waits for the response.
2. The passive thread blocks waiting for data. After reading the 14-byte header it does a preliminary parse and processes the data according to the behavior ID and encryption ID. When processing is done it writes back a response and waits for the next message.

For performance, a queue, a thread pool, and a connection pool can be used together. If you want to discuss this, send me a private message or leave a comment. The code is as follows.

1, Basic encapsulation
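import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Message package (packet). */
class SocketPackage {
    int length;      // length: number of bytes that follow the length field
    byte action;     // behavior ID
    byte encryption; // encryption ID
    long timestamp;  // timestamp
    String data;     // message body

    /** Convert the message package to a byte array (body assumed to be UTF-8). */
    byte[] toBytes() {
        byte[] body = data.getBytes(StandardCharsets.UTF_8);
        length = 10 + body.length; // behavior ID + encryption ID + timestamp + body
        return ByteBuffer.allocate(4 + length)
                .putInt(length).put(action).put(encryption)
                .putLong(timestamp).put(body).array();
    }

    /** Convert the input stream into one message package. */
    static SocketPackage parse(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        SocketPackage sp = new SocketPackage();
        sp.length = din.readInt(); // this step blocks; a bare read() could return fewer than 4 bytes
        if (sp.length < 10) {
            throw new IOException("invalid message length: " + sp.length);
        }
        sp.action = din.readByte();
        sp.encryption = din.readByte();
        sp.timestamp = din.readLong();
        byte[] body = new byte[sp.length - 10];
        din.readFully(body);
        sp.data = new String(body, StandardCharsets.UTF_8);
        // Do not catch exceptions casually here. An exception means the message is
        // malformed or the socket is broken, so let it propagate and have the
        // caller drop the connection.
        return sp;
    }
}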
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

/**
 * Wraps a Socket so that more connection information can be kept with it.
 * Don't dwell on the name; this is pseudocode.
 */
class NiuxzSocket {
    final Socket socket;
    volatile long lastUse; // last use time
    // ... other attributes can be added here, such as whether a write is in progress,
    // when the write started, and the time of the last non-heartbeat package

    NiuxzSocket(Socket socket) {
        this.socket = socket;
        this.lastUse = System.currentTimeMillis();
    }

    InputStream getIn() throws IOException {
        return socket.getInputStream();
    }

    void write(byte[] bytes) throws IOException {
        socket.getOutputStream().write(bytes);
    }
}

 

2, Active end: the core of the active end is the connection pool SocketPool and the SocketClient. The general flow is that business code calls SocketClient to send a message package. SocketClient takes an available connection from the pool, and the pool creates one if none is available. SocketClient then operates the NiuxzSocket according to the business type or message type.
import java.io.IOException;

/** Interface for sending messages; provides the common send methods. */
interface SocketClient {
    /** Send a message package and wait for the returned message package. */
    SocketPackage sendData(SocketPackage sp) throws IOException;

    // TODO: add more convenient methods to suit the business and the protocol agreed by
    // both sides, for example one that returns only the message body or the JSON content.

    /** Send a heartbeat packet; used later in the heartbeat section. */
    void sendHeartBeat(NiuxzSocket socket) throws IOException;
}

class DefaultSocketClient implements SocketClient {
    // Assume a connection pool manages the sockets. Without a pool you could inject a
    // single NiuxzSocket here and use it directly, but then every use must hold a lock;
    // otherwise concurrent access to the same socket interleaves the data.
    SocketPool socketPool;

    /** Entry point of the active end; business code calls this to send data. */
    public SocketPackage sendData(SocketPackage sp) throws IOException {
        NiuxzSocket niuxzSocket = socketPool.get(); // our own wrapper, not a raw Socket
        try {
            niuxzSocket.write(sp.toBytes()); // blocks while writing to the send buffer
            niuxzSocket.lastUse = System.currentTimeMillis(); // update the socket state
            // Blocking read that waits for the reply. Only one thread operates this
            // socket, so no other message can cut in.
            return SocketPackage.parse(niuxzSocket.getIn());
        } catch (IOException e) {
            LOG.error("failed to send message packet", e);
            // A non-recoverable exception: the I/O stopped partway and the stream position
            // is unknown, so the whole socket is unusable. Close and destroy it.
            socketPool.destroy(niuxzSocket);
            throw e;
        } finally {
            // Do not close the socket after use; hand it back to the pool for reuse.
            // recycle() must check whether the socket has already been destroyed.
            socketPool.recycle(niuxzSocket);
        }
    }

    public void sendHeartBeat(NiuxzSocket socket) throws IOException {
        // TODO: send a package whose behavior ID marks it as a heartbeat
    }
}
import java.util.concurrent.BlockingQueue;

/** Connection pool interface. */
interface SocketPool {
    /** Get a connection. */
    NiuxzSocket get();

    /** Reclaim a socket. */
    void recycle(NiuxzSocket ns);

    /** Destroy a socket. */
    void destroy(NiuxzSocket ns);
}

/** Connection pool implementation. */
class DefaultSocketPool implements SocketPool {
    BlockingQueue<NiuxzSocket> sockets; // container holding the sockets; an array would also work

    public NiuxzSocket get() {
        // TODO: return an idle socket; if the pool has none, create one or wait for one,
        // using synchronized/wait or Lock/Condition.
        throw new UnsupportedOperationException("TODO");
    }

    // TODO: recycle() and destroy(). A connection pool is a performance and reliability
    // optimization and there is more to it than fits here. I will write a separate
    // article once I have tidied up my own pool code; leave a comment if you have questions.
}
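The get/recycle/destroy contract left as a TODO above could be filled in roughly as follows. This is a minimal sketch, not the author's pool: `SimplePool`, its `maxSize` limit, the `timeoutMillis` parameter, and the `Supplier` factory are assumptions for illustration, and the type parameter `T` stands in for `NiuxzSocket` so the example runs without a network.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical bounded pool; T stands in for NiuxzSocket. */
class SimplePool<T> {
    private final BlockingQueue<T> idle = new LinkedBlockingQueue<>();
    private final AtomicInteger total = new AtomicInteger(); // connections alive (idle + in use)
    private final int maxSize;
    private final Supplier<T> factory;

    SimplePool(int maxSize, Supplier<T> factory) {
        this.maxSize = maxSize;
        this.factory = factory;
    }

    /** Return an idle connection, create one if under the limit, otherwise wait up to timeoutMillis. */
    T get(long timeoutMillis) throws InterruptedException {
        T conn = idle.poll();
        if (conn != null) {
            return conn;
        }
        // Reserve a slot atomically so that concurrent callers cannot exceed maxSize.
        while (true) {
            int n = total.get();
            if (n >= maxSize) {
                break;
            }
            if (total.compareAndSet(n, n + 1)) {
                try {
                    return factory.get();
                } catch (RuntimeException e) {
                    total.decrementAndGet(); // creation failed, give the slot back
                    throw e;
                }
            }
        }
        // Pool is full: wait for another thread to recycle a connection.
        conn = idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (conn == null) {
            throw new IllegalStateException("no connection available");
        }
        return conn;
    }

    /** Hand a healthy connection back for reuse. */
    void recycle(T conn) {
        idle.offer(conn);
    }

    /** Drop a broken connection and free its slot; the caller closes the socket. */
    void destroy(T conn) {
        idle.remove(conn);
        total.decrementAndGet();
    }

    int size() {
        return total.get();
    }
}
```

A caller would write something like `new SimplePool<>(10, () -> openConnection())`, where `openConnection` is a hypothetical method that connects and wraps the socket. This sketch assumes `destroy` is called at most once per connection; a production pool would also track which connections are checked out.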
3, Passive end: the core of the passive end is NiuxzServer, Worker, and SocketHandler. The general flow is: open the port and wait for connections; accept a connection and create a thread for it; stop accepting once the maximum number of threads is reached; read data from the connection and dispatch it to the matching branch for processing; return the result to the active end to complete one interaction; then go back to reading.
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.HashMap;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Open a ServerSocket and wait for connections; start one thread per accepted connection. */
class NiuxzServer {
    ServerSocket serverSocket;
    final Set<NiuxzSocket> sockets = ConcurrentHashMap.newKeySet(); // live connections
    final AtomicInteger workerCount = new AtomicInteger();
    final Object waitLock = new Object();
    int maxWorkerCount = 100; // allow at most 100 connections
    int port;                 // configured port number

    /** Work entry. */
    void work() throws IOException, InterruptedException {
        serverSocket = new ServerSocket(port);
        while (true) {
            Socket socket = serverSocket.accept(); // block and wait for a connection
            NiuxzSocket niuxzSocket = new NiuxzSocket(socket);
            sockets.add(niuxzSocket); // remember the connection
            workerCount.incrementAndGet();
            new Worker(niuxzSocket).start(); // create and start a worker thread
            synchronized (waitLock) {
                // At the limit: stop accepting until another connection is destroyed.
                // The count is re-checked under the lock so a concurrent destroy() is not missed.
                while (workerCount.get() >= maxWorkerCount) {
                    waitLock.wait();
                }
            }
        }
    }

    /** Destroy a connection (the underlying socket should also be closed here). */
    void destroy(NiuxzSocket socket) {
        synchronized (waitLock) {
            if (sockets.remove(socket)) {      // remove from the set
                workerCount.decrementAndGet(); // one connection fewer
                waitLock.notifyAll();          // let work() continue accepting
            }
        }
    }

    /** Worker thread that serves one connected socket. */
    class Worker extends Thread {
        HashMap<Integer, SocketHandler> handlers; // one message handler per behavior ID
        final NiuxzSocket socket;

        Worker(NiuxzSocket socket) {
            this.socket = socket;
        }

        @Override
        public void run() {
            try {
                while (true) {
                    // Blocks until one complete message package has been read, which is
                    // what solves the half-packet and sticky-packet problem.
                    SocketPackage sp = SocketPackage.parse(socket.getIn());
                    // Look up the handler that matches the behavior ID.
                    SocketHandler handler = handlers.get((int) sp.action);
                    handler.handle(sp, socket); // the handler writes the response back
                }
            } catch (Exception e) {
                LOG.error("connection interrupted by exception", e);
                destroy(socket);
            }
        }
    }
}
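import java.io.IOException;

/** Message handler interface (signature inferred from how Worker calls it). */
interface SocketHandler {
    void handle(SocketPackage sp, NiuxzSocket socket) throws IOException;
}

/** A message handler that receives the content and echoes it back. */
class EchoSocketHandler implements SocketHandler {
    /** Handle a socket request. */
    public void handle(SocketPackage sp, NiuxzSocket socket) throws IOException {
        sp.action = 10; // for example, behavior ID 10 in the protocol means the response succeeded
        socket.write(sp.toBytes()); // write straight back
    }
}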
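The worker looks up a handler by behavior ID, but the lookup table itself is never populated in the pseudocode. One way to register and dispatch handlers might look like the sketch below. `HandlerRegistry`, the two ID constants, and the use of `UnaryOperator<String>` (body in, response body out) are assumptions made to keep the example self-contained; they are not part of the original protocol.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical registry mapping a behavior ID to the code that handles it. */
class HandlerRegistry {
    static final byte ACTION_HEARTBEAT = 1; // assumed ID for heartbeat packets
    static final byte ACTION_ECHO = 2;      // assumed ID for echo requests

    private final Map<Byte, UnaryOperator<String>> handlers = new HashMap<>();

    /** Associate a behavior ID with a handler that maps a request body to a response body. */
    void register(byte action, UnaryOperator<String> handler) {
        handlers.put(action, handler);
    }

    /** Look up the handler for this behavior ID and return its response body. */
    String dispatch(byte action, String body) {
        UnaryOperator<String> handler = handlers.get(action);
        if (handler == null) {
            // Unknown behavior ID: fail loudly so the worker can destroy the connection
            // instead of hitting a NullPointerException.
            throw new IllegalArgumentException("no handler for behavior ID " + action);
        }
        return handler.apply(body);
    }
}
```

Checking for a missing handler matters here because a single byte can carry 256 values, and a peer running a newer protocol version may send an ID this side has never registered.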

The working code for both ends is now roughly complete, and the two sides can communicate, each in its own role.

3. Heartbeat mechanism:

A heartbeat mechanism is indispensable in persistent socket communication. It lets the active end check whether a socket is still alive and lets the passive end check whether the peer is still online. Networks are not perfect, and a link can fail without the application layer noticing. The failure only surfaces as an exception the next time the connection is used, and on the passive end the dead connection keeps occupying a thread for nothing. It is better to detect such failures earlier and destroy the connection: the next communication is then much less likely to fail, and the passive end gets its threads and resources back.

It can be implemented as follows. The active end runs a scheduled task that walks all connections in the connection pool and checks whether the time since each was last used exceeds the heartbeat interval. When it does, the socket is taken out of the pool and a heartbeat packet is sent on a separate thread (preferably from a thread pool).
@Scheduled(fixedDelay = 30 * 1000) // run 30 seconds after the previous run finishes
void heartBeat() {
    for (NiuxzSocket socket : socketPool.getAllSocket()) {
        // The socket has not been used for more than 30 seconds.
        if (System.currentTimeMillis() - socket.lastUse > 30 * 1000) {
            // Remove the connection from the pool first and continue only if the removal
            // succeeds, which guarantees that no other thread is using this socket.
            if (socketPool.remove(socket)) {
                // Ideally this part runs on a thread from a thread pool.
                try {
                    // Sends a SocketPackage whose behavior ID marks it as a heartbeat
                    // packet, for example 1.
                    socketClient.sendHeartBeat(socket);
                    socketPool.recycle(socket); // healthy: return it to the pool
                } catch (Exception e) {
                    socketPool.destroy(socket); // failed partway: destroy it
                }
            }
        }
    }
}
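The `@Scheduled` annotation above comes from Spring. Without Spring, the same scan can be driven by a plain `ScheduledExecutorService`, as in the sketch below. `TrackedConn` and `HeartbeatScanner` are hypothetical names, and `TrackedConn` only models the `lastUse` field of `NiuxzSocket`. The current time is passed in as a parameter so the selection logic can be checked without waiting for real time to pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for NiuxzSocket that only tracks the last use time. */
class TrackedConn {
    final String name;
    volatile long lastUse;

    TrackedConn(String name, long lastUse) {
        this.name = name;
        this.lastUse = lastUse;
    }
}

class HeartbeatScanner {
    private final long intervalMillis;

    HeartbeatScanner(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    /** Return the connections that have been idle longer than the heartbeat interval. */
    List<TrackedConn> due(List<TrackedConn> conns, long nowMillis) {
        List<TrackedConn> result = new ArrayList<>();
        for (TrackedConn c : conns) {
            if (nowMillis - c.lastUse > intervalMillis) {
                result.add(c);
            }
        }
        return result;
    }

    /** Run the given scan repeatedly, waiting delayMillis between the end of one run and the next. */
    static ScheduledExecutorService schedule(Runnable scan, long delayMillis) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleWithFixedDelay(scan, delayMillis, delayMillis, TimeUnit.MILLISECONDS);
        return ses;
    }
}
```

In use, the `Runnable` passed to `schedule` would call `due(...)` with `System.currentTimeMillis()` and send a heartbeat on each returned connection. Remember to call `shutdown()` on the returned executor when the client stops.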
Passive end: like the active end, the passive end scans its connections periodically. The difference is that a connection idle for longer than the specified timeout is not sent a heartbeat packet; it is destroyed directly. Once the socket is closed, the thread blocked reading from it gets an exception (or reads EOF, -1, if it was the peer that closed) and stops. The idle timeout must be greater than the heartbeat interval.

4. Instant resend:

Optimize the SocketClient so that when a send fails with an exception, it destroys the current socket and retries once or twice on another connection. If the send still fails after those retries, throw the exception.

5. Filling the gaps:

With the work above we have actually solved pitfalls 1, 3, and 4. The message length solves half packets and sticky packets. The heartbeat mechanism detects dead links. The client connection pool, or locking the socket, solves data disorder when several threads access one socket (locking works, but a connection pool is recommended because it raises throughput).

That leaves pitfalls 2 and 5. Both can be handled by a socket health check task (which can be merged with the heartbeat task by shortening that task's delay) that walks the connection pool, inspects the state of each connection, and decides whether to close the socket.

Take pitfall 2. During a read or write the peer may stall, fail to process the task for a long time, or hang entirely and never receive or write back anything. A timeout mechanism is what we want here. Fortunately Java's Socket has the setSoTimeout method: the active end can set the read timeout to 30 s, and if the passive end does not respond in time, a timeout exception (SocketTimeoutException) is thrown and we destroy the socket. Java's Socket does not provide a write timeout, however. If the passive end receives slowly, or for whatever reason stops receiving at all, the writing thread stays blocked. We certainly do not want that, so record the current time before writing and mark the socket as writing. The health check task then checks whether the socket is in the writing state and whether more than xx seconds have passed, and closes the socket if so.

Pitfall 5, closing idle sockets, is easier. Add a "last non-heartbeat packet sent" time to NiuxzSocket and have the health check task make its decision from that.
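The bookkeeping described for the health check task could be sketched as below. `ConnHealth` and its method names are hypothetical; in the article's design these fields would live in `NiuxzSocket`. Times are passed in explicitly so the rules can be exercised without real delays, and the timeout values are up to the caller.

```java
/** Hypothetical per-connection state that a health check task can inspect. */
class ConnHealth {
    private static final long NOT_WRITING = -1;

    private volatile long writeStartedAt = NOT_WRITING; // start time of the write in progress
    private volatile long lastBusinessUse;              // last non-heartbeat send

    ConnHealth(long nowMillis) {
        this.lastBusinessUse = nowMillis;
    }

    /** Call immediately before a blocking write. */
    void beginWrite(long nowMillis) {
        writeStartedAt = nowMillis;
    }

    /** Call after the write returns; heartbeats do not count as business use. */
    void endWrite(long nowMillis, boolean heartbeat) {
        writeStartedAt = NOT_WRITING;
        if (!heartbeat) {
            lastBusinessUse = nowMillis;
        }
    }

    /** True if a write has been in progress for longer than the allowed time. */
    boolean writeStuck(long nowMillis, long writeTimeoutMillis) {
        long started = writeStartedAt;
        return started != NOT_WRITING && nowMillis - started > writeTimeoutMillis;
    }

    /** True if no business message has been sent for longer than the idle timeout. */
    boolean idleTooLong(long nowMillis, long idleTimeoutMillis) {
        return nowMillis - lastBusinessUse > idleTimeoutMillis;
    }

    /** The health check closes the socket when either rule fires. */
    boolean shouldClose(long nowMillis, long writeTimeoutMillis, long idleTimeoutMillis) {
        return writeStuck(nowMillis, writeTimeoutMillis)
                || idleTooLong(nowMillis, idleTimeoutMillis);
    }
}
```

When `shouldClose` returns true, the health check thread calls `close()` on the underlying `java.net.Socket`. A thread blocked in I/O on that socket then fails with a `SocketException`, which is how the stuck writer gets released.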

The above is what I learned while implementing my first distributed file system with synchronous sockets. Some of these problems simply do not arise with NIO. NIO and AIO are better suited to servers that handle a large number of connections.
