Handling sticky packets and split packets with Python TCP sockets

Source: Internet
Author: User

Overview

In TCP socket programming, both sticky packets and split packets have to be handled. This article explains in detail how to solve the problem, using Python. The solution is simple: define an application-layer protocol of message header + message length + message body.

So what are sticky packets and split packets?

About split packets and sticky packets

Sticky packet: the sender sends two strings, "hello" and "world", but the receiver gets "helloworld" in a single read.

Split packet: the sender sends the string "helloworld", but the receiver gets it as two strings, "hello" and "world".

Although sockets behave this way, TCP does guarantee the following:

  • Order is preserved. If the sender sends "hello", the receiver receives the bytes of "hello" in the same order. This is a guarantee of the TCP protocol, and it is the key to solving the sticky and split packet problems.
  • No other data is inserted into a split packet.
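
The two guarantees above can be observed with a small experiment. This is a minimal sketch that assumes a connected stream socket pair from socket.socketpair() is an acceptable stand-in for a real TCP connection; how the bytes are grouped into chunks depends on timing and platform, so the code only relies on their order.

```python
import socket

# socketpair() returns two already-connected stream sockets, so no server is needed.
sender, receiver = socket.socketpair()

# Two separate sends on the sending side.
sender.sendall(b"hello")
sender.sendall(b"world")
sender.close()  # closing lets the receiver see end-of-stream

# Read until the peer closes. Each recv() may return any non-empty part of
# the remaining bytes, so the chunk boundaries are not predictable.
chunks = []
while True:
    chunk = receiver.recv(1024)
    if not chunk:
        break
    chunks.append(chunk)
receiver.close()

received = b"".join(chunks)
print(chunks)    # one chunk or several, depending on timing
print(received)  # the bytes in the order they were sent
```

The message boundaries of the two sends are not visible to the receiver, which is why the application has to define them itself.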

Therefore, socket communication requires a protocol. The most common layout today is: message header + message length + message body.

Why does TCP split packets?

TCP transmits data in segments. When a TCP connection is established, a maximum segment size (MSS) is set. If a message from the application layer is larger than the MSS, it is split into two or more segments for sending. The application layer on the receiving side then has to join the data from those segments before it can process the message correctly.

Relatedly, a network link has an MTU (maximum transmission unit), typically 1500 bytes. Subtracting the 20-byte IP header and the 20-byte TCP header leaves MTU - 40 bytes for TCP data, so the TCP MSS is typically 1500 - 40 = 1460 bytes.

When the application-layer data exceeds 1460 bytes, TCP sends it in multiple segments.
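
The arithmetic can be written out as a short sketch. It assumes an Ethernet-style MTU of 1500 bytes and IPv4 and TCP headers without options (20 bytes each); segments_needed is a hypothetical helper introduced only for this illustration, and real stacks may choose smaller segments.

```python
import math

MTU = 1500        # typical Ethernet MTU, in bytes
IP_HEADER = 20    # IPv4 header without options
TCP_HEADER = 20   # TCP header without options

mss = MTU - IP_HEADER - TCP_HEADER

def segments_needed(payload_size, mss=mss):
    """Return the minimum number of TCP segments for payload_size bytes."""
    return math.ceil(payload_size / mss)

print(mss)                    # 1460
print(segments_needed(1460))  # 1
print(segments_needed(1461))  # 2
print(segments_needed(4000))  # 3
```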

Additional reading

The TCP RFCs define a default MSS of 536 bytes, because RFC 791 requires every IP host to be able to receive a datagram of at least 576 bytes (576 is also often cited as the MTU of dial-up links), and 576 minus the 20-byte IP header and the 20-byte TCP header is 536.
Why does TCP stick packets?

TCP may use Nagle's algorithm to improve network utilization. With this algorithm, the sender delays sending when it has only a small amount of data to send, even though that data is ready. If the application passes data to TCP in quick succession, two application-layer messages get "glued" together and TCP sends them to the receiver in a single segment.
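
For reference, Nagle's algorithm can be switched off per socket with the TCP_NODELAY option. The sketch below only shows how the option is set and read back. It is not a fix for sticky packets: the receiver can still get several messages in one read, so the application-layer protocol is needed either way.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# TCP_NODELAY = 1 disables Nagle's algorithm on this socket.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# The value read back is non-zero when the option is enabled.
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(nodelay != 0)  # True
sock.close()
```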

Development Environment
  • Python version: 3.5.1
  • Operating System: Windows 10 x64
Message Header (including message length)

The message header does not have to be a single byte such as 0xAA. It can also carry a protocol version number, a command, and so on, and the message length can be part of it as well. The only requirement is that the header has a fixed length while the body length is variable. The following is a custom header:

Version (ver) | Message length (bodySize) | Command (cmd)

The version number, message length, and command are all unsigned 32-bit integers, so the header length is fixed at 4 × 3 = 12 bytes. Because Python has no fixed-width integer types, the struct module is generally used to build the header. Example:

import struct
import json

ver = 1
body = json.dumps(dict(hello="world"))
print(body)  # {"hello": "world"}
cmd = 101
# "!3I": network byte order, three unsigned 32-bit integers
header = [ver, len(body), cmd]
headPack = struct.pack("!3I", *header)
print(headPack)  # b'\x00\x00\x00\x01\x00\x00\x00\x12\x00\x00\x00e'
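
The receiving side reverses this step with struct.unpack. The following sketch round-trips a header to show that the format string really yields 12 bytes and that the three fields come back unchanged; the constant names HEADER_FORMAT and HEADER_SIZE are introduced here for illustration only.

```python
import json
import struct

HEADER_FORMAT = "!3I"                         # network byte order, three unsigned 32-bit ints
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # size in bytes of a packed header

body = json.dumps({"hello": "world"}).encode()  # body as bytes
head_pack = struct.pack(HEADER_FORMAT, 1, len(body), 101)

ver, body_size, cmd = struct.unpack(HEADER_FORMAT, head_pack)
print(HEADER_SIZE)          # 12
print(ver, body_size, cmd)  # 1 18 101
```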
About using a custom terminator to split packets

Some people prefer to separate packets with a custom terminator, so that no length field, or even no header at all, has to be sent. However, this makes parsing noticeably slower, because every byte read has to be checked against the terminator. We therefore recommend the message header + message length + message body layout.

In addition, with a custom terminator, if the terminator appears inside the message body, the message is cut off at that point and the data after it is misread. The terminator then has to be escaped, much as \r\n is written with backslash escapes. For these reasons, splitting packets with a terminator is not recommended.
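
The problem with a terminator inside the body can be seen in a few lines. This is an illustrative sketch only: it assumes a newline as the terminator, and split_frames is a hypothetical helper, not part of any library.

```python
DELIMITER = b"\n"  # assumed terminator for this illustration

def split_frames(buffer):
    """Split buffer on DELIMITER; return (complete_frames, leftover_bytes)."""
    *frames, leftover = buffer.split(DELIMITER)
    return frames, leftover

# Two messages, the second one only partially received.
frames1, leftover1 = split_frames(b"first\nsec")
print(frames1, leftover1)  # [b'first'] b'sec'

# One message whose body contains the terminator is cut into two bogus frames.
frames2, leftover2 = split_frames(b"line1\nline2" + DELIMITER)
print(frames2, leftover2)  # [b'line1', b'line2'] b''
```

With a length field in the header, the body can contain any bytes and no escaping is needed.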

Message Body

The message body can be JSON, which is commonly used to carry structured data. The code below uses {"hello": "world"} as test data, generated with Python's json module.
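
One detail is worth checking when the body is JSON: the length written into the header should be the number of bytes sent, not the number of characters in the string. With the default json.dumps settings the output is pure ASCII, so the two are equal, but they differ once non-ASCII text is emitted as is. A small sketch, assuming UTF-8 encoding and ensure_ascii=False:

```python
import json

text = json.dumps({"hello": "世界"}, ensure_ascii=False)
body = text.encode("utf-8")

print(len(text))  # 15 characters
print(len(body))  # 19 bytes, because each of the two CJK characters takes 3 bytes
print(json.loads(body.decode("utf-8")))  # {'hello': '世界'}
```

Taking the length of the encoded body avoids this mismatch in every case.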

Python example

The Python code below demonstrates how to handle TCP socket sticky and split packets. The core is a buffer, dataBuffer, which holds received data like a FIFO queue, together with a small inner while loop that decides what to do with it.

The process is as follows. Data read from the socket is appended to dataBuffer (pushed onto the queue), and then the inner loop starts. If dataBuffer holds fewer bytes than the header length (headerSize), the inner loop exits and the program goes back to receiving. Otherwise the header is read from the buffer to obtain the body length (bodySize). If the buffer holds fewer bytes than header plus body, the inner loop exits and receiving continues. Otherwise the body is read, the data is processed, and the header and body of that message are removed from dataBuffer.
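
The process just described can also be sketched as a pure function that works on bytes only, which makes it easy to try without a network. The names extract_messages and frame are hypothetical helpers for this illustration; the header layout is the 12-byte ver, bodySize, cmd header defined earlier.

```python
import struct

HEADER_FORMAT = "!3I"  # ver, bodySize, cmd as unsigned 32-bit ints, network byte order
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

def extract_messages(buffer):
    """Return (messages, remaining), where messages is a list of (header, body)."""
    messages = []
    while True:
        if len(buffer) < HEADER_SIZE:
            break  # header not complete yet
        header = struct.unpack(HEADER_FORMAT, buffer[:HEADER_SIZE])
        total = HEADER_SIZE + header[1]
        if len(buffer) < total:
            break  # body not complete yet (split packet)
        messages.append((header, buffer[HEADER_SIZE:total]))
        buffer = buffer[total:]  # drop the consumed message (sticky packet)
    return messages, buffer

def frame(ver, cmd, body):
    """Build one message: 12-byte header followed by the body bytes."""
    return struct.pack(HEADER_FORMAT, ver, len(body), cmd) + body

stream = frame(1, 101, b"abc") + frame(2, 102, b"defgh")

# Sticky case: both messages arrive in one chunk.
sticky_msgs, sticky_rest = extract_messages(stream)
print(sticky_msgs == [((1, 3, 101), b"abc"), ((2, 5, 102), b"defgh")], sticky_rest)  # True b''

# Split case: the chunk ends in the middle of the second body.
split_msgs, split_rest = extract_messages(stream[:29])
print(len(split_msgs), len(split_rest))  # 1 14
```

The bytes returned as remaining are kept and prepended to the next chunk read from the socket.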


Server code
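# Python Version: 3.5.1
import socket
import struct

HOST = ''
PORT = 1234

dataBuffer = bytes()
headerSize = 12

sn = 0

def dataHandle(headPack, body):
    global sn
    sn += 1
    print("Packet %s" % sn)
    print("ver: %s, bodySize: %s, cmd: %s" % headPack)
    print(body.decode())
    print("")

if __name__ == '__main__':
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((HOST, PORT))
        s.listen(1)
        conn, addr = s.accept()
        with conn:
            print('Connected by', addr)
            while True:
                data = conn.recv(1024)
                if not data:
                    break  # the client closed the connection
                # store the data in the buffer, similar to a push
                dataBuffer += data
                # The source listing is truncated after the first check below;
                # the rest of the loop follows the process described above.
                while True:
                    if len(dataBuffer) < headerSize:
                        print("Packet (%s bytes) is smaller than the header length; leaving the inner loop" % len(dataBuffer))
                        break
                    # read the header: '!' is network byte order, '3I' is three unsigned ints
                    headPack = struct.unpack('!3I', dataBuffer[:headerSize])
                    bodySize = headPack[1]
                    # split packet: leave the inner loop and keep receiving
                    if len(dataBuffer) < headerSize + bodySize:
                        print("Packet (%s bytes) is incomplete (%s bytes in total); leaving the inner loop" % (len(dataBuffer), headerSize + bodySize))
                        break
                    # read the body and process the message
                    body = dataBuffer[headerSize:headerSize + bodySize]
                    dataHandle(headPack, body)
                    # sticky packet: drop the consumed message, similar to a pop
                    dataBuffer = dataBuffer[headerSize + bodySize:]

Client code for testing the server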

The client code used to test sticky and split packets is shown below.

# Python Version: 3.5.1
import socket
import time
import struct
import json

host = "localhost"
port = 1234

ADDR = (host, port)

if __name__ == '__main__':
    client = socket.socket()
    client.connect(ADDR)

    # normal packet
    ver = 1
    body = json.dumps(dict(hello="world"))
    print(body)
    cmd = 101
    header = [ver, len(body), cmd]
    headPack = struct.pack("!3I", *header)
    sendData1 = headPack + body.encode()

    # split packet
    ver = 2
    body = json.dumps(dict(hello="world2"))
    print(body)
    cmd = 102
    header = [ver, len(body), cmd]
    headPack = struct.pack("!3I", *header)
    sendData2_1 = headPack + body[:2].encode()
    sendData2_2 = body[2:].encode()

    # sticky packet
    ver = 3
    body1 = json.dumps(dict(hello="world3"))
    print(body1)
    cmd = 103
    header = [ver, len(body1), cmd]
    headPack1 = struct.pack("!3I", *header)

    ver = 4
    body2 = json.dumps(dict(hello="world4"))
    print(body2)
    cmd = 104
    header = [ver, len(body2), cmd]
    headPack2 = struct.pack("!3I", *header)

    sendData3 = headPack1 + body1.encode() + headPack2 + body2.encode()

    # normal packet test
    client.send(sendData1)
    time.sleep(3)

    # split packet test
    client.send(sendData2_1)
    time.sleep(0.2)
    client.send(sendData2_2)
    time.sleep(3)

    # sticky packet test
    client.send(sendData3)
    time.sleep(3)
    client.close()
Server output

The test output is shown below. The receiver handles both sticky and split packets correctly.

Connected by ('127.0.0.1', 23297)
Packet 1
ver: 1, bodySize: 18, cmd: 101
{"hello": "world"}

Packet (0 bytes) is smaller than the header length; leaving the inner loop
Packet (14 bytes) is incomplete (31 bytes in total); leaving the inner loop
Packet 2
ver: 2, bodySize: 19, cmd: 102
{"hello": "world2"}

Packet (0 bytes) is smaller than the header length; leaving the inner loop
Packet 3
ver: 3, bodySize: 19, cmd: 103
{"hello": "world3"}

Packet 4
ver: 4, bodySize: 19, cmd: 104
{"hello": "world4"}
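
As a side note, the same protocol can also be read without a shared buffer, by asking the socket for exactly the number of bytes needed: first the 12-byte header, then bodySize bytes. This is an alternative technique, not the approach used in this article, and recv_exact and recv_message are hypothetical helper names. The sketch uses a connected socket pair in place of a real client and server.

```python
import socket
import struct

HEADER_FORMAT = "!3I"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

def recv_exact(sock, n):
    """Read exactly n bytes from sock; raise ConnectionError if the peer closes early."""
    parts = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("connection closed with %d bytes missing" % remaining)
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)

def recv_message(sock):
    """Read one (header, body) message using the 12-byte header layout."""
    header = struct.unpack(HEADER_FORMAT, recv_exact(sock, HEADER_SIZE))
    return header, recv_exact(sock, header[1])

# Demonstration: the message is sent in two pieces, as in the split packet test.
a, b = socket.socketpair()
body = b'{"hello": "world2"}'
packet = struct.pack(HEADER_FORMAT, 2, len(body), 102) + body
a.sendall(packet[:14])  # header plus the first two body bytes
a.sendall(packet[14:])
message = recv_message(b)
print(message)  # ((2, 19, 102), b'{"hello": "world2"}')
a.close()
b.close()
```

This style suits blocking sockets. With callback-based frameworks, where data is pushed to the application, the buffer approach shown in this article is the natural fit.
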
Handling sticky and split packets in a framework

In fact, whether you use a blocking or an asynchronous socket framework, the framework provides a data-received method for developers, who usually have to override it. The following example shows how to handle sticky and split packets in the Twisted framework. Only the core code is shown:

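# Twisted
import struct

from twisted.internet.protocol import Protocol

class MyProtocol(Protocol):
    _data_buffer = bytes()

    # code omitted

    def dataReceived(self, data):
        """Called whenever data is received."""
        self._data_buffer += data
        headerSize = 12

        # The source listing is truncated after the first check below;
        # the rest of the loop follows the same process as the server code.
        while True:
            if len(self._data_buffer) < headerSize:
                return
            # read the header and get the body length
            headPack = struct.unpack('!3I', self._data_buffer[:headerSize])
            bodySize = headPack[1]
            # split packet: wait for more data
            if len(self._data_buffer) < headerSize + bodySize:
                return
            body = self._data_buffer[headerSize:headerSize + bodySize]
            # dataHandle stands for the message handling code omitted above
            self.dataHandle(headPack, body)
            # sticky packet: drop the consumed message
            self._data_buffer = self._data_buffer[headerSize + bodySize:]

Summary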

That is all for handling sticky and split packets with Python TCP sockets. I hope it helps. If you find any shortcomings, please leave a comment. Thank you for your support!
