Iron Learning Python_day34_socket Module 2 and sticky-pack phenomenon
Sockets
A socket is a computer networking data structure that embodies the concept of a "communication endpoint" in the client/server (C/S) architecture.
Before any type of communication begins, a network application must create a socket.
Sockets can be compared to telephone jacks: without them, no communication is possible.
Sockets were originally created for applications on the same host, allowing one running program (a process) on the host to communicate with another running program.
This is called inter-process communication (IPC).
There are two types of sockets: file-based and network-oriented.
File-based: AF_UNIX
Address family: UNIX
Both processes run on the same computer, so these sockets are file-based, meaning that the file system supports their underlying infrastructure.
This makes sense because the file system is a shared constant among multiple processes running on the same host.
Network-based: AF_INET
Address family: Internet
AF_NETLINK (connectionless)
Python 2.5 introduced support for a special type of Linux socket.
The AF_NETLINK family of (connectionless) sockets allows IPC between user-level and kernel-level code using the standard BSD socket interface.
It is an alternative to more cumbersome approaches such as adding new system calls, /proc support, or ioctls to an operating system.
AF_TIPC (Transparent Inter-Process Communication)
Python 2.6 added another Linux feature: support for the Transparent Inter-Process Communication (TIPC) protocol.
TIPC allows machines in a cluster of computers to communicate with each other without using IP-based addressing.
Socket address: host-port pair
A network address consists of a host name and a port number; this pair is required for network communication.
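As a small illustration of a host-port pair, the sketch below binds a TCP socket to the loopback address and lets the operating system pick a free port (asking for port 0 is the usual way to request that). The helper name bound_address is invented for this example.

```python
import socket

def bound_address():
    """Bind to the loopback host on an OS-chosen port and return (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Port 0 asks the operating system to pick any free port.
        s.bind(('127.0.0.1', 0))
        # getsockname() returns the socket's own address as a (host, port) pair.
        return s.getsockname()

host, port = bound_address()
print(host, port)
```

The tuple returned by getsockname() has the same (host, port) shape that bind() and connect() expect for AF_INET sockets.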
Connection-oriented and connectionless sockets
1. Connection-oriented sockets
A connection must be established before communication can begin, much as when you call a friend over the telephone system.
This type of communication is also known as a virtual circuit or a stream socket.
Connection-oriented communication provides sequenced, reliable, and non-duplicated delivery of data, without record boundaries.
This means that each message may be split into multiple fragments, and every fragment is guaranteed to reach its destination.
The fragments are then put back together in order, and the complete message is delivered to the waiting application.
The primary protocol implementing this type of connection is the Transmission Control Protocol (TCP).
To create a TCP socket, you must use SOCK_STREAM as the socket type (stream).
2. Connectionless sockets
The datagram type is a connectionless socket: no connection needs to be established before communication begins.
Data delivery is not guaranteed to be sequenced, reliable, or free of duplicates.
Datagrams do preserve record boundaries, however: a message is sent as a whole rather than first being split into fragments.
Message transmission using datagrams can be likened to a postal service.
Letters and parcels may not arrive in the order in which they were sent, and may not arrive at all.
To complicate matters further, messages may even be duplicated in the network.
The primary protocol implementing this type of connection is the User Datagram Protocol (UDP).
To create a UDP socket, you must use SOCK_DGRAM as the socket type (datagram).
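To make the record-boundary property concrete, here is a minimal sketch that sends several datagrams over the loopback interface and reads them back one per recvfrom() call. The function name udp_roundtrip and the 2-second timeout are choices made for this example; on loopback, loss and reordering are unlikely but still not ruled out by the protocol.

```python
import socket

def udp_roundtrip(messages):
    """Send each message as one datagram over loopback; return what arrives."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(('127.0.0.1', 0))   # port 0: let the OS choose a port
        receiver.settimeout(2.0)          # do not hang forever if a datagram is lost
        addr = receiver.getsockname()
        for msg in messages:
            sender.sendto(msg, addr)      # no connect() is needed beforehand
        # Each recvfrom() call returns exactly one datagram.
        return [receiver.recvfrom(1024)[0] for _ in messages]
    finally:
        sender.close()
        receiver.close()

print(udp_roundtrip([b'hello', b'world']))
```

Each element of the returned list corresponds to one sendto() call, which is the boundary-preserving behaviour described above.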
The socket() module function
To create a socket, you must use the socket.socket() function. Its general syntax is:
socket(socket_family, socket_type, protocol=0)
Here socket_family is AF_UNIX or AF_INET, socket_type is SOCK_STREAM or SOCK_DGRAM, and protocol is usually omitted and defaults to 0.
Create a TCP socket: socket.socket(socket.AF_INET, socket.SOCK_STREAM)
Create a UDP socket: socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
Because the socket module has many attributes, importing with 'from module import *' is much more convenient here. With 'from socket import *' the call becomes: socket(AF_INET, SOCK_STREAM)
Once you have a socket object, all further interaction goes through that object's methods.
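The two socket types can be created side by side as follows. This is only a sketch: the helper names make_tcp_socket and make_udp_socket are invented here, and nothing is bound or connected.

```python
import socket

def make_tcp_socket():
    """Internet address family, stream type: a TCP socket."""
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM)

def make_udp_socket():
    """Internet address family, datagram type: a UDP socket."""
    return socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

tcp_sock = make_tcp_socket()
udp_sock = make_udp_socket()
print(tcp_sock.type, udp_sock.type)

# Sockets hold operating system resources, so close them when done.
tcp_sock.close()
udp_sock.close()
```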
Sticky packets
After several commands have been executed, you may receive only part of a command's output, and then receive the rest of it when you run another command.
This is the sticky packet phenomenon.
Note: only TCP has sticky packets; UDP never sticks.
(UDP is datagram-oriented and preserves message boundaries.)
Example: socket TCP SSH remote command execution

Server:

#!/usr/bin/env python
# _*_ coding:utf-8 _*_
from socket import *
import subprocess

"""Socket TCP SSH remote command execution: server"""

HOST = 'localhost'
PORT = 9527
ADDR = (HOST, PORT)
BUFSIZ = 1024

tcpss = socket(AF_INET, SOCK_STREAM)
tcpss.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
tcpss.bind(ADDR)
tcpss.listen(5)

while True:
    conn, addr = tcpss.accept()
    print('Accept client connections ...: ', addr)
    while True:
        # Receive the cmd command sent by the client
        cmd = conn.recv(BUFSIZ)
        # Exit the loop when the command is empty
        if not cmd: break
        # subprocess runs the external command in a child process;
        # with shell=True the command string is passed directly to the shell.
        # stdin is the program's standard input, stdout its standard output,
        # and stderr its standard error.
        result = subprocess.Popen(cmd.decode('utf-8'), shell=True,
                                  stdout=subprocess.PIPE,
                                  stdin=subprocess.PIPE,
                                  stderr=subprocess.PIPE)
        stderr = result.stderr.read()
        stdout = result.stdout.read()
        # Send standard error and standard output to the client
        conn.send(stderr)
        conn.send(stdout)
    conn.close()
tcpss.close()

Client:

#!/usr/bin/env python
# _*_ coding:utf-8 _*_
from socket import *

"""Socket TCP SSH remote command execution: client"""

HOST = 'localhost'
PORT = 9527
BUFSIZ = 1024
ADDR = (HOST, PORT)

tcpsc = socket(AF_INET, SOCK_STREAM)
tcpsc.connect(ADDR)

while True:
    cmd = input('Please enter a command to operate on the server >>> ').strip()
    if not cmd: break
    # Exit the loop when the client enters quit
    if cmd == 'quit': break
    tcpsc.send(cmd.encode('utf-8'))
    result = tcpsc.recv(BUFSIZ)
    # Note that the server operating system is Windows, so decode with GBK
    print(result.decode('gbk'), end='')
tcpsc.close()

When a command's output is short, no sticking occurs:

D:\Portablesoft\Python35\Python.exe E:/Python/Important Code/Socket-Remote execution commands/tcpsccmd.py
Please enter a command to operate on the server >>> dir
 Volume in drive E is VM
 Volume Serial Number is 4AE6-716D

 Directory of E:\Python\Important Code\Socket-Remote execution commands

2018-05-10  -: -    <DIR>          .
2018-05-10  -: -    <DIR>          ..
2018-05-10  -: -               532 tcpsccmd.py
2018-05-10  -: -             1,375 tcpsscmd.py
               2 File(s)          1,907 bytes
               2 Dir(s)  -,977,878,016 bytes free

When a command's output is long (more than 1024 bytes, so it cannot be shown in one go), packets stick:

D:\Portablesoft\Python35\Python.exe E:/Python/Important Code/Socket-Remote execution commands/tcpsccmd.py
Please enter a command to operate on the server >>> help
For more information on a specific command, type HELP command-name
ASSOC          Displays or modifies file name extension associations.
ATTRIB         Displays or changes file attributes.
BREAK          Sets or clears extended CTRL+C checking.
BCDEDIT        Sets properties in the boot database to control boot loading.
CACLS          Displays or modifies the access control lists (ACLs) of files.
CALL           Calls one batch program from another.
CD             Displays the name of or changes the current directory.
CHCP           Displays or sets the active code page number.
CHDIR          Displays the name of or changes the current directory.
CHKDSK         Checks a disk and displays a status report.
CHKNTFS        Displays or modifies the checking of the disk at boot time.
CLS            Clears the screen.
CMD            Starts a new instance of the Windows command interpreter.
COLOR          Sets the default console foreground and background colors.
COMP           Compares the contents of two files or sets of files.
COMPACT        Displays or alters the compression of files on NTFS partitions.
CONVERT        Converts FAT volumes to NTFS. You cannot convert the current drive.
COPY           Copies one or more files to another location.
DATE           Displays or sets the date.
DEL            Deletes one or more files.
DIR            Displays a list of files and subdirectories in a directory.
DISKCOMP       Please enter a command to operate on the server >>>

If you now enter another short command, what comes back is the part of the previous output that has not been shown yet:

DISKCOMP       Please enter a command to operate on the server >>> date
               Compares the contents of two floppy disks.
DISKCOPY       Copies the contents of one floppy disk to another.
DISKPART       Displays or configures disk partition properties.
DOSKEY         Edits command lines, recalls Windows commands, and creates macros.
DRIVERQUERY    Displays current device driver status and properties.
ECHO           Displays messages, or turns command echoing on or off.
ENDLOCAL       Ends localization of environment changes in a batch file.
ERASE          Deletes one or more files.
EXIT           Quits the CMD.EXE program (command interpreter).
FC             Compares two files or sets of files and displays the differences between them.
FIND           Searches for a text string in a file or files.
FINDSTR        Searches for strings in files.
FOR            Runs a specified command for each file in a set of files.
FORMAT         Formats a disk for use with Windows.
FSUTIL         Displays or configures the file system properties.
FTYPE          Displays or modifies the file types used in file name extension associations.
GOTO           Directs the Windows command interpreter to a labeled line in a batch program.
GPRESULT       Displays Group Policy information for the machine or user.
GRAFTABL       Enables Windows to display an extended character set in graphics mode.
HELP           Provides help information for Windows commands.
ICACLS         Please enter a command to operate on the server >>>
How TCP splits data into packets
When the data in the send buffer is longer than the MTU of the network card, TCP splits it into several packets and sends them separately.
MTU stands for maximum transmission unit: the size, in bytes, of the largest packet that can be transmitted over the network.
Most network devices have an MTU of 1500.
If the MTU of the local machine is larger than the MTU of the gateway,
large packets are broken up before being forwarded, which produces many fragments, increases packet loss, and reduces network speed.
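A rough back-of-the-envelope calculation shows how many TCP segments a payload needs for a given MTU. The sketch assumes IPv4 and TCP headers of 20 bytes each with no options, which leaves 1460 bytes of payload per segment at an MTU of 1500; the function name tcp_segment_count is invented for this example, and real stacks negotiate the segment size per connection.

```python
import math

def tcp_segment_count(payload_bytes, mtu=1500, ip_header=20, tcp_header=20):
    """Estimate how many TCP segments are needed for payload_bytes.

    Assumption: fixed 20-byte IPv4 and TCP headers (no options).
    """
    mss = mtu - ip_header - tcp_header   # maximum payload per segment
    return math.ceil(payload_bytes / mss)

for size in (1000, 1460, 1461, 4096):
    print(size, 'bytes ->', tcp_segment_count(size), 'segment(s)')
```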
Stream-oriented communication and the Nagle algorithm
TCP (Transmission Control Protocol) is connection-oriented and stream-oriented, and provides a highly reliable service.
The sender and the receiver (client and server) each hold one socket of a one-to-one pair.
To deliver many small packets to the receiver more efficiently,
the sender applies an optimization (the Nagle algorithm): pieces of data that are small and sent at short intervals are merged into one larger block, which is then sent as a single packet.
As a result, the receiver cannot tell the pieces apart unless a sound unpacking mechanism is provided.
In other words, stream-oriented communication does not preserve message boundaries.
For empty messages:
TCP is based on a data stream, so messages sent and received cannot be empty. Both the client and the server therefore need a mechanism for handling empty messages,
to prevent the program from hanging. UDP is datagram-based: even if you enter empty content (just pressing Enter), it can still be sent,
because the UDP protocol wraps the message in a header for you.
Reliable but sticky TCP: with TCP, data is not lost. If a packet is not received in full, the next receive continues from where the previous one stopped, and the sending end clears its buffer only after it receives the ACK. The data is reliable, but packets can stick.
The cause of sticky packets: characteristics of the TCP protocol
The sender may send data 1 KB at a time, while the receiving application may take the data out 2 KB at a time.
It may just as well take 3 KB or 6 KB at a time, or only a few bytes at a time.
In other words, the data the application sees is one whole, a stream.
The application cannot see how many bytes make up one message. This is why TCP is called a stream-oriented protocol, and it is also the cause of the sticky packet problem.
UDP, by contrast, is a message-oriented protocol, and each UDP datagram is one message.
The application must extract data one message at a time and cannot take an arbitrary number of bytes at a time, which is very different from TCP.
How is a message defined? You can regard the data the other side writes or sends in one call as one message.
What needs to be understood is that when the other side sends a message, however the lower layers fragment it,
the TCP layer puts the segments that make up the whole message back in order before presenting them in the kernel buffer.
For example, when a TCP-based socket client uploads a file to the server, the file content is sent as a stream of bytes, chunk after chunk.
The receiver, looking at that stream, has no idea where the file's bytes start and where they end.
In addition, sticky packets caused by the sender come from the TCP protocol itself.
To improve transmission efficiency, the sender often waits until it has collected enough data before sending a TCP segment.
If several consecutive sends each carry very little data, TCP will usually merge that data into a single TCP segment according to its optimization algorithm and send it in one go,
so the receiver receives stuck-together data.
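The effect can be reproduced on the loopback interface with a minimal sketch like the one below. Two messages are sent with separate calls, and the receiver simply collects whatever recv() hands back. The helper names tcp_pair and send_twice_and_collect are invented for this example. How the bytes are grouped into chunks depends on timing and on the operating system, so the only thing the code can rely on is the order of the bytes in the stream.

```python
import socket

def tcp_pair():
    """Return a connected (client, server_side) pair of TCP sockets on loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(('127.0.0.1', 0))          # port 0: let the OS choose
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    conn, _ = listener.accept()
    listener.close()
    return client, conn

def send_twice_and_collect(first, second):
    """Send two messages separately; return the chunks that recv() delivers."""
    client, conn = tcp_pair()
    try:
        client.sendall(first)
        client.sendall(second)
        client.close()                       # end of stream: recv() will return b''
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:
                break
            chunks.append(data)
        return chunks
    finally:
        client.close()
        conn.close()

chunks = send_twice_and_collect(b'hello', b'world')
# Often a single merged chunk, but the grouping is not guaranteed.
print(chunks)
```

Whatever the grouping turns out to be, joining the chunks yields the two messages run together with nothing to mark where the first one ended.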
UDP does not produce sticky packets
UDP (User Datagram Protocol) is connectionless and message-oriented, and provides an efficient service.
It does not use a block-merging optimization algorithm. Because UDP supports a one-to-many pattern,
the receiver's skbuff (socket buffer) uses a chained structure to record each UDP packet that arrives,
and every UDP packet carries a message header (source address, port, and other information).
This makes it easy for the receiver to tell the packets apart and process them.
That is, message-oriented communication preserves message boundaries.
For empty messages:
TCP is based on a data stream, so messages sent and received cannot be empty.
Both the client and the server therefore need to handle empty messages to prevent the program from hanging.
UDP is datagram-based: even if you enter empty content (just pressing Enter), it can still be sent,
because the UDP protocol wraps the message in a header for you.
Unreliable but non-sticky UDP:
UDP's recvfrom is blocking, and one recvfrom(x) must correspond to exactly one sendto(y).
Once x bytes have been received the call is complete; if y > x, the remaining y - x bytes are lost. UDP therefore never sticks, but it can lose data and is unreliable.
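The following sketch tries out the case y > x on loopback. A 10-byte datagram is read with a 4-byte buffer, and then a second datagram is read normally. The function name short_read_then_next is invented here. The behaviour of the short read is platform dependent: some systems silently truncate, while others (Windows, for example) raise an error, so the code allows for both.

```python
import socket

def short_read_then_next():
    """Read a 10-byte datagram with a 4-byte buffer, then read the next datagram."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(('127.0.0.1', 0))
        receiver.settimeout(2.0)                 # avoid blocking forever
        addr = receiver.getsockname()
        sender.sendto(b'0123456789', addr)       # y = 10 bytes
        sender.sendto(b'next', addr)
        try:
            first = receiver.recvfrom(4)[0]      # x = 4 bytes
        except OSError:
            # Assumption: this platform reports an error instead of truncating.
            first = None
        # The leftover bytes of the first datagram do not show up here.
        second = receiver.recvfrom(1024)[0]
        return first, second
    finally:
        sender.close()
        receiver.close()

print(short_read_then_next())
```

Either way, the second read starts at the next datagram rather than at the unread remainder of the first one, which is the opposite of what a TCP stream does.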
When sending with UDP, the maximum data length that the sendto function can send is 65535 - IP header (20) - UDP header (8) = 65507 bytes.
If the data passed to sendto is longer than this value, the function returns an error (the packet is discarded and not sent).
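The limit can be checked with a short sketch. It computes the figure from the header sizes and then tries to send a payload one byte too large; in Python the failure surfaces as an OSError. The function name oversized_sendto_fails is invented, port 9527 is just the port used elsewhere in these notes, and some systems are configured with a lower limit than 65507, which is why only the oversized case is exercised.

```python
import socket

IP_HEADER = 20
UDP_HEADER = 8
MAX_UDP_PAYLOAD = 65535 - IP_HEADER - UDP_HEADER

def oversized_sendto_fails():
    """Return True if sending MAX_UDP_PAYLOAD + 1 bytes is rejected."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        try:
            s.sendto(b'x' * (MAX_UDP_PAYLOAD + 1), ('127.0.0.1', 9527))
        except OSError:
            # The datagram is refused locally and never leaves the machine.
            return True
        return False

print(MAX_UDP_PAYLOAD, oversized_sendto_fails())
```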
When sending with TCP, there is no packet size limit (leaving the buffer size aside), because TCP is a data stream protocol.
This means that the data length passed to the send function is not restricted.
In practice, though, the given data is not necessarily sent in one go: if it is long it is sent in segments,
and if it is short it may wait and be sent together with the next piece of data.
Sticky packets occur in two cases.
Case one: the sender's caching mechanism
The sender waits until its buffer is full before sending, which causes sticky packets
(the data is sent at very short intervals and each piece is very small, so the pieces are joined together into one sticky packet).
Case two: the receiver's caching mechanism
The receiver does not take packets out of its buffer in time, so several packets are received together
(the client sends a piece of data and the server takes only a small part of it; the next time the server receives, it gets the data left over in the buffer, which produces a sticky packet).
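Case two is easy to reproduce. In the sketch below the receiver reads with a buffer that is smaller than the message, so each recv() picks up where the previous one stopped. To keep the example short it uses socket.socketpair(), which returns two already connected stream sockets; they are not necessarily TCP sockets, but they show the same byte-stream behaviour. The function name read_in_small_pieces is invented.

```python
import socket

def read_in_small_pieces(message, piece_size):
    """Send message once, then read it back at most piece_size bytes at a time."""
    a, b = socket.socketpair()
    try:
        a.sendall(message)
        a.close()                      # signal end of stream to the reader
        pieces = []
        while True:
            piece = b.recv(piece_size)
            if not piece:              # b'' means the peer closed the connection
                break
            pieces.append(piece)
        return pieces
    finally:
        a.close()
        b.close()

print(read_in_small_pieces(b'0123456789', 4))
```

A command-execution client that calls recv(1024) once per command behaves the same way: anything beyond the first 1024 bytes stays in the buffer and is returned by the next call.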
Summary
Sticky packets occur only with the TCP protocol:
1. On the surface, the sticky packet problem is mainly due to the caching mechanisms of the sender and receiver and to the stream-oriented nature of TCP.
2. In fact, the main reason is that the receiver does not know where the boundaries between messages are, and so does not know how many bytes to take at a time.
Solutions to sticky packets
Solution one
The root of the problem is that the receiver does not know the length of the byte stream the sender is about to transmit.
So the fix revolves around letting the sender tell the receiver the total number of bytes before it sends the data,
after which the receiver runs a loop until it has received all of the data.
Problems with this approach:
The program runs much faster than the network transmits, so sending the length of the byte stream with a separate send before each piece of data
amplifies the performance loss caused by network latency.
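A minimal sketch of this first approach might look as follows. The sender announces the length as text, waits for a short acknowledgement so that the length cannot run together with the data, and only then sends the payload; the receiver loops until it has the announced number of bytes. The names send_with_length, recv_with_length and demo, the b'ok' acknowledgement, and the use of socket.socketpair() with a helper thread are all choices made for this illustration.

```python
import socket
import threading

def send_with_length(sock, data):
    sock.sendall(str(len(data)).encode('utf-8'))   # 1. announce the size
    sock.recv(1024)                                # 2. wait for the go-ahead
    sock.sendall(data)                             # 3. send the real data

def recv_with_length(sock):
    total = int(sock.recv(1024).decode('utf-8'))   # 1. learn the size
    sock.sendall(b'ok')                            # 2. acknowledge it
    received = b''
    while len(received) < total:                   # 3. loop until complete
        chunk = sock.recv(min(4096, total - len(received)))
        if not chunk:
            raise ConnectionError('connection closed early')
        received += chunk
    return received

def demo(data):
    """Run sender and receiver over a connected pair of stream sockets."""
    a, b = socket.socketpair()
    result = []
    worker = threading.Thread(target=lambda: result.append(recv_with_length(b)))
    worker.start()
    send_with_length(a, data)
    worker.join()
    a.close()
    b.close()
    return result[0]

print(len(demo(b'x' * 5000)))
```

The extra round trip in step 2 is exactly the cost mentioned above: every message now pays for one more network exchange before any payload moves.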
Solution two (advanced)
We can use the struct module, which converts the length of the data to be sent into a fixed number of bytes.
Each time, before receiving a message, the client first takes this fixed-length content to find out the size of the message that follows.
It then stops receiving once the received data reaches that size, so the complete message is obtained with nothing more and nothing less.
The struct module
This module can convert a value of some type, such as a number, into a fixed-length bytes object.
>>> struct.pack('i', 1111111111111)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647  # this is the valid range
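A few lines are enough to try this out. The sketch packs a length with the 'i' format, unpacks it again, and checks whether a number fits; the helper name fits_in_i is invented for this example.

```python
import struct

packed = struct.pack('i', 1024)          # a number becomes a fixed-length bytes object
print(len(packed), packed)

(value,) = struct.unpack('i', packed)    # unpack() always returns a tuple
print(value)

def fits_in_i(number):
    """Return True if number can be packed with the 'i' format."""
    try:
        struct.pack('i', number)
        return True
    except struct.error:
        return False

print(fits_in_i(2147483647), fits_in_i(1111111111111))
```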
Using struct to solve sticky packets
With the struct module, the length can be converted into a standard-size 4-byte number.
This makes it possible to send the data length in advance.
Sender: before sending the data, send its length packed by struct into 4 bytes.
Receiver: first receive 4 bytes, unpack them with struct to get the length of the data, then receive that many bytes.
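Put together, the sender and receiver steps can be sketched as a pair of helper functions. The names send_msg, recv_msg and recv_exact are invented for this example, and socket.socketpair() stands in for a real client/server connection. Note that recv_exact loops, because even a 4-byte read is not guaranteed to arrive in a single recv() call.

```python
import socket
import struct

HEADER = struct.Struct('i')   # 4-byte length prefix, the same format as in the text

def recv_exact(sock, n):
    """Receive exactly n bytes, however the stream happens to be chunked."""
    buf = b''
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError('socket closed before the full message arrived')
        buf += chunk
    return buf

def send_msg(sock, payload):
    """Send the packed length first, then the payload itself."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_msg(sock):
    """Read the 4-byte length, then exactly that many payload bytes."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)

a, b = socket.socketpair()
send_msg(a, b'dir')
send_msg(a, b'ipconfig /all')     # sent back to back, so the bytes may arrive together
print(recv_msg(b))
print(recv_msg(b))
a.close()
b.close()
```

Because the receiver never asks for more bytes than the header announces, two messages that arrive in the same chunk are still separated correctly.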
We can also turn the header into a dictionary that holds details of the real data about to be sent,
serialize it with JSON, and use struct to pack the length of the serialized data into 4 bytes (4 bytes is enough).
Sender: send the header length first, then the encoded header content, and finally the real content.
Receiver: first receive the header length and unpack it with struct, then receive the header by that length, decode it, and deserialize it.
Finally, extract the details of the data from the deserialized result and receive the real content.
Example: transferring a large file while solving the sticky packet problem

Server:

#!/usr/bin/env python
# _*_ coding:utf-8 _*_
"""socket + struct: transfer a large file and solve sticky packets (TCP server)"""
import os
import json
import socket
import struct

server = socket.socket()
server.bind(('127.0.0.1', 9527))
server.listen()
conn, addr = server.accept()

# File path, file name, and size of the file to transfer
filepath = r'E:\Python\file\three-body.txt'
filename = os.path.basename(filepath)
filesize = os.path.getsize(filepath)
dic = {'filename': filename, 'filesize': filesize}

# Serialize the dictionary to JSON and encode it to bytes
str_dic = json.dumps(dic).encode('utf-8')
# Compute the byte length of the JSON and pack it into 4 bytes with struct
len_dic = struct.pack('i', len(str_dic))
# sendall() keeps sending until every byte has been written
conn.sendall(len_dic)   # send the length of the JSON
conn.sendall(str_dic)   # send the JSON

with open(filepath, 'rb') as f:
    while filesize:
        content = f.read(4096)
        conn.sendall(content)
        filesize -= len(content)

conn.close()
server.close()

Client:

#!/usr/bin/env python
# _*_ coding:utf-8 _*_
"""socket + struct: transfer a large file and solve sticky packets (TCP client)"""
import json
import struct
import socket

client = socket.socket()
client.connect(('127.0.0.1', 9527))

# Receive the 4-byte header length, then the JSON header itself
dic_len = client.recv(4)
dic_len = struct.unpack('i', dic_len)[0]
dic = client.recv(dic_len)
str_dic = dic.decode('utf-8')
dic = json.loads(str_dic)

with open(dic['filename'], 'wb') as f:
    while dic['filesize']:
        content = client.recv(4096)
        # Stop if the server closed the connection early
        if not content: break
        dic['filesize'] -= len(content)
        f.write(content)

client.close()
Socket methods
Server socket functions
s.bind()         Bind an address (host, port) to the socket
s.listen()       Start the TCP listener
s.accept()       Passively accept a TCP client connection, waiting (blocking) until one arrives
Client socket functions
s.connect()      Actively initiate a connection to the TCP server
s.connect_ex()   Extended version of connect() that returns an error code instead of raising an exception when an error occurs
General-purpose socket functions
s.recv()         Receive TCP data
s.send()         Send TCP data
s.sendall()      Send TCP data in full
s.recvfrom()     Receive UDP data
s.sendto()       Send UDP data
s.getpeername()  Address of the remote end connected to the current socket
s.getsockname()  Address of the current socket
s.getsockopt()   Return the value of the given socket option
s.setsockopt()   Set the value of the given socket option
s.close()        Close the socket
Blocking-oriented socket methods
s.setblocking()  Set the socket to blocking or non-blocking mode
s.settimeout()   Set the timeout for blocking socket operations
s.gettimeout()   Get the timeout for blocking socket operations
File-oriented socket functions
s.fileno()       File descriptor of the socket
s.makefile()     Create a file object associated with the socket
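As one example from this list, s.settimeout() and s.gettimeout() can be tried on a connected pair of sockets where nothing is ever sent, so the blocking recv() has to give up. The function name recv_with_timeout and the return strings are invented for this sketch.

```python
import socket

def recv_with_timeout(seconds):
    """Wait up to `seconds` for data that never comes; report what happened."""
    a, b = socket.socketpair()
    try:
        a.settimeout(seconds)            # applies to blocking operations on `a`
        try:
            a.recv(1024)                 # nothing was sent, so this call waits
        except socket.timeout:
            return 'timed out', a.gettimeout()
        return 'got data', a.gettimeout()
    finally:
        a.close()
        b.close()

print(recv_with_timeout(0.25))
```

Without the timeout the same recv() call would block indefinitely, which is the default behaviour of a newly created socket.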
The official documentation describes socket.send() and socket.sendall() in the socket module as follows:
socket.send(string[, flags])
Send data to the socket. The socket must be connected to a remote socket. The optional flags argument has the same meaning as for recv() above. Returns the number of bytes sent. Applications are responsible for checking that all data has been sent; if only some of the data was transmitted, the application needs to attempt delivery of the remaining data.
The return value of send() is the number of bytes sent, which may be less than the number of bytes you asked to send; in other words, not all of the data in string is necessarily sent. An exception is raised if an error occurs.
socket.sendall(string[, flags])
Send data to the socket. The socket must be connected to a remote socket. The optional flags argument has the same meaning as for recv() above. Unlike send(), this method continues to send data from string until either all data has been sent or an error occurs. None is returned on success. On error, an exception is raised, and there is no way to determine how much data, if any, was successfully sent.
sendall() tries to send all of the data in string. It returns None on success and raises an exception on failure.
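The difference between the two calls can be illustrated with a loop around send() that does roughly what sendall() does for you. This is a sketch for illustration, not the library's actual implementation; the function name send_all is invented.

```python
import socket

def send_all(sock, data):
    """Call send() repeatedly until every byte of data has been written."""
    view = memoryview(data)          # allows slicing without copying the bytes
    total = 0
    while total < len(view):
        sent = sock.send(view[total:])   # may accept fewer bytes than offered
        total += sent
    return total

a, b = socket.socketpair()
print(send_all(a, b'abc' * 100))
a.close()
b.close()
```

With a plain send(), the caller has to do this bookkeeping whenever the return value is smaller than the length of the data; sendall() hides the loop but, as the documentation notes, cannot report how much was sent if an error occurs.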
End
2018-5-10
Reference:
http://www.cnblogs.com/Eva-J/
The fourth edition of Python core programming