Socket
Sockets make it possible to develop software with a client/server (C/S) architecture.
Note: a complete computer system consists of three parts: hardware, an operating system, and application software. With these three in place, a computer can work on its own. To communicate with other computers, however, it has to go onto the Internet (in essence, the Internet is a set of network protocols).
The core of the Internet is a stack of protocols. A protocol is a standard, in the same way that English serves as a common standard for communication around the world. Once every computer follows the Internet protocols, all computers can send and receive information according to one unified standard and so communicate with each other.
Purpose of the Internet protocols: they define the standard for how computers connect to the Internet and for how computers connected to the Internet communicate.
OSI seven-layer model:
- Application Layer
- Presentation Layer
- Session Layer
- Transport Layer
- Network Layer
- Data Link Layer
- Physical Layer
Physical layer function:
It sends electrical signals based on electrical characteristics: a high voltage corresponds to the digit 1 and a low voltage corresponds to the digit 0.
Data link layer function:
It defines how electrical signals are grouped.
Ethernet protocol:
In the early days each company had its own way of grouping signals. A unified standard emerged later: the Ethernet protocol.
Ethernet rules:
A group of electrical signals forms a packet called a "frame". Each frame is divided into two parts, a header (head) and data:
head | data
Head (18 bytes of fixed overhead) contains:
- Receiver/destination address: 6 bytes
- Sender/source address: 6 bytes
- Data type: 2 bytes
- Frame check sequence: 4 bytes (transmitted after the data)
Data: at least 46 bytes, at most 1500 bytes
Frame size = head length + data length: at least 64 bytes, at most 1518 bytes. Data beyond the maximum is split into fragments and sent separately.
MAC address:
The head contains a source address and a destination address. Where do they come from? Ethernet requires every device that joins the network to have a network interface card (NIC), and the sender and receiver addresses are the addresses of these cards, known as MAC addresses.
MAC address: each NIC is given a globally unique MAC address when it leaves the factory. It is a 48-bit binary number, usually written as 12 hexadecimal digits (the first six digits are the manufacturer number and the last six are the serial number).
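The split into manufacturer number and serial number can be sketched in a few lines of Python. This is only an illustration: the helper name `split_mac` and the sample address are made up for the example.

```python
def split_mac(mac):
    """Split a MAC address into (manufacturer digits, serial digits)."""
    digits = mac.replace(":", "").replace("-", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError("not a 48-bit MAC address: %r" % (mac,))
    # 12 hexadecimal digits * 4 bits each = 48 bits
    return digits[:6], digits[6:]


# Hypothetical address, used only for illustration
manufacturer, serial = split_mac("00:1A:2B:3C:4D:5E")
bit_length = len(manufacturer + serial) * 4
```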
Broadcasting:
With MAC addresses, two hosts on the same network can communicate (one host obtains the other host's MAC address through the ARP protocol).
Ethernet communicates in the most primitive way possible, by broadcasting. In other words, computers basically communicate by shouting.
Network layer
The worldwide Internet is made up of many small local area networks that are isolated from one another (one LAN = one broadcast domain; traffic between broadcast domains is forwarded by routers).
Network layer function: it introduces a new set of addresses that distinguish different broadcast domains/subnets. These are network addresses.
IP protocol:
The protocol that defines network addresses is called the IP protocol, and the addresses it defines are called IP addresses. The widely used version 4 (IPv4) specifies that a network address is a 32-bit binary number.
Range: 0.0.0.0 to 255.255.255.255
An IP address is usually written as four decimal numbers separated by dots, for example 172.16.10.1. An IP address is divided into two parts:
- Network part: identifies the subnet
- Host part: identifies the host
Note: a bare IP address only identifies the kind of address. It does not show where the network part ends and the host part begins, so the subnet an IP belongs to cannot be determined from the address alone.
Example: from the addresses alone, we cannot tell whether 172.16.10.1 and 172.16.10.2 are in the same subnet.
Subnet mask
A "subnet mask" is a parameter that describes a subnet. It has the same form as an IP address: a 32-bit binary number in which the network part is all 1s and the host part is all 0s. For example, for the IP address 172.16.10.1, if the network part is known to be the first 24 bits and the host part the last 8 bits, the subnet mask is 11111111.11111111.11111111.00000000, which is 255.255.255.0 in decimal.
Knowing the subnet mask, we can determine whether any two IP addresses are in the same subnet. Apply a bitwise AND between each IP address and the subnet mask (a result bit is 1 only if both input bits are 1, otherwise 0) and compare the results. If they are equal, the two addresses are in the same subnet; otherwise they are not.
For example, given that 172.16.10.1 and 172.16.10.2 both have the subnet mask 255.255.255.0, are they on the same subnet? AND each address with the mask:
172.16.10.1:   10101100.00010000.00001010.00000001
255.255.255.0: 11111111.11111111.11111111.00000000
AND result (network address): 10101100.00010000.00001010.00000000 -> 172.16.10.0
172.16.10.2:   10101100.00010000.00001010.00000010
255.255.255.0: 11111111.11111111.11111111.00000000
AND result (network address): 10101100.00010000.00001010.00000000 -> 172.16.10.0
Both results are 172.16.10.0, so the two addresses are on the same subnet.
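The same AND comparison can be written as a short Python sketch using the standard `ipaddress` module. The function names `network_address` and `same_subnet` are chosen for this example only.

```python
import ipaddress


def network_address(ip, mask):
    """Bitwise-AND an IPv4 address with a subnet mask, both in dotted form."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))


def same_subnet(ip_a, ip_b, mask):
    """Two addresses share a subnet when their AND results are equal."""
    return network_address(ip_a, mask) == network_address(ip_b, mask)


result = same_subnet("172.16.10.1", "172.16.10.2", "255.255.255.0")
```

With a different third octet, for instance 172.16.11.2, the AND results differ and the function reports that the hosts are on different subnets.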
To summarize, the IP protocol has two main functions: one is to assign an IP address to every computer, and the other is to determine which addresses are in the same subnet.
IP packets
IP packets are also divided into a head and a data part. No separate field has to be defined for them: the whole IP packet is placed directly in the data part of an Ethernet frame.
Head: 20 to 60 bytes long
Data: at most 65,515 bytes
The data part of an Ethernet frame holds at most 1500 bytes. An IP packet larger than 1500 bytes therefore has to be split across several Ethernet frames, which are sent separately.
Ethernet head | IP head | IP data
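As a rough illustration of the splitting, the sketch below estimates how many frames an IP packet needs. It assumes a 1500-byte frame data limit, a 20-byte IP head repeated in every fragment, and the IPv4 rule that fragment payloads (except the last) are multiples of 8 bytes. The function name and defaults are assumptions made for this example.

```python
def fragments_needed(data_len, mtu=1500, ip_head_len=20):
    """Estimate how many Ethernet frames carry an IP packet with data_len bytes of data."""
    # Every fragment carries its own IP head, so only mtu - ip_head_len bytes
    # are left for data, rounded down to a multiple of 8.
    per_fragment = (mtu - ip_head_len) // 8 * 8
    if data_len <= per_fragment:
        return 1
    return -(-data_len // per_fragment)  # ceiling division


largest = fragments_needed(65515)  # the maximum data size quoted above
```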
ARP protocol
Why ARP exists: computers basically communicate by shouting, that is, by broadcasting. Every packet from the upper layers ends up wrapped in an Ethernet head and sent via the Ethernet protocol. As discussed for Ethernet, communication relies on MAC-based broadcast. When sending a packet, a computer can easily find its own MAC address, but to obtain the target host's MAC address it needs the ARP protocol.
ARP function: send a packet by broadcast to obtain the MAC address of the target host.
How the protocol works: the IP address of every host is known.
Example: host 172.16.10.10/24 accesses 172.16.10.11/24
One: first use the IP addresses and subnet mask to work out which subnets the two hosts are in.
Scenario | Packet addresses
Same subnet | target host MAC, target host IP
Different subnets | gateway MAC, target host IP
Two: 172.16.10.10/24 and 172.16.10.11/24 turn out to be on the same network (if they were not, the target IP in the table below would be 172.16.10.1, and ARP would be used to get the gateway's MAC address).
 | Source MAC | Target MAC | Source IP | Target IP | Data
Sending host | sender's MAC | FF:FF:FF:FF:FF:FF | 172.16.10.10/24 | 172.16.10.11/24 | data
Three: this packet is broadcast within the sender's subnet. Every host that receives it unpacks it, and the host that finds its own IP as the target IP responds with its MAC address.
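Steps one and two amount to a small decision: resolve the target's own MAC address when it is on the same subnet, otherwise resolve the gateway's. A minimal sketch follows, with a made-up helper name `arp_target` and the gateway address 172.16.10.1 from the example above.

```python
import ipaddress


def arp_target(src_cidr, dst_ip, gateway_ip):
    """Return the IP whose MAC address the sender should ask for via ARP."""
    src = ipaddress.ip_interface(src_cidr)   # e.g. 172.16.10.10/24
    dst = ipaddress.ip_address(dst_ip)
    if dst in src.network:
        return str(dst)                      # same subnet: ask for the target itself
    return str(ipaddress.ip_address(gateway_ip))  # different subnet: ask for the gateway


same = arp_target("172.16.10.10/24", "172.16.10.11", "172.16.10.1")
other = arp_target("172.16.10.10/24", "172.16.20.11", "172.16.10.1")
```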
Using the socket() module function
    import socket

    # Signature: socket.socket(socket_family, socket_type, protocol=0)
    # socket_family can be AF_UNIX or AF_INET. socket_type can be SOCK_STREAM or SOCK_DGRAM.
    # protocol is usually omitted; it defaults to 0.

    # Get a TCP/IP socket
    tcpSock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    # Get a UDP/IP socket
    udpSock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    # The socket module has a great many attributes, so we make an exception here and
    # use a 'from module import *' statement. 'from socket import *' brings every
    # attribute of the socket module into our namespace, which shortens the code a lot.
    # For example: tcpSock = socket(AF_INET, SOCK_STREAM)
Server-side socket functions
s.bind() binds an address (host, port) to the socket
s.listen() starts listening for TCP connections
s.accept() passively accepts a TCP client connection, blocking until one arrives
Client-side socket functions
s.connect() actively initiates a connection to a TCP server
s.connect_ex() extended version of connect() that returns an error code instead of raising an exception when an error occurs
Socket functions for general use
s.recv() receives TCP data
s.send() sends TCP data (when the data is larger than the free space in the send buffer, only part of it may be sent; send() returns the number of bytes actually sent, and the caller has to send the rest)
s.sendall() sends all of the TCP data (in essence it calls send in a loop; when the data is larger than the free space in the buffer nothing is lost, because send keeps being called until everything has gone out)
s.recvfrom() receives UDP data
s.sendto() sends UDP data
s.getpeername() returns the remote address the socket is connected to
s.getsockname() returns the socket's own address
s.getsockopt() returns the value of the given socket option
s.setsockopt() sets the value of the given socket option
s.close() closes the socket
Blocking-related socket methods
s.setblocking() sets the socket to blocking or non-blocking mode
s.settimeout() sets the timeout for blocking socket operations
s.gettimeout() returns the timeout for blocking socket operations
File-oriented socket functions
s.fileno() returns the socket's file descriptor
s.makefile() creates a file object associated with the socket
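A few of the calls listed above can be tried without any peer. The sketch below binds to port 0 on the loopback address (so the operating system picks a free port) and reads back what was set. The function name `inspect_socket` and the 2-second timeout are arbitrary choices for the example.

```python
import socket


def inspect_socket():
    """Create a TCP socket, exercise some option and address calls, and report the results."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # setsockopt / getsockopt: allow quick reuse of the local address
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        reuse = s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)

        # bind / getsockname: port 0 lets the OS choose a free port
        s.bind(("127.0.0.1", 0))
        host, port = s.getsockname()

        # settimeout / gettimeout: blocking calls give up after 2 seconds
        s.settimeout(2.0)

        return {
            "reuse_enabled": reuse != 0,
            "host": host,
            "port": port,
            "timeout": s.gettimeout(),
            "fd": s.fileno(),
        }
    finally:
        s.close()


info = inspect_socket()
```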
Server side:
# Create a socket for the connection: socket.socket(AF_INET, SOCK_STREAM)
TCP-based sockets
TCP is connection-based: the server must be started first, and then the client is started to connect to it.
TCP server (pseudocode)
    ss = socket()                # create the server socket
    ss.bind()                    # bind the address to the socket
    ss.listen()                  # listen for connections
    inf_loop:                    # server loops forever
        cs = ss.accept()         # accept a client connection
        comm_loop:               # communication loop
            cs.recv()/cs.send()  # dialogue (receive and send)
        cs.close()               # close the client socket
    ss.close()                   # close the server socket (optional)
TCP client (pseudocode)
    cs = socket()                # create the client socket
    cs.connect()                 # try to connect to the server
    comm_loop:                   # communication loop
        cs.send()/cs.recv()      # dialogue (send/receive)
    cs.close()                   # close the client socket
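The outline above can be turned into a small runnable sketch. To keep it self-contained, the server runs in a background thread, listens on the loopback address with an OS-chosen port, and serves a fixed number of clients instead of looping forever. The names `run_echo_server` and `echo_once` are invented for this example.

```python
import socket
import threading


def run_echo_server(server_sock, max_clients=1):
    """Accept max_clients connections and echo back whatever each one sends."""
    for _ in range(max_clients):                # stands in for the infinite loop
        conn, _addr = server_sock.accept()      # blocks until a client connects
        with conn:
            while True:                         # communication loop
                data = conn.recv(1024)
                if not data:                    # empty bytes: the client closed
                    break
                conn.sendall(data)


def echo_once(message):
    """Start a server, send one message from a client, and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server_sock.bind(("127.0.0.1", 0))      # port 0: let the OS pick
        server_sock.listen(5)
        port = server_sock.getsockname()[1]

        worker = threading.Thread(target=run_echo_server, args=(server_sock,))
        worker.start()

        reply = b""
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(("127.0.0.1", port))
            client.sendall(message)
            while len(reply) < len(message):    # recv may return partial data
                chunk = client.recv(1024)
                if not chunk:
                    break
                reply += chunk
        worker.join()                           # server ends once the client closes
    return reply


reply = echo_once(b"hello")
```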
Sticky packets
Note: only TCP exhibits the sticky-packet phenomenon; UDP never does. The reason is explained below.
First you need to understand how a socket sends and receives messages.
The sender may send data 1 KB at a time while the receiving application takes it 2 KB at a time. It might just as well take 3 KB or 6 KB at once, or only a few bytes. In other words, the application sees the data as one whole, a stream, and how many bytes made up each message is invisible to it. TCP is therefore a stream-oriented protocol, and this is what makes the sticky-packet problem possible. UDP, by contrast, is a message-oriented protocol: each UDP datagram is one message, and the application must extract data in units of messages. It cannot take an arbitrary number of bytes at a time, which is very different from TCP. How is a message defined? You can treat the data the other side passes to a single write/send call as one message. Note that when the other side sends a message, no matter how the lower layers fragment it, the TCP layer puts the segments that make up the message back in order before they appear in the kernel buffer.
For example, when a socket client uploads a file to a server over TCP, the file content is sent as a stream of bytes, piece by piece. The receiver, looking at that stream, has no idea where the file's bytes begin and where they end.
The so-called sticky-packet problem arises mainly because the receiver does not know where the boundaries between messages are, and so does not know how many bytes to extract at a time.
In addition, sticky packets caused by the sender come from the TCP protocol itself. To improve transmission efficiency, the sender often waits until it has collected enough data before sending a TCP segment. If several consecutive sends each carry very little data, TCP usually merges them into one TCP segment according to its optimization algorithm and sends them together, so the receiver gets the data stuck together.
- TCP (Transmission Control Protocol) is connection-oriented and stream-oriented and provides a highly reliable service. Each end (client and server) has a socket, and the two are paired one to one. To deliver many small packets to the receiver more efficiently, the sender uses an optimization (the Nagle algorithm) that merges several small pieces of data sent at short intervals into one larger block before sending it. The receiver then has difficulty telling the pieces apart, so a sound unpacking mechanism has to be provided. In other words, stream-oriented communication does not preserve message boundaries.
- UDP (User Datagram Protocol) is connectionless and message-oriented and provides an efficient service. It does not use the block-merging optimization. Because UDP supports one-to-many communication, the receiver's skbuff (socket buffer) uses a chained structure to record each arriving UDP packet, and every UDP packet carries a message header (source address, port, and so on), so the receiver can easily tell packets apart. In other words, message-oriented communication preserves message boundaries.
- TCP is stream-based, so messages sent and received must not be empty; both client and server need to handle empty messages to keep the program from hanging. UDP is datagram-based: even if you enter nothing (just press Enter), the message is not empty, because the UDP protocol still wraps it in a message header. The experiment is omitted here.
UDP recvfrom blocks. Each recvfrom(x) corresponds to exactly one sendto(y). Once x bytes have been received the call is complete; if y > x, the excess data is lost. This means UDP never produces sticky packets but can lose data, so it is unreliable.
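The one-sendto-per-recvfrom behaviour, including the empty datagram case, can be observed on the loopback interface. This is a sketch under the assumption that loopback delivers the datagrams intact and in order, which is the usual behaviour but not something UDP itself guarantees. The helper name `udp_boundaries` is made up.

```python
import socket


def udp_boundaries(messages):
    """Send each message as one datagram and receive them one recvfrom at a time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))     # port 0: let the OS pick
        receiver.settimeout(2.0)            # do not block forever if one is lost
        address = receiver.getsockname()

        for message in messages:
            sender.sendto(message, address)  # one datagram per call

        # Each recvfrom returns exactly one datagram, never two merged together
        return [receiver.recvfrom(1024)[0] for _ in messages]


received = udp_boundaries([b"hello", b"", b"world"])
```

Even though the three datagrams are sent back to back, each arrives as its own message, and the empty one is delivered as an empty message rather than being swallowed.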
TCP does not lose data. If one receive call does not take everything, the next call continues from where the last one stopped, and the sending side only clears its buffer once it has received the ACK. The data is reliable, but packets can stick together.
Sticky packets occur in two situations:
- The sender waits for its buffer to fill before sending, so packets stick together (the sends are very close together in time and each carries very little data, so they are merged).
- The receiver does not take packets out of its buffer promptly, so several packets are received at once (the client sends a piece of data, the server takes only a small part of it, and the next time the server receives, it gets the leftover data from the buffer).
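One common way to give the receiver message boundaries is to prefix every message with its length. The sketch below is one possible unpacking mechanism, not necessarily the one the author has in mind: it assumes a 4-byte big-endian length header and uses `socket.socketpair()` to get two connected stream sockets in one process. All helper names are invented for the example.

```python
import socket
import struct

HEADER = struct.Struct("!I")   # assumed format: 4-byte unsigned length, network byte order


def send_msg(sock, payload):
    """Send the length header followed by the payload."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, count):
    """Keep calling recv until exactly count bytes have arrived."""
    buffer = bytearray()
    while len(buffer) < count:
        chunk = sock.recv(count - len(buffer))
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        buffer.extend(chunk)
    return bytes(buffer)


def recv_msg(sock):
    """Read the header first, then exactly as many bytes as it announces."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def demo():
    left, right = socket.socketpair()   # two connected stream sockets
    with left, right:
        # Three sends in quick succession, which a plain recv could return merged
        send_msg(left, b"hello")
        send_msg(left, b"")
        send_msg(left, b"world")
        return [recv_msg(right) for _ in range(3)]


messages = demo()
```

Because the receiver always knows how many bytes belong to the next message, it does not matter whether the bytes arrive merged or split, and even an empty message has a well-defined boundary.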
When a packet gets split
When the data in the send buffer is longer than the MTU of the NIC, TCP splits it into several packets and sends them separately.
Supplementary question one: why TCP is reliable and UDP is unreliable
For TCP-based data transfer, see my other article: http://www.cnblogs.com/linhaifeng/articles/5937962.html. Put simply, when TCP transmits data, the sender first puts the data into its own buffer, and the protocol then sends the buffered data to the peer. If the peer returns ack=1, the sender clears the data from its buffer; if the peer returns ack=0, the sender sends the data again. That is why TCP is reliable.
When UDP sends data, the peer returns no acknowledgment, so UDP is unreliable.
Supplementary question two: send(byte stream), recv(1024), and sendall
The 1024 in recv(1024) means that at most 1024 bytes are taken out of the buffer per call.
The byte stream passed to send is first placed in the sender's buffer, and the protocol then sends the buffered content to the peer. If the byte stream is larger than the free space in the buffer, send may accept only part of it and returns the number of bytes it took; the rest has to be sent with further calls. sendall does this for you by calling send in a loop until everything has been sent.
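The relationship between send and sendall can be sketched by writing the loop by hand. This is an approximation of what sendall does, not its actual implementation; `send_all` and `transfer` are names made up for the example, and a reader thread drains the other end so that the sender does not block on a full buffer.

```python
import socket
import threading


def send_all(sock, data):
    """Call send repeatedly until every byte has been accepted; return the call count."""
    view = memoryview(data)
    calls = 0
    while len(view) > 0:
        sent = sock.send(view)   # may accept fewer bytes than offered
        view = view[sent:]       # keep only the part that is still unsent
        calls += 1
    return calls


def transfer(payload):
    """Push payload through a connected socket pair and return what arrived."""
    left, right = socket.socketpair()
    received = bytearray()

    def reader():
        while len(received) < len(payload):
            chunk = right.recv(65536)
            if not chunk:
                break
            received.extend(chunk)

    worker = threading.Thread(target=reader)
    worker.start()
    with left, right:
        calls = send_all(left, payload)
        worker.join()
    return bytes(received), calls


payload = b"x" * 1000000
data, calls = transfer(payload)
```

How many send calls are needed depends on the buffer sizes of the system, which is why the sketch returns the count instead of assuming a value.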
Implementation details of socket connect and accept
What is a socket? A socket is an abstraction layer between the application layer and the transport layer. It wraps the complex operations of the TCP/IP layers in a few simple interfaces for the application layer to call, so you do not have to deal with the details of the TCP protocol or with how send and recv are implemented. Put simply, sockets solve the problem of connecting processes, a server and a client, so that they can communicate.
Understanding connect: the socket layer conveniently hides many protocol details from the programmer, the three-way handshake among them. So at which stage does the TCP three-way handshake begin? The flowchart makes it clear that the client starts the three-way handshake when it calls connect(). Two queues also deserve attention: the SYN queue and the accept queue.
How accept works: after clients connect, which socket does the server use to transfer data with each of them, and which socket does a new client connect to? First, be clear that accept() takes a client's request off the accept queue once the three-way handshake has completed, and the server creates a new socketfd for each client. Data transfer with that client then goes through this new socketfd. The TCP/IP protocol stack maintains a receive buffer and a send buffer. When a packet arrives from a client, the server-side TCP/IP stack does the following: if the packet requests a connection, it is passed to the listening socket (socketfd), which handles connection requests; if the packet comes from a client whose connection is already established, its data is placed in the receive buffer.
Thus, when the server needs to read data from a particular client, it can use the socketfd_new socket and call recv or read to fetch that client's data from the buffer (socketfd_new stands for a socket object that records the client's IP and port, and is identified by them). How a packet finds its socket is also visible in the Linux kernel code:
    static inline struct sock *__inet_lookup(struct net *net,
                                             struct inet_hashinfo *hashinfo,
                                             const __be32 saddr, const __be16 sport,
                                             const __be32 daddr, const __be16 dport,
                                             const int dif)
    {
        u16 hnum = ntohs(dport);
        /* First try to find a socket with an established connection */
        struct sock *sk = __inet_lookup_established(net, hashinfo, saddr, sport,
                                                    daddr, hnum, dif);
        /* If none is found, look among the sockets in the listen state */
        return sk ? : __inet_lookup_listener(net, hashinfo, daddr, hnum, dif);
    }
To summarize the process: after the server calls listen, the kernel creates two queues, the SYN queue and the accept queue; the length of the accept queue is set by backlog. After the server calls accept, it blocks, waiting for an entry in the accept queue. When the client calls connect, it sends a SYN request asking to connect to the server; this is the first handshake. On receiving the SYN, the server places the requester in the SYN queue and replies with an acknowledgment (ACK) that also carries a SYN flag, a request to establish the connection in the other direction; this is the second handshake. When the client receives the SYN+ACK, connect returns and the client sends an ACK to the server confirming the connection; this is the third handshake.
When the server receives the ACK, it moves the requester out of the SYN queue and into the accept queue. The accept() call, which has been waiting for this, wakes up, takes the requester off the accept queue, creates a new sockfd, and returns. This is how the three functions listen, accept, and connect work together. As the process shows, two of the three handshakes take place inside the connect function.
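One consequence of this workflow is that connect can succeed before the server has called accept at all, because the kernel completes the handshake and parks the connection in the accept queue. The sketch below shows this on the loopback interface in a single thread. The function name and the 2-second timeouts are choices made for the example.

```python
import socket


def connect_before_accept():
    """Connect first, accept afterwards, and report what each side sees."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))
        server.listen(1)                 # backlog: length of the accept queue
        server.settimeout(2.0)
        server_addr = server.getsockname()

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.settimeout(2.0)
            client.connect(server_addr)  # returns although accept() has not run yet
            peer_seen_by_client = client.getpeername()

            # accept() only takes the finished connection off the accept queue
            conn, client_addr = server.accept()
            with conn:
                return {
                    "client_connected_to_server": peer_seen_by_client == server_addr,
                    "accept_saw_client": client_addr == client.getsockname(),
                    "new_descriptor": conn.fileno() != server.fileno(),
                    "same_local_port": conn.getsockname()[1] == server_addr[1],
                }


report = connect_before_accept()
```

The socket returned by accept has its own file descriptor but shares the listening socket's local port, which matches the description of a new sockfd being created for each client.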