Correspondence between the Socket API and TCP states: three-way handshake (listen, accept, connect), four-way close, and TCP delayed acknowledgment (call the setsockopt function once and set tcp_qui

Source: Internet
Author: User

In networking fundamentals we learn that the transport layer protocols include TCP and UDP.

In Linux network programming, we use the socket API to implement network communication.

So:

How do Socket API calls correspond to TCP states?

In other words, how do the socket system calls carry out the three-way handshake and the four-way close?

For SOCK_DGRAM (UDP), connect only registers the peer's IP address and port in the kernel. It does not establish a connection: connect sends no packets, and close sends no packets either.
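A minimal sketch of this behavior, using Python's socket module as a thin wrapper over the same system calls. The function name udp_connect_demo and the peer address 127.0.0.1:9 are assumptions made for illustration; nothing needs to be listening on that port.

```python
import socket

def udp_connect_demo(peer=("127.0.0.1", 9)):
    """Connect a UDP socket and return the peer address the kernel recorded."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No datagram is sent here: connect() only stores the peer address,
        # so it succeeds even if nothing is listening on that port.
        sock.connect(peer)
        return sock.getpeername()
    finally:
        sock.close()  # closing a UDP socket sends nothing either
```

Calling udp_connect_demo() simply returns the address that was passed in, which shows that the "connection" is only local bookkeeping.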

For SOCK_STREAM (TCP), the correspondence is as follows:

connect performs the TCP three-way handshake. After the client calls connect, the TCP implementation in the kernel completes the three-way handshake;

close carries out the four-way close.

The Berkeley Socket API corresponding to the three-way handshake:

You can see that there are three APIs related to connection establishment: connect, listen, and accept. Connect is used on the client, and the other two are used on the server.

Within the TCP/IP protocol stack, tcp_in and tcp_out of the TCP layer also take part in this process. Here we only discuss what these three application-level APIs do.

(1) connect

connect sends a SYN. Once the SYN+ACK from the server arrives, the connection is established from the client's point of view. The final ACK is sent by the protocol stack (by tcp_out).

(2) listen

On the server side, listen prepares an incomplete-connection queue, which holds the socket structures for which only the client's SYN (syn_c) has been received;

It also prepares a completed-connection queue, which holds the socket structures for which the final ACK has been received.

(3) accept

When an application process calls accept, it checks the completed-connection queue described above. If the queue contains a connection, that connection is returned;

If the queue is empty, a blocking call sleeps until a connection arrives;

In nonblocking mode, the call returns immediately, typically with errno set to EWOULDBLOCK to tell the caller that the connection queue is empty.
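The three calls can be tried out on the loopback interface. The following is a sketch under the assumption that Python's socket module is an acceptable stand-in for the C API; handshake_demo and the keys of the returned dictionary are hypothetical names. Python reports EWOULDBLOCK/EAGAIN as BlockingIOError.

```python
import socket

def handshake_demo():
    """Run listen/connect/accept on loopback and report what each step saw."""
    seen = {}
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        server.listen(8)               # prepare the connection queues
        server.setblocking(False)
        try:
            server.accept()            # nothing has connected yet
            seen["empty_queue"] = "connection returned"
        except BlockingIOError:        # errno EAGAIN / EWOULDBLOCK
            seen["empty_queue"] = "would block"

        # connect() returns once the kernel has done the handshake,
        # even though the server has not called accept() yet.
        client.connect(server.getsockname())

        server.setblocking(True)
        conn, peer = server.accept()   # take the connection from the queue
        seen["peer_is_client"] = (peer == client.getsockname())
        conn.close()
    finally:
        client.close()
        server.close()
    return seen
```

The point of the sketch is the ordering: connect succeeds before accept is ever called, because the handshake is completed by the kernel and accept only dequeues the result.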

Note:

In the exchange described above, when the client receives a response from the server, its acknowledgment may be delayed (TCP delayed ACK).

That is, after the client receives the data, the ACK it owes the server is held back for a while instead of being sent at once.

To acknowledge promptly, call the following after receiving the data:

setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, (int[]){1}, sizeof(int));
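A hedged sketch of the same call from Python. The helper name recv_with_quickack is an assumption; socket.TCP_QUICKACK is only defined on Linux, so the sketch skips the option elsewhere. According to the Linux tcp(7) manual page the flag is not permanent, which is why the sketch sets it again after every receive rather than once per connection.

```python
import socket

def recv_with_quickack(sock, bufsize=4096):
    """Receive once, then ask the kernel to acknowledge without delay."""
    data = sock.recv(bufsize)
    quickack = getattr(socket, "TCP_QUICKACK", None)  # Linux-only constant
    if quickack is not None:
        # Set again after each receive, because the kernel may switch
        # back to delayed ACKs on its own.
        sock.setsockopt(socket.IPPROTO_TCP, quickack, 1)
    return data
```

A caller would use recv_with_quickack(sock) wherever it would otherwise call sock.recv().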

How can we tell whether there is a connection request or a close request?

Connection request:

1. When connect completes the three-way handshake, a read event fires on the listening fd (the one accept is called on), indicating a new connection request;

Close request:

1. close carries out the four-way close. If one side closes its sockfd, the other side sees a read event.

If the read then returns 0, that is, 0 bytes were read, the peer has requested a disconnect. (This behavior is defined by the operating system.)
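Both kinds of read event can be observed with select on the loopback interface. This is a minimal sketch; readiness_demo and the event labels are hypothetical names, and select stands in for whatever event mechanism the server actually uses.

```python
import select
import socket

def readiness_demo(timeout=1.0):
    """Return the read events seen for a new connection and for a peer close."""
    events = []
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        client.connect(server.getsockname())

        # A completed handshake makes the listening socket readable.
        readable, _, _ = select.select([server], [], [], timeout)
        if server in readable:
            events.append("new connection")
        conn, _ = server.accept()

        client.close()  # the client starts the close sequence

        # The peer's close makes the connected socket readable,
        # and the read that follows returns 0 bytes.
        readable, _, _ = select.select([conn], [], [], timeout)
        if conn in readable and conn.recv(1024) == b"":
            events.append("peer closed")
        conn.close()
    finally:
        client.close()
        server.close()
    return events
```

Note that the two cases are told apart by which fd is readable and by what the read returns, not by the event type itself.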

TCP states and socket handling during connection close, and possible problems:

1. TIME_WAIT

TIME_WAIT is the state of the side that actively closes the TCP connection. The system waits 2MSL (MSL: maximum segment lifetime) in the TIME_WAIT state before releasing the connection (port), usually within about 4 minutes.

Purposes of the TIME_WAIT state:

1. Ensure that the connection is closed reliably, that is, guard against loss of the final ACK.

2. Avoid socket confusion (the same port being used by more than one socket).

Here we need to explain a concept: the incarnation. If, some time after a connection is closed, another connection is established between the same IP addresses and ports, the later connection is called an incarnation of the earlier one. TCP does not start a new incarnation for a connection that is in the TIME_WAIT state.

Why does this avoid socket confusion?

Suppose one side closes the connection and the other side's response is delayed (for example, by the network). TIME_WAIT then expires, the port becomes available again, and we create another socket connection on this port. What happens if the delayed response arrives now? This is in fact handled at the TCP layer. Because of TCP sequence numbers, the kernel's TCP layer discards the packet and sends a packet to the peer so that the peer can close its sockfd. The application layer is therefore unaffected; code that uses the socket API does not need to handle this case.

Note:

TIME_WAIT means that an operating-system timer waits for 2MSL; the side that actively closes sockfd does not block. (That is, the application is not blocked in close.)

When the active side closes sockfd, the peer may not be aware of the event. When the peer (the passive side) then writes data with send, an error occurs and errno is ECONNRESET.

Why a server accumulates a large number of TIME_WAIT sockets: (Generally we do not write servers this way, but web servers and other servers that handle many clients need to actively close the connection after completing a request. Otherwise they may run out of handles and become unable to provide service.)

The server performs a large number of active closes. Pay attention to when the program closes connections on its own (for example, when it clears long-idle sockets in batches).

Servers we write ourselves generally seldom disconnect actively, unless we manage idle timeouts. (A TCP short connection is one where the client sends a request to the server, receives the response, and then the connection is closed.)

2. CLOSE_WAIT

CLOSE_WAIT arises on the side that passively closes the TCP connection.

If, after receiving the peer's request to close the connection, the local side (the server) does not close the corresponding socket, the local socket stays in this state.

(The case where the peer closes but the close request never reaches us is the abnormal case described below.)

According to the TCP state machine, when we receive a FIN, the TCP implementation sends an ACK and enters the CLOSE_WAIT state. If we never call close(), the connection cannot move from CLOSE_WAIT to LAST_ACK, and many CLOSE_WAIT connections accumulate in the system.

A large number of CLOSE_WAIT sockets indicates that clients issue many concurrent requests and that the server fails to notice the clients exiting and does not close those sockets in time. (If this is not dealt with promptly, the process runs out of available socket descriptors.)
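One way to check for such a build-up on Linux is to count connections per state in /proc/net/tcp, where the fourth column is the state as a hexadecimal code (06 is TIME_WAIT, 08 is CLOSE_WAIT, 09 is LAST_ACK). The sketch below is an illustration only: count_tcp_states and the TCP_STATES table are hypothetical names, and only a few states are named explicitly.

```python
from collections import Counter

# State codes used by the Linux kernel in /proc/net/tcp (subset).
TCP_STATES = {
    "01": "ESTABLISHED",
    "06": "TIME_WAIT",
    "08": "CLOSE_WAIT",
    "09": "LAST_ACK",
    "0A": "LISTEN",
}

def count_tcp_states(lines):
    """Count connections per TCP state from lines in /proc/net/tcp format."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Data rows start with an index such as "0:"; the header row does not.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        counts[TCP_STATES.get(fields[3], "OTHER")] += 1
    return counts
```

On a Linux host the function can be fed the file directly, for example with open("/proc/net/tcp") as f: print(count_tcp_states(f)). A CLOSE_WAIT count that keeps growing points at sockets the program never closes.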

Normal case:

One side closes sockfd, and the other side gets a read event. When recv returns 0, the peer has closed. At this point we should call close on the corresponding sockfd.

Abnormal case:

One side closes sockfd and the other side does not know (for example, the network is down, so the peer never receives the packet that was sent). In this case, suppose the peer then calls send or recv on the corresponding sockfd:

recv returns 0, indicating that the connection is broken.

send fails, and errno is ECONNRESET.
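The normal case can be sketched as a read loop that closes the socket as soon as recv returns 0 bytes, so the connection does not linger in CLOSE_WAIT. The names drain_and_close and demo are assumptions for illustration; demo only builds a loopback connection to exercise the handler.

```python
import socket

def drain_and_close(conn, bufsize=4096):
    """Read until the peer closes, then close our side and return the data."""
    chunks = []
    try:
        while True:
            data = conn.recv(bufsize)
            if not data:      # recv returned 0 bytes: the peer has closed
                break
            chunks.append(data)
    finally:
        conn.close()          # our close lets the socket leave CLOSE_WAIT
    return b"".join(chunks)

def demo():
    """Exercise drain_and_close over a loopback TCP connection."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(server.getsockname())
    conn, _ = server.accept()
    client.sendall(b"hello")
    client.close()            # the peer closes first
    received = drain_and_close(conn)
    server.close()
    return received, conn.fileno()  # fileno() is -1 once conn is closed
```

Putting the close in a finally clause is a design choice: the descriptor is released even if recv raises, which is exactly the leak the CLOSE_WAIT discussion warns about.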


For persistent connection APIs, be careful with the following issues:

Sometimes we provide services to customers through APIs. If the API you provide uses TCP persistent connections together with a TCP receive timeout (APIs generally offer an interface for setting the timeout, for example by setting SO_RCVTIMEO with setsockopt, or by using select), then you need to be careful with the following situation (called "packet mix-up" here: the application does not correctly match a response packet with its request packet):
Suppose a receive on the connection times out (say the timeout is 3 seconds) and the API returns to the customer, who then sends a second request over the same connection. The second request may then receive the response to the first request (which arrived just after the 3 seconds). If the second request is answered with the wrong response, serious problems can follow. If the network is unstable or the backend is slow and timeouts are frequent, one mismatched response can cause the responses of many later requests to be mismatched as well. For example, in a typical online lottery, if the backend awards the first user an iPad and the second user only a virtual item, the second user will be told they have won the iPad.

The simplest solution to this problem is: once a request times out, disconnect and re-establish the connection. However, this solution is not rigorous in theory. Consider the following situation:
1. The response timed out because the response packet is wandering in the network (for example, because a router crashed; such wandering packets are also known as lost packets);
2. On detecting the timeout, the API disconnects and re-establishes a connection with the same IP addresses and ports as the original (the new connection is an incarnation of the closed one);
3. Right after the new connection is established, a new request is sent, and then the lost response packet finds its way back. The new connection may then take this packet as the response to the second request (this requires the packet's TCP sequence number to happen to be the one the new connection expects, which is possible in theory but practically never happens).
Note: Normally TCP stays in the TIME_WAIT state for 2MSL to avoid the problems an incarnation could cause. In practice, however, we can adjust system parameters, use the SO_LINGER option to close a connection directly and skip TIME_WAIT, or reuse ports, and in these cases an incarnation can appear. In real applications the situation above essentially does not happen, but it is theoretically possible.
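The "disconnect on timeout" approach described above might look like the following sketch. ReconnectingClient is a hypothetical class, not an API from the original article; it assumes one response fits in a single recv and ignores other error handling for brevity.

```python
import socket

class ReconnectingClient:
    """Persistent-connection client that drops the connection on a timeout."""

    def __init__(self, addr, timeout=3.0):
        self.addr = addr
        self.timeout = timeout
        self.sock = None

    def request(self, payload):
        """Send one request; return the response, or None if it timed out."""
        if self.sock is None:
            # (Re)connect lazily; the timeout also applies to later recv calls.
            self.sock = socket.create_connection(self.addr, timeout=self.timeout)
        try:
            self.sock.sendall(payload)
            return self.sock.recv(4096)
        except socket.timeout:
            # A late response could still arrive on this connection,
            # so discard the connection instead of reusing it.
            self.sock.close()
            self.sock = None
            return None
```

After a timeout the next call to request opens a fresh connection, so a late response to the earlier request has no socket left to be read from.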

On closer analysis, this problem looks on the surface like a "packet mix-up", but the underlying cause is that the program does not validate protocol packets at the application layer. For example, suppose two clients A and B each establish a connection with the server at the same time. If a bug on the server side makes it send B's response on A's connection, then without validation A's request will receive B's response. The solution is to use a sequence-number check at the application layer to guarantee a one-to-one match between requests and responses.
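A minimal sketch of such an application-layer check follows. The wire format (a 4-byte sequence number and a 2-byte length before each payload) and the names pack_message, unpack_messages and pick_response are assumptions made for illustration, not a format defined by the original article.

```python
import struct

# Assumed header: sequence number (uint32) and payload length (uint16),
# both in network byte order.
HEADER = struct.Struct("!IH")

def pack_message(seq, payload):
    """Prefix a payload with its sequence number and length."""
    return HEADER.pack(seq, len(payload)) + payload

def unpack_messages(buffer):
    """Split a byte buffer into (seq, payload) pairs plus leftover bytes."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        seq, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):   # payload not fully received yet
            break
        messages.append((seq, buffer[offset + HEADER.size:end]))
        offset = end
    return messages, buffer[offset:]

def pick_response(expected_seq, messages):
    """Return the payload whose sequence number matches, ignoring stale ones."""
    for seq, payload in messages:
        if seq == expected_seq:
            return payload
    return None
```

With this check, a late response carrying sequence number 1 that arrives while the client waits for number 2 is skipped instead of being handed to the second request.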

 

Reprinted from: http://blog.163.com/xychenbaihu@yeah/blog/static/13222965520118139252103/
