"TCP/IP Illustrated, Volume 2: The Implementation" Notes: tcp_input


When the protocol field of a received IP datagram indicates a TCP segment, ipintr calls tcp_input through the pr_input function in the protocol switch table. tcp_input executes at the software interrupt level.

The function is very long, so it is covered in two parts. This chapter outlines the processing framework of tcp_input and ends with the handling of RST segments; the next chapter begins with the handling of ACK segments.

The first few steps are typical: validate the input segment (checksum, length, and so on) and locate the PCB for the connection. Although a great deal of code follows, the "header prediction" algorithm will often skip the subsequent logic entirely. Header prediction rests on the assumption that segments are normally neither lost nor reordered, so for a given connection TCP can usually guess what the next received segment will be. When the prediction succeeds, the function returns immediately; this is the fastest execution path through tcp_input.


1. Preprocessing

This section describes the preprocessing of a received TCP segment. The processing is roughly as follows:

1. Get the IP and TCP headers in the first mbuf.

2. Verify the TCP checksum.

3. Verify the TCP data offset field.

4. Get the IP and TCP headers, together with any TCP options, into the first mbuf.

5. Process the timestamp option quickly.

6. Save the input flags and convert the header fields to host byte order.

7. Locate the Internet PCB.

8. If no PCB is found, drop the segment and send an RST in response.

9. If the TCP control block exists but the connection state is CLOSED, silently drop the segment: no response is sent. This state arises when a socket has been created and may have been given a local address and local port number, but neither connect nor listen has been called.

10. Unscale the advertised window (apply the send scale factor to the window field from the header).

11. If the socket debug option is enabled, save the connection state and the IP and TCP headers.

12. If the segment arrives for a listening socket, create a new socket.

13. Compute the window scale factor.

14. Reset the idle time and the keepalive timer.

15. If the connection is not in the LISTEN state, process the TCP options.
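
Step 3 can be illustrated with a small sketch. It assumes the usual TCP header layout, in which the data offset field counts 32-bit words and a header without options is 20 bytes long. The function name tcp_option_len and its parameters are hypothetical and are not taken from the kernel source.

```c
#include <stddef.h>

#define TCP_MIN_HDR_LEN 20 /* TCP header without options, in bytes */

/*
 * Validate the 4-bit data offset field of a TCP header.
 * off_words: data offset as carried in the header, in 32-bit words.
 * tcp_len:   length of the TCP header plus data, in bytes.
 * Returns the number of option bytes, or -1 if the segment must be dropped.
 */
int tcp_option_len(unsigned int off_words, size_t tcp_len)
{
    size_t off = (size_t)off_words * 4; /* header length in bytes */

    /* Reject a header shorter than 20 bytes or longer than the segment. */
    if (off < TCP_MIN_HDR_LEN || off > tcp_len)
        return -1;
    return (int)(off - TCP_MIN_HDR_LEN);
}
```

A positive return value is the number of option bytes that step 4 would then need to have contiguous in the first mbuf.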


2. Header prediction

Header prediction speeds up unidirectional data transfer by handling two common cases:

1. If TCP is sending data, the next segment expected on the connection is an ACK for the data already sent.

2. If TCP is receiving data, the next segment expected on the connection is the next in-sequence data segment.
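
The two cases can be sketched as a single classification function. This is a simplified illustration, not the kernel code: the structures conn and seg and all of their fields are hypothetical, and the timestamp and congestion window conditions that a real implementation would also check are left out.

```c
#include <stdint.h>

/* TCP header flag bits */
#define TH_FIN 0x01
#define TH_SYN 0x02
#define TH_RST 0x04
#define TH_ACK 0x10
#define TH_URG 0x20

/* Sequence number comparisons that survive 32-bit wraparound */
#define SEQ_GT(a, b)  ((int32_t)((a) - (b)) > 0)
#define SEQ_LEQ(a, b) ((int32_t)((a) - (b)) <= 0)

enum prediction { PRED_MISS, PRED_PURE_ACK, PRED_PURE_DATA };

struct conn {             /* hypothetical subset of a TCP control block */
    int established;      /* nonzero in the ESTABLISHED state */
    uint32_t rcv_nxt;     /* next sequence number expected */
    uint32_t snd_una;     /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;     /* next sequence number to send */
    uint32_t snd_max;     /* highest sequence number sent */
    uint32_t snd_wnd;     /* window last advertised by the peer */
    uint32_t rcv_space;   /* free space in the receive buffer */
    int reass_empty;      /* nonzero if no out-of-order data is queued */
};

struct seg {              /* hypothetical view of a received segment */
    uint8_t flags;
    uint32_t seq, ack, win, len;
};

enum prediction header_predict(const struct conn *c, const struct seg *s)
{
    /* Common preconditions: plain ACK flag, in order, window unchanged. */
    if (!c->established)
        return PRED_MISS;
    if ((s->flags & (TH_SYN | TH_FIN | TH_RST | TH_URG | TH_ACK)) != TH_ACK)
        return PRED_MISS;
    if (s->seq != c->rcv_nxt)
        return PRED_MISS;
    if (s->win == 0 || s->win != c->snd_wnd)
        return PRED_MISS;
    if (c->snd_nxt != c->snd_max)   /* a retransmission is in progress */
        return PRED_MISS;

    if (s->len == 0) {
        /* Case 1: sender side, a pure ACK for outstanding data. */
        if (SEQ_GT(s->ack, c->snd_una) && SEQ_LEQ(s->ack, c->snd_max))
            return PRED_PURE_ACK;
    } else if (s->ack == c->snd_una && c->reass_empty &&
               s->len <= c->rcv_space) {
        /* Case 2: receiver side, the next in-sequence data segment. */
        return PRED_PURE_DATA;
    }
    return PRED_MISS;
}
```

Any segment classified as PRED_MISS would fall through to the slow execution path.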


3. TCP input: slow execution path

The following describes the processing performed when header prediction fails, the slower execution path through tcp_input.

1. Drop the IP and TCP headers, including any TCP options.

2. Calculate the receive window.

Because later code in the function must determine how much of the received data fits in the advertised window, the size of the window must be calculated now. Received data that falls outside the advertised window is dropped: data to the left of the window has already been received and acknowledged, and data to the right of the window is data that the peer is not yet allowed to send.
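
The window calculation can be sketched as follows. The sketch assumes the BSD approach of taking the larger of the free space in the socket receive buffer and the window already advertised to the peer, so that the window is never shrunk. The function name receive_window and its parameters are hypothetical.

```c
#include <stdint.h>

/*
 * buf_space: free bytes in the socket receive buffer (may be negative).
 * rcv_adv:   highest sequence number advertised to the peer so far.
 * rcv_nxt:   next sequence number expected from the peer.
 */
uint32_t receive_window(long buf_space, uint32_t rcv_adv, uint32_t rcv_nxt)
{
    int32_t win = buf_space < 0 ? 0 : (int32_t)buf_space;
    int32_t advertised = (int32_t)(rcv_adv - rcv_nxt); /* still promised */

    /* Never offer less than what has already been advertised. */
    return (uint32_t)(win > advertised ? win : advertised);
}
```

For example, with 1000 bytes of buffer space but 2000 bytes still advertised, the window stays at 2000 bytes.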


4. Completing a passive open or an active open

The processing in this section is performed when the connection state is LISTEN or SYN_SENT. In both states the expected segment is a SYN, and any other segment is dropped.

4.1. Completing a passive open

When the connection state is LISTEN, the following processing is performed:

1. Drop the segment if it carries an RST or an ACK, or if it does not carry a SYN.

2. Drop the segment if it was sent to a broadcast or multicast address.

3. Get an mbuf to hold the client's IP address and port number.

4. Set the local address in the PCB.

5. Fill in the peer address in the PCB.

6. Allocate and initialize the IP and TCP header template.

7. Process any TCP options.

8. Initialize the ISS.

9. Initialize the sequence number variables in the control block.

10. ACK the SYN and change the state.
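
The first two checks for the LISTEN state can be sketched as below. The sketch assumes, following the BSD code as far as it is known here, that a segment carrying an ACK is answered with an RST while the other rejected segments are dropped silently. The enum listen_action and the function listen_check are hypothetical names.

```c
#include <stdint.h>

/* TCP header flag bits */
#define TH_SYN 0x02
#define TH_RST 0x04
#define TH_ACK 0x10

enum listen_action { LISTEN_ACCEPT, LISTEN_DROP, LISTEN_DROP_WITH_RESET };

/*
 * flags:       flag byte of the received TCP header.
 * bcast_mcast: nonzero if the segment was sent to a broadcast or
 *              multicast address.
 */
enum listen_action listen_check(uint8_t flags, int bcast_mcast)
{
    if (flags & TH_RST)
        return LISTEN_DROP;             /* ignore stray resets */
    if (flags & TH_ACK)
        return LISTEN_DROP_WITH_RESET;  /* nothing here to acknowledge */
    if ((flags & TH_SYN) == 0)
        return LISTEN_DROP;             /* only a SYN can open a connection */
    if (bcast_mcast)
        return LISTEN_DROP;             /* TCP is strictly unicast */
    return LISTEN_ACCEPT;
}
```
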

4.2. Completing an active open

When the connection state is SYN_SENT, the following processing is performed:

1. Verify the received ACK.

2. Process and drop an RST segment.

3. Check that the SYN flag is set in the received segment.

4. Process the ACK.

5. Turn off the connection-establishment timer.

6. Initialize the receive sequence numbers.

7. Mark the connection as established.

8. Check for the window scale option.

9. Pass any queued data to the process.

10. Update the RTT estimators.

11. Process a simultaneous open.

12. Drop any received data that falls outside the receive window.

13. Force an update of the window variables.
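
Steps 1 to 3 of the active open can be sketched as one function. This is an illustration under assumptions: an ACK is taken as acceptable only if it lies above the initial send sequence number and not above the highest sequence number sent, and an RST refuses the connection only when it also carries an acceptable ACK. The names syn_sent_action and syn_sent_check are hypothetical.

```c
#include <stdint.h>

/* TCP header flag bits */
#define TH_SYN 0x02
#define TH_RST 0x04
#define TH_ACK 0x10

/* Sequence number comparisons that survive 32-bit wraparound */
#define SEQ_GT(a, b)  ((int32_t)((a) - (b)) > 0)
#define SEQ_LEQ(a, b) ((int32_t)((a) - (b)) <= 0)

enum syn_sent_action {
    SS_CONTINUE,         /* go on with steps 4 to 13 */
    SS_DROP,             /* drop the segment silently */
    SS_DROP_WITH_RESET,  /* drop the segment and send an RST */
    SS_REFUSED           /* connection refused by the peer */
};

enum syn_sent_action syn_sent_check(uint8_t flags, uint32_t ack,
                                    uint32_t iss, uint32_t snd_max)
{
    /* Step 1: an ACK must acknowledge something that was actually sent. */
    if ((flags & TH_ACK) && (SEQ_LEQ(ack, iss) || SEQ_GT(ack, snd_max)))
        return SS_DROP_WITH_RESET;

    /* Step 2: an RST counts only if it carries an acceptable ACK. */
    if (flags & TH_RST)
        return (flags & TH_ACK) ? SS_REFUSED : SS_DROP;

    /* Step 3: anything without a SYN is dropped. */
    if ((flags & TH_SYN) == 0)
        return SS_DROP;

    return SS_CONTINUE;
}
```

A SYN without an ACK also returns SS_CONTINUE; that is the simultaneous open handled in step 11.
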


5. PAWS: Protection Against Wrapped Sequence Numbers

1. The basic PAWS test.

The PAWS algorithm rests on the assumption that, on a high-speed connection, the 32-bit timestamp value wraps around far less often than the 32-bit sequence number. Even at the highest timestamp clock rate (one increment per millisecond), the sign bit of the timestamp takes about 24 days to wrap. On a gigabit network, by contrast, the sequence number can wrap in 17 seconds. Therefore, if the timestamp in a received segment is less than the most recent timestamp saved for the connection, the segment is a duplicate and should be dropped (subject to the outdated-timestamp check that follows). tcp_input might later drop such a segment anyway because its sequence number is old, but PAWS is designed to cope efficiently with high-speed networks where sequence numbers wrap quickly.

Note that the PAWS algorithm is symmetric: it drops duplicate ACKs as well as duplicate data segments. Every received segment is subject to PAWS.

2. Check for an outdated timestamp.

3. Drop the duplicate segment.
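
The PAWS test and the outdated-timestamp check can be sketched as below. The 24-day figure follows from the arithmetic: 2^31 ms is about 2,147,484 seconds, or roughly 24.8 days. The sketch assumes a connection clock that ticks twice per second (PR_SLOWHZ), as in BSD; the structure paws_state and the function paws_reject are hypothetical names.

```c
#include <stdint.h>

/* Timestamp comparison that survives 32-bit wraparound */
#define TSTMP_LT(a, b) ((int32_t)((a) - (b)) < 0)

#define PR_SLOWHZ     2                                /* clock ticks per second */
#define TCP_PAWS_IDLE (24 * 24 * 60 * 60 * PR_SLOWHZ)  /* 24 days, in ticks */

struct paws_state {
    uint32_t ts_recent;      /* most recent timestamp saved from the peer */
    uint32_t ts_recent_age;  /* clock value when ts_recent was saved */
};

/*
 * Returns 1 if the segment should be dropped as an old duplicate,
 * 0 if it passes the PAWS test.
 */
int paws_reject(struct paws_state *st, int ts_present, int is_rst,
                uint32_t ts_val, uint32_t now)
{
    /* The test applies only to non-RST segments that carry a timestamp
       on a connection that has already saved one. */
    if (!ts_present || is_rst || st->ts_recent == 0)
        return 0;
    if (!TSTMP_LT(ts_val, st->ts_recent))
        return 0;  /* timestamp did not go backwards */

    /* Outdated-timestamp check: after 24 idle days the saved value can no
       longer be trusted, so it is invalidated and the segment accepted. */
    if ((int32_t)(now - st->ts_recent_age) > TCP_PAWS_IDLE) {
        st->ts_recent = 0;
        return 0;
    }
    return 1;  /* duplicate segment */
}
```
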


6. Trimming the segment so that the data is within the window

This section describes how the received segment is adjusted so that it carries only data that fits in the receive window: duplicate data at the front of the segment is dropped, and data beyond the end of the receive window is dropped from the end of the segment. What remains is new data that fits in the receive window. The steps are:

1. Check whether there is duplicate data at the front of the segment.

2. Remove a duplicate SYN.

3. Check whether the data in the segment is entirely duplicate.

4. Check for a duplicate FIN.

5. Generate a duplicate ACK.

6. Handle a simultaneous open or a self-connect.

7. Update the statistics for a partially duplicate segment.

8. Remove the duplicate data and update the urgent pointer.

9. Calculate the number of bytes that fall beyond the right edge of the advertised window.

10. If the connection is in the TIME_WAIT state, check for a new connection request.

11. Check whether the segment is a window probe.

12. Drop any other segment that lies completely outside the window.

13. Process a segment that carries some valid data.
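
The core of the trimming can be sketched with sequence number arithmetic alone. This is a simplified illustration: it ignores the SYN and FIN flags, the urgent pointer, window probes, and the TIME_WAIT case from the list above. The structure trimmed and the function trim_to_window are hypothetical names.

```c
#include <stdint.h>

struct trimmed {
    uint32_t seq;   /* sequence number of the first byte kept */
    uint32_t len;   /* number of data bytes kept */
    int duplicate;  /* nonzero if the whole segment was duplicate data */
};

struct trimmed trim_to_window(uint32_t seq, uint32_t len,
                              uint32_t rcv_nxt, uint32_t rcv_wnd)
{
    struct trimmed t = { seq, len, 0 };
    int32_t todrop;

    /* Front: bytes below rcv_nxt have already been received. */
    todrop = (int32_t)(rcv_nxt - seq);
    if (todrop > 0) {
        if ((uint32_t)todrop >= t.len) {
            todrop = (int32_t)t.len;  /* entire segment is duplicate */
            t.duplicate = 1;
        }
        t.seq += (uint32_t)todrop;
        t.len -= (uint32_t)todrop;
    }

    /* Back: bytes at or beyond rcv_nxt + rcv_wnd do not fit the window. */
    todrop = (int32_t)((t.seq + t.len) - (rcv_nxt + rcv_wnd));
    if (todrop > 0)
        t.len -= ((uint32_t)todrop >= t.len) ? t.len : (uint32_t)todrop;

    return t;
}
```

Because the subtraction is done modulo 2^32 and then interpreted as signed, the same code works when the sequence numbers wrap around zero.
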


7. Self-connects and simultaneous opens

A process can create a socket and connect it to itself with the following system calls: socket, bind to a local port, and then connect to the same local address and the same port number. If connect succeeds, the socket is connected to itself: everything written to the socket can be read back from the same socket. This resembles a full-duplex pipe, but with a single descriptor instead of two. Although very few processes do this, it is in fact a special case of a simultaneous open, and the two follow the same state transition diagram.


8. Recording the timestamp

tcp_input then processes a received timestamp option: if the received segment carries a timestamp, its value is saved in the control block as the most recent timestamp for the connection, the value that the PAWS test compares against.
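
A possible shape for this step is sketched below. It assumes the rule from RFC 1323, under which the timestamp is recorded only when the segment covers the sequence number of the last ACK sent; the notes above do not state the condition, so treat it as an assumption. The structure ts_state and the function record_timestamp are hypothetical names.

```c
#include <stdint.h>

/* Sequence number comparisons that survive 32-bit wraparound */
#define SEQ_LT(a, b)  ((int32_t)((a) - (b)) < 0)
#define SEQ_LEQ(a, b) ((int32_t)((a) - (b)) <= 0)

struct ts_state {
    uint32_t ts_recent;      /* most recent timestamp saved from the peer */
    uint32_t ts_recent_age;  /* clock value when ts_recent was saved */
    uint32_t last_ack_sent;  /* acknowledgment number of the last ACK sent */
};

/*
 * Save the timestamp of a received segment if the segment spans
 * last_ack_sent. Returns 1 if the saved timestamp was updated.
 */
int record_timestamp(struct ts_state *st, int ts_present, uint32_t ts_val,
                     uint32_t seq, uint32_t len, int syn_or_fin,
                     uint32_t now)
{
    /* SYN and FIN each occupy one sequence number. */
    uint32_t end = seq + len + (syn_or_fin ? 1u : 0u);

    if (ts_present && SEQ_LEQ(seq, st->last_ack_sent) &&
        SEQ_LT(st->last_ack_sent, end)) {
        st->ts_recent_age = now;
        st->ts_recent = ts_val;
        return 1;
    }
    return 0;
}
```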


9. RST processing

The RST flag is handled by a switch statement on the current connection state:

1. In the SYN_RCVD state, the socket error is set to ECONNREFUSED and the socket is closed.

2. If the RST is received in the ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, or CLOSE_WAIT state, the error ECONNRESET is returned.

3. In the CLOSING, LAST_ACK, or TIME_WAIT state, no error needs to be returned because the process has already closed the socket.

Two further checks follow the RST processing:

4. If the SYN flag is still set, the segment is in error: the connection is dropped and the error ECONNRESET is returned.

5. If the ACK flag is not set, the segment is dropped.
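
The state-dependent part of the RST handling (items 1 to 3) can be sketched as a switch statement. The state names follow the BSD TCPS_ convention, but the enum and the function rst_error are defined here for illustration only and are not the kernel's own definitions.

```c
#include <errno.h>

enum tcp_state {
    TCPS_CLOSED, TCPS_LISTEN, TCPS_SYN_SENT, TCPS_SYN_RECEIVED,
    TCPS_ESTABLISHED, TCPS_CLOSE_WAIT, TCPS_FIN_WAIT_1, TCPS_CLOSING,
    TCPS_LAST_ACK, TCPS_FIN_WAIT_2, TCPS_TIME_WAIT
};

/*
 * Returns the socket error to report when an RST arrives in the given
 * state, or 0 if no error is reported to the process.
 */
int rst_error(enum tcp_state state)
{
    switch (state) {
    case TCPS_SYN_RECEIVED:
        return ECONNREFUSED;  /* connection attempt was refused */
    case TCPS_ESTABLISHED:
    case TCPS_FIN_WAIT_1:
    case TCPS_FIN_WAIT_2:
    case TCPS_CLOSE_WAIT:
        return ECONNRESET;    /* connection was reset by the peer */
    case TCPS_CLOSING:
    case TCPS_LAST_ACK:
    case TCPS_TIME_WAIT:
        return 0;             /* process has already closed the socket */
    default:
        return 0;             /* other states are handled elsewhere */
    }
}
```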

