(Transport Layer) TCP protocol

Source: Internet
Author: User
Contents

Header Format
Data Unit
Features
Note
Automatic repeat request (ARQ)
Implementation
Send buffer
Receive buffer
Sliding window
Lost and delayed acknowledgments
Choosing the retransmission timeout
When segments are sent
Transport connection
Send TCP request Client
Congestion-control concepts
Congestion avoidance
Finite State Machine of TCP

Header Format

Illustration:

Description of fields:

  • Source port and destination port: 2 bytes each. A port is the service interface between the transport layer and the application layer. The transport layer's multiplexing and demultiplexing functions are implemented through ports.
  • Sequence number: 4 bytes. Every byte in the data stream carried over a TCP connection is numbered. The value of the sequence number field is the sequence number of the first byte of data carried in this segment.
  • Acknowledgment number: 4 bytes. It is the sequence number of the first data byte expected in the next segment from the other party.
  • Data offset (header length): 4 bits. It indicates how far the start of the data in the TCP segment is from the start of the TCP segment. The unit of the data offset is a 32-bit word (4 bytes).
  • Reserved: 6 bits, reserved for future use; currently set to 0.
  • URG (urgent): When URG = 1, the urgent pointer field is valid. It tells the system that this segment contains urgent data that should be transmitted as soon as possible (equivalent to high-priority data).
  • ACK (acknowledgment): The acknowledgment number field is valid only when ACK = 1. When ACK = 0, the acknowledgment number is invalid.
  • PSH (push): When the receiving TCP gets a segment with PSH = 1, it delivers the data to the receiving application process as soon as possible instead of waiting until the whole buffer is full.
  • RST (reset): When RST = 1, a serious error has occurred in the TCP connection (for example, a host crash). The connection must be released and the transport connection re-established.
  • SYN (synchronize): SYN = 1 indicates a connection request segment or a connection accept segment.
  • FIN (finish): Used to release a connection. FIN = 1 indicates that the sender of this segment has finished sending data and requests that the transport connection be released.
  • Checksum: 2 bytes. The checksum covers both the header and the data. When computing the checksum, a 12-byte pseudo-header is prepended to the TCP segment.
  • Urgent pointer: 16 bits. It gives the number of bytes of urgent data in this segment (the urgent data is placed at the front of the segment's data).
  • Options: Variable length. TCP originally specified only one option, the maximum segment size (MSS). The MSS tells the peer TCP: "The maximum length of the data field of a segment that my buffer can receive is MSS bytes." [The MSS is the maximum length of the data field in a TCP segment. The data field plus the TCP header makes up the whole TCP segment.]
  • Padding: Makes the length of the whole header an integer multiple of 4 bytes.
  • Other options:
    • Window scale: 3 bytes, one of which holds the shift count S. The number of bits in the window value grows from the 16 bits of the TCP header field to (16 + S). This is equivalent to shifting the window value left by S bits to obtain the actual window size.
    • Timestamp: 10 bytes. The most important fields are the timestamp value field (4 bytes) and the timestamp echo reply field (4 bytes).
    • Selective acknowledgment (SACK): Suppose the receiver gets byte blocks that are not contiguous with the byte stream received so far. If the sequence numbers of these bytes fall within the receive window, the receiver accepts the data first, but it must report exactly what it has received to the sender so that the sender does not resend data that has already arrived.
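
The field layout described above can be decoded with a few lines of Python. This is only a minimal sketch: the names `TcpHeader`, `FLAG_BITS` and `parse_tcp_header` are made up for illustration, options are ignored, and the 16-bit window field that sits between the flags and the checksum is included because the fixed header contains it even though the field list above does not describe it.

```python
import struct
from typing import NamedTuple


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len: int   # header length in bytes (data offset * 4)
    flags: dict       # flag name -> True/False
    window: int
    checksum: int
    urgent_ptr: int


# Bit masks of the six control flags inside the 16-bit offset/flags word.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    src, dst, seq, ack, off_flags, window, checksum, urg = struct.unpack(
        "!HHIIHHHH", segment[:20])
    data_offset = off_flags >> 12   # upper 4 bits, counted in 32-bit words
    flags = {name: bool(off_flags & bit) for name, bit in FLAG_BITS.items()}
    return TcpHeader(src, dst, seq, ack, data_offset * 4,
                     flags, window, checksum, urg)


# Hand-built SYN segment: port 12345 -> 80, seq = 1000, data offset = 5 words.
raw = struct.pack("!HHIIHHHH", 12345, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
hdr = parse_tcp_header(raw)
print(hdr.src_port, hdr.dst_port, hdr.seq, hdr.header_len, hdr.flags["SYN"])
# prints: 12345 80 1000 20 True
```

With the window scale option, the effective window would be `hdr.window << S` for a negotiated shift count S.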

 

Data Unit

The protocol data unit that TCP transmits is the TCP segment.

 

Features

TCP is a connection-oriented transport layer protocol.
Each TCP connection can have only two endpoints, and each TCP connection can only be point-to-point (one-to-one)
TCP provides reliable delivery services
TCP provides full-duplex communication
Byte stream oriented

 

Note:

TCP does not care how long a message the application process hands to the TCP buffer at one time.
TCP decides how many bytes a segment should contain based on the window value given by the peer and the current degree of network congestion (the length of a UDP message, by contrast, is given by the application process).
TCP can split a data block that is too long into shorter blocks before sending it. TCP can also wait until enough bytes have accumulated before forming a segment and sending it.
Each TCP connection has two endpoints.
The endpoint of a TCP connection is not the host, not the host's IP address, not the application process, and not the transport-layer protocol port. The endpoint of a TCP connection is called a socket.

 

Automatic repeat request (ARQ)

Definition:

Reliable transmission protocols based on acknowledgment and retransmission are often called automatic repeat request (ARQ) protocols.

Cumulative acknowledgment:

  • Definition: The receiver generally uses cumulative acknowledgment. It does not have to acknowledge each received packet one by one; instead it acknowledges the last packet that arrived in order, which means that every packet up to and including that one has been received correctly.
  • Advantage: It is easy to implement, and a lost acknowledgment does not necessarily force a retransmission.
  • Disadvantage: It cannot tell the sender about all the packets the receiver has correctly received.

Go-back-N:

If the sender sends five packets and the third one is lost, the receiver can only acknowledge the first two. The sender has no way of knowing what happened to the last three packets, so it has to resend all three.
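
The five-packet example can be reproduced with a small sketch. The helper names `cumulative_ack` and `go_back_n_resend` are hypothetical, and packets are simply numbered from 1 for illustration.

```python
def cumulative_ack(received: set[int]) -> int:
    """Highest n such that packets 1..n have all arrived (0 if packet 1 is missing)."""
    n = 0
    while n + 1 in received:
        n += 1
    return n


def go_back_n_resend(sent: list[int], received: set[int]) -> list[int]:
    """Packets the sender has to resend: everything after the cumulative ACK."""
    acked = cumulative_ack(received)
    return [p for p in sent if p > acked]


sent = [1, 2, 3, 4, 5]
received = {1, 2, 4, 5}                     # packet 3 was lost
print(cumulative_ack(received))             # 2
print(go_back_n_resend(sent, received))     # [3, 4, 5]
```

Packets 4 and 5 arrived, but the cumulative acknowledgment cannot say so, which is why they are sent again.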

 

Implementation

Note:

  • Each end of a TCP connection must have two windows: a send window and a receive window.
  • TCP's reliable transmission mechanism is controlled by byte sequence numbers. All TCP acknowledgments are based on sequence numbers, not on segments.
  • The four windows at the two ends of a TCP connection change dynamically.
  • The round-trip time (RTT) of a TCP connection is not fixed. A specific algorithm is needed to estimate a reasonable retransmission time.

Illustration:

 

Send buffer

The send buffer temporarily stores:

  • Data that the sending application has passed to the sender's TCP and that is waiting to be sent
  • Data that TCP has sent but that has not yet been acknowledged

Illustration:

 

 

Receive buffer

The receive buffer temporarily stores:

  • Data that has arrived in order but has not yet been read by the receiving application
  • Data that has arrived out of order

Illustration:

 

Sliding Window

Illustration:

Features:

  • The sliding window is measured in bytes.
  • A's send window is not always the same size as B's receive window (because of a certain time lag).

Requirements:

  • The TCP standard does not specify how data that arrives out of order should be handled. It is usually stored temporarily in the receive window and delivered to the upper-layer application process in order once the missing bytes of the byte stream have arrived.
  • TCP requires the receiver to support cumulative acknowledgment, which reduces transmission overhead.
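
A toy receiver along the lines of the two requirements above might look like the following. The `ReceiveBuffer` class is a hypothetical illustration, not how any real TCP stack is written. It assumes segments never partially overlap and simply ignores any segment that does not fit entirely inside the receive window.

```python
class ReceiveBuffer:
    """Toy receiver: holds out-of-order bytes and acknowledges cumulatively."""

    def __init__(self, initial_seq: int, window: int):
        self.next_expected = initial_seq   # acknowledgment number to send
        self.window = window               # receive window size in bytes
        self.pending = {}                  # seq -> payload held out of order
        self.delivered = bytearray()       # bytes handed to the application

    def receive(self, seq: int, payload: bytes) -> int:
        """Accept one segment and return the cumulative acknowledgment number."""
        end = seq + len(payload)
        in_window = (self.next_expected <= seq
                     and end <= self.next_expected + self.window)
        if in_window:
            self.pending[seq] = payload
        # Deliver every held segment that is now contiguous with the stream.
        while self.next_expected in self.pending:
            data = self.pending.pop(self.next_expected)
            self.delivered += data
            self.next_expected += len(data)
        return self.next_expected


rb = ReceiveBuffer(initial_seq=1, window=20)
print(rb.receive(1, b"abc"))   # 4
print(rb.receive(7, b"ghi"))   # 4  (bytes 7..9 are held, bytes 4..6 are missing)
print(rb.receive(4, b"def"))   # 10 (the gap is filled, everything is delivered)
print(bytes(rb.delivered))     # b'abcdefghi'
```

The second call shows the cumulative acknowledgment staying at 4 even though later bytes have arrived.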

Specific implementation:

 

Lost and delayed acknowledgments

 

Choosing the retransmission timeout

Specific implementation:

Each time TCP sends a segment, it sets a timer for that segment. If the retransmission time set by the timer expires and no acknowledgment has been received, the segment must be retransmitted.

Weighted average round-trip time:

Practice:

TCP keeps a weighted average round-trip time RTTs (also called the smoothed round-trip time). When the first RTT sample is measured, RTTs takes the value of that sample. After each new RTT sample is measured, RTTs is recalculated as follows:

Formula:

New RTTs = (1 - α) × (old RTTs) + α × (new RTT sample)

Note:

In the formula, 0 ≤ α < 1. If α is close to 0, the new RTTs differs little from the old value, so RTTs is updated slowly. If α is close to 1, the new RTTs is dominated by the new sample, so RTTs is updated quickly.
RFC 2988 recommends α = 1/8, that is, 0.125.

Retransmission timeout (RTO):

The RTO should be slightly greater than the weighted average round-trip time RTTs obtained above.
RFC 2988 recommends calculating the RTO with the following formula:

RTO = RTTs + 4 × RTTD

RTTD is the weighted average of the RTT deviation.
RFC 2988 recommends calculating RTTD as follows. At the first measurement, RTTD is half of the RTT sample value. For later measurements, the weighted average RTTD is calculated with the following formula:

New RTTD = (1 - β) × (old RTTD) + β × |RTTs - new RTT sample|

β is a coefficient smaller than 1; its recommended value is 1/4, that is, 0.25.
When calculating the average round-trip time, the round-trip time sample of any segment that was retransmitted is not used.

Modified Karn algorithm:

Each time a segment is retransmitted, the RTO is increased:

New RTO = γ × (old RTO)

The typical value of the coefficient γ is 2.
When segments are no longer being retransmitted, the average round-trip time RTTs and the retransmission timeout RTO are again updated from the measured round-trip times of the segments.
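
The formulas in this section can be put together in a short sketch. The class name `RtoEstimator` and its method names are hypothetical. One assumption is worth flagging: the deviation RTTD is updated before RTTs, so the |RTTs - sample| term uses the old RTTs, which is how the RFC orders the two updates. Times are plain numbers, read here as milliseconds.

```python
class RtoEstimator:
    ALPHA = 1 / 8   # weight of a new sample in RTTs
    BETA = 1 / 4    # weight of a new deviation in RTTD
    GAMMA = 2       # backoff factor applied on every retransmission

    def __init__(self):
        self.rtts = None   # smoothed round-trip time
        self.rttd = None   # weighted average deviation
        self.rto = None    # retransmission timeout

    def on_sample(self, rtt: float, retransmitted: bool = False) -> None:
        if retransmitted:
            return  # Karn: samples from retransmitted segments are not used
        if self.rtts is None:
            # First measurement: RTTs = sample, RTTD = sample / 2.
            self.rtts = rtt
            self.rttd = rtt / 2
        else:
            # Assumption: RTTD is computed from the old RTTs, then RTTs is updated.
            self.rttd = (1 - self.BETA) * self.rttd + self.BETA * abs(self.rtts - rtt)
            self.rtts = (1 - self.ALPHA) * self.rtts + self.ALPHA * rtt
        self.rto = self.rtts + 4 * self.rttd

    def on_retransmit(self) -> None:
        if self.rto is not None:
            self.rto *= self.GAMMA   # New RTO = gamma x (old RTO)


est = RtoEstimator()
est.on_sample(100)
print(est.rtts, est.rttd, est.rto)   # 100 50.0 300.0
est.on_sample(200)
print(est.rtts, est.rttd, est.rto)   # 112.5 62.5 362.5
est.on_retransmit()
print(est.rto)                       # 725.0
```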

Persistence timer

  • TCP keeps a persistence timer for each connection.
  • As soon as one side of a TCP connection receives a zero-window notification from the other side, it starts the persistence timer.
  • When the persistence timer expires, it sends a zero-window probe segment (carrying only 1 byte of data). The other side reports its current window value when it acknowledges the probe.
  • If the window is still zero, the side that receives this acknowledgment resets the persistence timer.
  • If the window is not zero, the deadlock is broken.

 

When segments are sent

TCP maintains a variable equal to the maximum segment size (MSS). As soon as the buffered data reaches MSS bytes, it is assembled into a TCP segment and sent.
The sending application process explicitly requests that a segment be sent, which is the push operation supported by TCP.
The sender keeps a timer; when it expires, whatever data is currently buffered is loaded into a segment (no longer than MSS) and sent.
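
The three triggers above (buffer reaches MSS, push, timer expiry) can be sketched as a small buffer class. `SegmentAssembler` and its methods are hypothetical names, and the timer itself is left out: the caller is assumed to invoke `on_timer()` when its own timer fires.

```python
class SegmentAssembler:
    """Toy send-side buffer that decides when a segment leaves."""

    def __init__(self, mss: int):
        self.mss = mss
        self.buffer = bytearray()

    def _take(self) -> bytes:
        # Cut at most MSS bytes off the front of the buffer.
        segment = bytes(self.buffer[:self.mss])
        del self.buffer[:self.mss]
        return segment

    def write(self, data: bytes, push: bool = False) -> list[bytes]:
        """Buffer application data and return the segments that are sent now."""
        self.buffer += data
        out = []
        while len(self.buffer) >= self.mss:   # trigger 1: MSS bytes are ready
            out.append(self._take())
        if push and self.buffer:              # trigger 2: the application pushes
            out.append(self._take())
        return out

    def on_timer(self) -> list[bytes]:
        """Trigger 3: the timer expired, send whatever is buffered (up to MSS)."""
        return [self._take()] if self.buffer else []


a = SegmentAssembler(mss=4)
print(a.write(b"abcdef"))          # [b'abcd']
print(a.write(b"g", push=True))    # [b'efg']
print(a.write(b"hi"))              # []
print(a.on_timer())                # [b'hi']
```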

 

Transport connection

Three phases:

  • Connection establishment:

    • Illustration:

    • Steps:

      • A's TCP sends a connection request segment to B. The SYN bit in the header is 1, and the sequence number seq = x indicates that the first data byte sent during data transmission has sequence number x.
      • When B's TCP receives the connection request segment, it sends back an acknowledgment if it agrees. In the acknowledgment segment B sets SYN = 1 and ACK = 1, with acknowledgment number ack = x + 1 and its own chosen sequence number seq = y.
      • After receiving this segment, A sends an acknowledgment to B with ACK = 1 and acknowledgment number ack = y + 1. A's TCP notifies its upper-layer application process that the connection is established; when B's TCP receives A's acknowledgment, it also notifies its upper-layer application process that the TCP connection is established.
  • Data Transmission
  • Connection release:
    • Illustration:

    • Steps:

      • After data transmission is finished, either party may release the connection. Here A's application process first sends a connection release segment to its TCP, stops sending data, and actively closes the TCP connection. A sets FIN = 1 in the header of the release segment, with sequence number seq = u, and waits for B's acknowledgment.
      • B sends an acknowledgment with acknowledgment number ack = u + 1 and its own sequence number seq = v. The TCP server process notifies the higher-layer application process. The connection from A to B is now released and the TCP connection is half-closed: if B sends data, A must still receive it.
      • If B has no more data to send to A, its application process tells TCP to release the connection (B sends a connection release segment; the steps here call its sequence number w).
      • After receiving the connection release segment, A must send an acknowledgment. In the acknowledgment segment, ACK = 1, acknowledgment number ack = w + 1, and A's own sequence number is seq = u + 1.
    • Note:

The TCP connection is released only after a further 2MSL has passed. The 2MSL wait serves two purposes. First, it ensures that the last ACK segment sent by A can reach B. Second, it prevents "stale connection request segments" from appearing in a later connection: once A has sent the last ACK segment and waited 2MSL, every segment generated during this connection has disappeared from the network, so no old connection request segment can show up in the next new connection.

    • Handling of lost confirmation:

Connection establishment must solve three problems:

  • Let each party know that the other party exists
  • Allow both parties to negotiate parameters (such as the maximum segment size, maximum window size, and quality of service)
  • Allocate transport entity resources (such as buffer size and entries in the connection table)
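
The sequence and acknowledgment numbers used in the establishment and release steps can be traced with a short sketch. The function names are hypothetical. Two values are assumptions rather than statements from the text: A's third handshake segment carries seq = x + 1, and B's release segment carries FIN = 1 with seq = w (implied by the final ack = w + 1).

```python
def three_way_handshake(x: int, y: int) -> list[dict]:
    """Segments exchanged when A opens a connection to B."""
    return [
        {"from": "A", "SYN": 1, "ACK": 0, "seq": x},
        {"from": "B", "SYN": 1, "ACK": 1, "seq": y, "ack": x + 1},
        {"from": "A", "SYN": 0, "ACK": 1, "seq": x + 1, "ack": y + 1},
    ]


def connection_release(u: int, v: int, w: int) -> list[dict]:
    """Segments exchanged when A closes first and B closes afterwards."""
    return [
        {"from": "A", "FIN": 1, "seq": u},
        {"from": "B", "ACK": 1, "seq": v, "ack": u + 1},
        {"from": "B", "FIN": 1, "ACK": 1, "seq": w, "ack": u + 1},
        {"from": "A", "ACK": 1, "seq": u + 1, "ack": w + 1},
    ]


for segment in three_way_handshake(x=100, y=300):
    print(segment)
for segment in connection_release(u=500, v=700, w=720):
    print(segment)
```

A SYN or FIN consumes one sequence number, which is why each acknowledgment number is the peer's seq plus 1.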

 

Send TCP request Client

 

Congestion-control concepts

Congestion window:

Meaning:

The size of the congestion window depends on how congested the network is, and it changes dynamically. The sender sets its send window equal to the congestion window. If the receiver's capacity is also taken into account, the send window may be smaller than the congestion window.

Principles for the sender to control the congestion window:

As long as the network is not congested, the congestion window is enlarged so that more packets can be sent. As soon as the network becomes congested, the congestion window is reduced to cut the number of packets injected into the network.

Multiplicative decrease:

Whether in the slow start phase or the congestion avoidance phase, whenever a timeout occurs (that is, network congestion is detected), the slow start threshold ssthresh is set to the current congestion window value multiplied by 0.5.

Additive increase:

Once the congestion avoidance algorithm is running, each time acknowledgments for all outstanding segments have been received (that is, after one round-trip time), the congestion window cwnd is increased by one MSS. The congestion window thus grows slowly, which keeps the network from becoming congested too early.

Fast retransmit:

The fast retransmit algorithm requires the receiver to send a duplicate acknowledgment immediately every time it receives an out-of-order segment, so that the sender learns as early as possible that a segment has not reached the receiver. As soon as the sender receives three duplicate acknowledgments in a row, it should immediately retransmit the segment that the other party has not yet received.

Fast recovery:

When the sender receives three duplicate acknowledgments in a row, it performs multiplicative decrease and halves the slow start threshold ssthresh. It does not, however, run the slow start algorithm afterwards.
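
The "three duplicate acknowledgments" trigger can be sketched as a small counter on the sender side. `DupAckDetector` is a hypothetical name. The sketch only detects the moment at which fast retransmit fires and does not model the window adjustments of fast recovery.

```python
class DupAckDetector:
    THRESHOLD = 3   # number of duplicate ACKs that triggers fast retransmit

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack: int) -> bool:
        """Return True when the segment starting at `ack` should be resent now."""
        if ack == self.last_ack:
            self.dup_count += 1
            return self.dup_count == self.THRESHOLD
        # A new acknowledgment number: remember it and restart the count.
        self.last_ack = ack
        self.dup_count = 0
        return False


detector = DupAckDetector()
acks = [101, 201, 201, 201, 201, 301]
print([detector.on_ack(a) for a in acks])
# [False, False, False, False, True, False]
```

The first ACK for 201 is the original; the next three are the duplicates, so the retransmission fires on the fourth one.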

Upper limit of the send window:

The upper limit of the sender's send window should be the smaller of two variables, the receiver window rwnd and the congestion window cwnd, as given by the following formula:
Upper limit of the send window = Min [rwnd, cwnd]

    • When rwnd < cwnd, the receiver's receiving capacity limits the maximum send window.
    • When cwnd < rwnd, network congestion limits the maximum send window.

Congestion avoidance

Slow start algorithm:

  • When a host starts sending segments, it can set the congestion window cwnd = 1, that is, to the value of one maximum segment size (MSS).
  • Each time an acknowledgment for a new segment is received, the congestion window is increased by 1, that is, by one MSS.
  • With the slow start algorithm, the congestion window cwnd doubles after every transmission round (one round-trip time RTT).

Congestion Avoidance algorithm:

The congestion window cwnd is made to grow slowly: each time one round-trip time RTT passes, the sender's cwnd is increased by 1, so that cwnd grows slowly and linearly.

Slow Start threshold ssthresh usage:

  • When cwnd < ssthresh, use the slow start algorithm.
  • When cwnd > ssthresh, stop using slow start and use congestion avoidance.
  • When cwnd = ssthresh, either may be used.

When the network is congested (an acknowledgment is not received in time):

  • Set ssthresh to half of the sender's window value at the time congestion occurred (but not less than 2).
  • Then reset cwnd to 1 and run the slow start algorithm.
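
The interplay of slow start, congestion avoidance and the timeout reaction can be traced round by round. This is a simplified sketch: `simulate_cwnd` is a hypothetical helper, cwnd is counted in MSS units, the receiver window is assumed large enough that the send window equals cwnd, and slow start is capped at ssthresh so that the window does not jump past the threshold.

```python
def simulate_cwnd(rounds: int, ssthresh: int, timeout_rounds: set[int]) -> list[int]:
    """Return cwnd (in MSS units) at the start of each transmission round."""
    cwnd = 1
    history = []
    for rnd in range(1, rounds + 1):
        history.append(cwnd)
        if rnd in timeout_rounds:
            # Congestion: halve the threshold (not below 2) and restart slow start.
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            # Slow start: double per round, capped at ssthresh (an assumption).
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: add one MSS per round.
            cwnd += 1
    return history


print(simulate_cwnd(rounds=12, ssthresh=16, timeout_rounds={9}))
# [1, 2, 4, 8, 16, 17, 18, 19, 20, 1, 2, 4]
```

In this run the timeout in round 9 happens with cwnd = 20, so the new ssthresh is 10 and the window restarts from 1.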

 

Finite State Machine of TCP

Note:

  • In the diagram of the TCP finite state machine, each box is a state that TCP can be in.
  • The upper-case English string in each box is the TCP connection state name used by the TCP standard. The arrows between states indicate possible state transitions.
  • The text next to an arrow gives the cause of the transition or the action taken when the transition occurs.
  • The diagram uses three kinds of arrows.
    • Thick solid arrows indicate the normal transitions of the client process
    • Thick dashed arrows indicate the normal transitions of the server process
    • Thin arrows indicate abnormal transitions
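
Since the diagram itself is not reproduced here, the normal client-side path can be written down as a transition table. This is an illustrative sketch: the state names are those of the TCP standard (written with underscores), while the event labels, `CLIENT_TRANSITIONS` and `run_client` are made up for this example and cover only the client's normal open and close.

```python
# (current state, event) -> next state, client side, normal path only.
CLIENT_TRANSITIONS = {
    ("CLOSED", "active open / send SYN"): "SYN_SENT",
    ("SYN_SENT", "recv SYN+ACK / send ACK"): "ESTABLISHED",
    ("ESTABLISHED", "close / send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN / send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
}


def run_client(state: str, events: list[str]) -> str:
    """Follow the events from `state` and return the final state."""
    for event in events:
        key = (state, event)
        if key not in CLIENT_TRANSITIONS:
            raise ValueError(f"no transition from {state} on {event!r}")
        state = CLIENT_TRANSITIONS[key]
    return state


events = [event for (_, event) in CLIENT_TRANSITIONS]
print(run_client("CLOSED", events))       # CLOSED
print(run_client("CLOSED", events[:2]))   # ESTABLISHED
```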
