Reposted from http://blog.sina.com.cn/s/blog_6cf9802d0100xtwv.html

I. Problems with existing TCP traffic under congestion

According to RFC 793, TCP is a reliable, end-to-end stream protocol. Its main characteristics are:
1. During the three-way handshake that establishes a connection, the two ends negotiate their sending and receiving capabilities, that is, the sliding window.
2. Once the connection is established, TCP sends segments according to the negotiated window size.
3. To provide a reliable connection, the TCP receiver uses ACKs to tell the sender which data has been received successfully.
4. The TCP sender uses the receiver's ACKs to decide whether data arrived correctly, and retransmits segments that have not been acknowledged.

These characteristics show that TCP is an end-to-end protocol with no direct view of what happens to a packet along the path. When an intermediate router drops a packet, TCP notices only through a missing ACK or through duplicate ACKs, and then retransmits. TCP treats every forwarding device on the path as a black box: if the expected ACK does not arrive within the specified time, it assumes the packet was dropped on an intermediate link and retransmits it to keep the data reliable.

From a router's point of view, the TCP packets it forwards do not necessarily come from the same host, and the TCP connections between hosts cannot see how busy the router's forwarding queue is. When the queue of an intermediate router overflows and packets are dropped, none of the affected TCP connections notices immediately. Each one starts retransmitting only after its retransmission timer expires because no ACK was received.
This timer is relatively long, usually ranging from a few seconds to tens of seconds. The packet drops cause many TCP connections to reduce their sending rate at once, and a sender that has already sent a full window must wait for its retransmission timer to expire, so the whole transfer stalls from time to time. Once all the connections have slowed down, congestion in the router's forwarding queue eases and the router stops dropping packets. All the connections then raise their sending rate at the same time until the router begins dropping packets again, and the retransmission cycle repeats.

This behaviour causes the following problems:
1. Lost packets have to be retransmitted, and because the retransmission timer is long, delay-sensitive applications suffer and the user experience degrades.
2. After packet loss, every affected connection enters congestion avoidance and lowers its sending rate (this behaviour is specified by the TCP congestion control standard, RFC 5681, rather than by RFC 793 itself). Congestion eases, but network utilization is then below the optimum.
3. Once congestion eases, each connection keeps enlarging its sending window in search of the best throughput until it detects loss again, and the cycle above repeats.

II. Existing TCP congestion control

1. Slow start. To probe the actual capacity of the network without sending too much data at the start, TCP uses the slow start algorithm. The sender first sends a single MSS-sized segment. As ACKs come back, it enlarges the amount it may send. Growth is exponential until a threshold is reached, and then switches to linear growth.
2. Fast retransmit. When TCP receives three duplicate ACKs, it concludes that the first segment in the retransmission queue was dropped by the network. Because three duplicate ACKs arrived, it also knows that three later segments did reach the receiver. It therefore retransmits the first segment in the retransmission queue immediately, without waiting for the retransmission timer to expire.
3. Fast recovery. When TCP receives three duplicate ACKs, it halves the congestion window, and for each further duplicate ACK it enlarges the window linearly so that sending performance is maintained. When a new, non-duplicate ACK arrives, the connection leaves fast recovery and continues in congestion avoidance (linear growth) rather than falling back to slow start.

III. Router congestion control queues

Router forwarding queues usually implement Random Early Detection (RED). The router makes its drop decision from the average length of the forwarding queue and randomly drops some TCP packets, instead of waiting for the queue to overflow and then dropping everything. This is a good way to keep all TCP connections from timing out at the same moment. Because drops are triggered by the average queue length rather than by a full queue, only some connections back off and slow down first, which lets the others keep running normally. And because the drops are random, the treatment is relatively fair to all TCP connections.

IV. The design of ECN

RFC 3168 defines the design goal of ECN: through cooperation between the TCP sender, the TCP receiver and the intermediate routers, the endpoints learn about congestion on the path and proactively lower the TCP sending rate. This avoids congestion-induced packet loss from an early stage and lets the network be used as fully as possible. ECN addresses the following problems:
1. All TCP senders can detect congestion on the path early and proactively slow down to prevent it.
2. On the forwarding queue of an intermediate router, packets that arrive while the average queue length is exceeded are marked and still forwarded instead of being dropped. Packet loss and the resulting TCP retransmission are avoided.
3. With less packet loss, TCP no longer has to wait seconds or tens of seconds for the retransmission timer before retransmitting, which improves the experience of delay-sensitive applications.
4. Compared with a network without ECN, utilization is better, and the network no longer oscillates between overload and light load.

V. ECN changes in the IP and TCP layers

Modification of the IP header

   0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+
|          DS FIELD, DSCP           | ECN FIELD |
+-----+-----+-----+-----+-----+-----+-----+-----+

DSCP: Differentiated Services Codepoint
ECN: Explicit Congestion Notification

The Differentiated Services and ECN fields in IP.

+-----+-----+
| ECN FIELD |
+-----+-----+
  ECT   CE     [Obsolete] RFC 2481 names for the ECN bits.
   0     0     Not-ECT
   0     1     ECT(1)
   1     0     ECT(0)
   1     1     CE

The ECN field in IP.

The seventh and eighth bits of the TOS octet in the IP header (bits 6 and 7 in the diagram above, previously reserved) are redefined as the ECN field. RFC 3168 describes its four values. 00 means the packet does not support ECN, so routers handle it as they did before ECN, that is, they drop it under overload. 01 and 10 are equivalent for routers: both indicate that the packet supports ECN. If congestion occurs, the router rewrites the two ECN bits to 11 to indicate that the packet experienced congestion, and keeps forwarding it. Refer to RFC 3168 for the specific differences between 01 and 10.
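The field layout and the marking rule described above can be sketched in a few lines of Python. This is only an illustration: the function names `split_tos` and `mark_on_congestion` and the constant names are made up for this example and do not come from RFC 3168 or from any router implementation. The sketch also assumes the router has already decided that the packet met congestion; how that decision is made (for example from the average queue length) is outside its scope.

```python
from typing import Tuple

# Illustrative sketch only; all names here are assumptions for this example.
NOT_ECT = 0b00  # packet does not support ECN
ECT_1 = 0b01    # ECN-capable, codepoint ECT(1)
ECT_0 = 0b10    # ECN-capable, codepoint ECT(0)
CE = 0b11       # congestion experienced


def split_tos(tos: int) -> Tuple[int, int]:
    """Split the 8-bit TOS/DS octet into (DSCP, ECN)."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS octet must fit in 8 bits")
    # The upper six bits are the DSCP, the lower two bits are the ECN field.
    return tos >> 2, tos & 0b11


def mark_on_congestion(tos: int) -> Tuple[str, int]:
    """Return (action, new_tos) for a packet that met congestion."""
    dscp, ecn = split_tos(tos)
    if ecn == NOT_ECT:
        # Not ECN-capable: fall back to the pre-ECN behaviour and drop.
        return "drop", tos
    # ECT(0), ECT(1) or already CE: set CE and keep forwarding.
    return "forward", (dscp << 2) | CE
```

For example, `mark_on_congestion(0b10111010)` (DSCP 46 with ECT(0)) leaves the DSCP bits alone and returns the octet with the ECN bits set to 11, while an octet whose ECN bits are 00 is reported as dropped.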
To support ECN, the forwarding path of a router needs the following new behaviour:
1. When congestion occurs, packets with ECN=00 follow the original non-ECN path, that is, RED dropping.
2. When congestion occurs, packets with ECN=01 or ECN=10 are rewritten to ECN=11 and forwarding continues.
3. When congestion occurs, packets with ECN=11 continue to be forwarded.
4. To stay fair to packets that do not support ECN, once the queue exceeds a certain length the router must also consider dropping ECN-capable packets.

Modification of the TCP header

  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|               |               | C | E | U | A | P | R | S | F |
| Header Length |    Reserved   | W | C | R | C | S | S | Y | I |
|               |               | R | E | G | K | H | T | N | N |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

CWR: Congestion Window Reduced
ECE: ECN-Echo

The new definition of bytes 13 and 14 of the TCP header.

On the host side, the previously reserved bits 8 and 9 are redefined as CWR and ECE. RFC 3168 specifies the following behaviour:
1. When the TCP receiver gets a packet whose IP header carries ECN=11, it sets the ECE bit to 1 in the ACK it returns, and keeps setting ECE in all subsequent ACKs.
2. When the TCP sender receives an ACK with ECE set, it must halve its sending rate and set the CWR bit to 1 in the next packet it sends.
3. Once the receiver gets a packet with CWR set, it stops setting ECE in subsequent ACKs. The process starts over when it again receives a packet with ECN=11 in the IP header.
4. After the sender has reduced its sending window in response to ECE=1, it does not reduce the window again within the same RTT.
5. When the receiver returns a "pure" ACK that carries no data, the IP header must be ECN=00. Pure ACKs are not themselves acknowledged, so TCP has no mechanism to react to a congestion notification on a pure ACK.
6. On a host that supports ECN in IP, the TCP layer must set the ECN field of the IP header to 01 or 10 when sending packets.

VI. ECN compatibility issues

IP header compatibility

Because ECN changes the IP header, the following compatibility issue exists: of the historical definitions of this octet listed below, only those in RFC 791, RFC 2474 and RFC 2780 are compatible with incremental deployment of ECN. Implementations of the other RFCs are not.

RFC 791 [RFC791] defined the ToS (Type of Service) octet in the IP header:

   0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+
|   Precedence    |       TOS       |  0  |  0  |  RFC 791
+-----+-----+-----+-----+-----+-----+-----+-----+

RFC 1122 included bits 6 and 7 in the TOS field, though it does not discuss any specific use for those two bits:

   0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+
|   Precedence    |             TOS             |  RFC 1122
+-----+-----+-----+-----+-----+-----+-----+-----+

The IPv4 TOS octet is redefined in RFC 1349 [RFC1349] as follows:

   0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+
|   Precedence    |          TOS          | MBZ |  RFC 1349
+-----+-----+-----+-----+-----+-----+-----+-----+

RFC 2474 and RFC 2780 redefine the octet again, with bits 6 and 7 listed as currently unused (CU):

   0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+
|               DSCP                |    CU     |  RFCs 2474, 2780
+-----+-----+-----+-----+-----+-----+-----+-----+

Intermediate equipment

The implementations and configuration rules of intermediate security and management devices, such as firewalls and network management equipment, may not be fully compatible with ECN. The device vendor may have to modify code, or the security configuration may have to be changed.

ECN support in tunnels

For IP tunnels [RFC2003], RFC 3168 explicitly defines how the ECN field must be handled at tunnel ingress and egress. See RFC 3168 for details; they are not repeated here.
For IPsec [RFC2401], RFC 3168 explicitly defines the types and fields that must be added to the IPsec Security Association Database (SAD) and to the Security Association Attributes (SAA) to support ECN negotiation over IPsec tunnels. See RFC 3168 for details; they are not repeated here. RFC 3168 does not clearly specify ECN support for MPLS, GRE, L2TP, PPTP and other tunnels, but it does mention that making these tunnels support ECN is not difficult.

VII. Incremental deployment of ECN in existing networks

1. Routers built to the 1999 ECN draft recognize only ECN=10 as ECN-capable and do not recognize ECN=01. Such routers may treat ECN=01 packets like ECN=00 packets and eventually drop them through RED. This does not affect the normal operation of the network.
2. The implementations and configuration rules of firewalls, network management equipment and other intermediate security and management devices may not be fully compatible with ECN. The device vendor may have to modify code, or the security configuration may have to be changed.
3. When only one end of a TCP connection supports ECN, the ECN-capable end first tries to negotiate ECN. If that connection attempt fails, it must fall back to negotiating a non-ECN connection so that TCP stays backward compatible.
4. When an ECN-capable TCP has negotiated a non-ECN connection and later receives ECN packets, it should behave as an ECN-capable endpoint, for compatibility with early ECN implementations.
5. IP tunnels and IPsec tunnels provide two modes, ECN supported and ECN not supported. In the non-ECN mode, ECN packets are forwarded and dropped according to the original non-ECN behaviour.
6. The ingress and egress of a tunnel must either both support ECN or both not support it; asymmetric handling is not allowed.

VIII. ECN security issues

1. When the TCP sender's retransmission timer expires and a segment is retransmitted, the sending rate has already been reduced, so the retransmitted packet carries ECN=00 in its IP header and intermediate routers do not mark it. This avoids DoS attacks.
2. When the TCP receiver gets a packet with ECN=11 in the IP header but an incorrect TCP sequence number, it should not set the ECE bit in the ACK it returns, to avoid DoS attacks.
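The receiver and sender rules from section V, together with the two safeguards above, can be put together in a small model. This is a minimal sketch under stated assumptions, not a TCP implementation: the classes `Receiver` and `Sender`, their fields, the one-unit sequence numbers and the window counted in whole segments are all invented for illustration. It also assumes that a CE mark on an out-of-window packet is simply ignored, and that a pending CWR is kept for the next new data packet rather than sent on a retransmission.

```python
from typing import Tuple

# Illustrative model only; class and field names are assumptions.
NOT_ECT, ECT_0, CE = 0b00, 0b10, 0b11


class Receiver:
    def __init__(self, next_seq: int, window: int) -> None:
        self.next_seq = next_seq  # lowest sequence number still expected
        self.window = window      # size of the receive window
        self.echo = False         # True while ACKs must carry ECE

    def on_packet(self, seq: int, ip_ecn: int, cwr: bool) -> bool:
        """Process one data packet and return the ECE bit for its ACK."""
        in_window = self.next_seq <= seq < self.next_seq + self.window
        if not in_window:
            # Safeguard: a CE mark on a packet with a bad sequence number
            # is ignored, so it cannot switch the echo on.
            return self.echo
        if cwr:
            self.echo = False     # the sender has reacted; stop echoing
        if ip_ecn == CE:
            self.echo = True      # congestion seen; echo it in every ACK
        if seq == self.next_seq:
            self.next_seq += 1
        return self.echo


class Sender:
    def __init__(self, cwnd: int) -> None:
        self.cwnd = cwnd               # congestion window, in segments
        self.send_cwr = False          # CWR is owed on the next new packet
        self.reduced_this_rtt = False  # at most one reduction per RTT

    def on_ack(self, ece: bool) -> None:
        if ece and not self.reduced_this_rtt:
            self.cwnd = max(1, self.cwnd // 2)
            self.send_cwr = True
            self.reduced_this_rtt = True

    def on_rtt_elapsed(self) -> None:
        self.reduced_this_rtt = False

    def next_packet(self, retransmission: bool = False) -> Tuple[int, bool]:
        """Return (ip_ecn, cwr) for the next data packet."""
        if retransmission:
            # Safeguard: retransmissions are sent as ECN=00.
            return NOT_ECT, False
        cwr, self.send_cwr = self.send_cwr, False
        return ECT_0, cwr
```

In this model a single CE mark halves the window once: further ACKs with ECE that arrive in the same RTT leave `cwnd` unchanged, and the receiver stops echoing as soon as it sees the packet that carries CWR.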
[Repost] TCP/IP ECN analysis