TCP connection recovery after network disconnection

Source: Internet
Author: User

We ran into a problem in a project. Two machines communicate in both directions over a single TCP socket connection that carries heavy traffic. We simulated a network outage by setting the packet loss rate on the router to 100%. During the outage the socket obviously could not deliver packets, and a large number of retransmissions occurred. We then removed the setting on the router to restore the network. Afterwards, traffic from the TCP client to the server was normal, but traffic from the server to the client failed: whatever we tried, every attempt to send returned 0 with errno set to EAGAIN.
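The symptom above is what a non-blocking socket reports when the kernel will not accept more outgoing data. As a minimal sketch, assuming a non-blocking socket: the raw send() call signals this condition with -1 and errno EAGAIN, and the wrapper below maps it to a return value of 0. The helper names set_nonblocking and try_send are hypothetical and are not taken from the project described here.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Hypothetical wrapper around send():
 *   > 0  number of bytes the kernel accepted
 *     0  nothing accepted right now (errno is EAGAIN or EWOULDBLOCK)
 *    -1  any other error (errno set by send)
 */
ssize_t try_send(int fd, const void *buf, size_t len)
{
    for (;;) {
        ssize_t n = send(fd, buf, len, 0);
        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return 0;               /* send buffer is full */
        return -1;
    }
}
```

A caller that sees 0 would normally wait for the socket to become writable (for example with poll) before trying again. In the failure described here, that moment never arrived for the server side.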

I used tcpdump to inspect the packets at this point (tc2 is the server, tc1 is the client):

    12:08:21.020291 IP tc1.corp.com.42171 > tc2.corp.com.3003: S 4009389430:4009389430(0) win 5840
    12:08:21.020571 IP tc2.corp.com.3003 > tc1.corp.com.42171: R 0:0(0) ack 4009389431 win 0
    12:08:38.934329 IP tc2.corp.com.3903 > tc1.corp.com.3904: P 2398055392:2398056153(761) ack 2538876742 win 724
    12:08:38.934519 IP tc1.corp.com.3904 > tc2.corp.com.3903: . ack 2165 win 13756
    12:08:39.958457 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1:763(762) ack 2165 win 13756
    12:08:39.958485 IP tc2.corp.com.3903 > tc1.corp.com.3904: . ack 763 win 1448
    12:08:39.958653 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 763:881(118) ack 2165 win 13756
    12:08:39.958660 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 881:997(116) ack 2165 win 13756
    12:08:39.958719 IP tc2.corp.com.3903 > tc1.corp.com.3904: . ack 997 win 1448
    12:08:39.958890 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 997:1114(117) ack 2165 win 13756
    12:08:39.958898 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1114:1232(118) ack 2165 win 13756
    12:08:39.958903 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1232:1349(117) ack 2165 win 13756
    12:08:39.958971 IP tc2.corp.com.3903 > tc1.corp.com.3904: . ack 1349 win 1448
    12:08:39.959141 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1349:1466(117) ack 2165 win 13756
    12:08:39.959149 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1466:1583(117) ack 2165 win 13756
    12:08:39.959154 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1583:1700(117) ack 2165 win 13756
    12:08:39.959222 IP tc2.corp.com.3903 > tc1.corp.com.3904: . ack 1700 win 1448

tc2 sends no data of its own; it only acknowledges the data coming from tc1. After half an hour of waiting, nothing had changed. Why is tc2 not sending?
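To read the trace: in this older tcpdump output format, a segment that carries data is printed as first:last(len), where len is the payload size in bytes, while a pure ACK has no such field. Apart from the single 761-byte segment at 12:08:38, every line from tc2 is a pure ACK. The following is a minimal sketch of a helper that extracts the payload length from one trace line; the function name tcpdump_payload_len is hypothetical, and the parsing assumes the line format shown in this article.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return the payload length printed in parentheses on a tcpdump line,
 * or 0 if the line has no "(len)" field (a pure ACK).
 */
long tcpdump_payload_len(const char *line)
{
    const char *open = strchr(line, '(');
    char *end;
    long n;

    if (open == NULL)
        return 0;                       /* no payload field on this line */
    n = strtol(open + 1, &end, 10);
    if (end == open + 1 || *end != ')')
        return 0;                       /* not a well-formed "(number)" */
    return n;
}
```

Summing this value per direction over a capture shows quickly whether one side has stopped sending data.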

The cause turned out to be that we had set TCP_NODELAY on the socket. After we removed this setting and restarted the program, TCP worked normally in both directions once the network had been disconnected and restored. This can again be seen with tcpdump:
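For reference, TCP_NODELAY is a per-socket option that disables Nagle's algorithm. It is set with setsockopt at the IPPROTO_TCP level. The sketch below shows one way to switch it on or off and to read it back; the helper names set_tcp_nodelay and get_tcp_nodelay are hypothetical, not names from the project.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable (on != 0) or disable (on == 0) TCP_NODELAY. Returns 0 on success. */
int set_tcp_nodelay(int fd, int on)
{
    int val = on ? 1 : 0;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, sizeof(val));
}

/* Returns 1 if TCP_NODELAY is set, 0 if it is not, -1 on error. */
int get_tcp_nodelay(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

Removing the setting, as described above, simply means never calling set_tcp_nodelay(fd, 1): a new TCP socket starts with the option off.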

    16:05:38.782427 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: P 0:887(887) ack 1 win 26064
    16:05:38.782619 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 3783 win 25352
    16:05:38.782634 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 3783:5231(1448) ack 1 win 26064
    16:05:38.782637 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 5231:6679(1448) ack 1 win 26064
    16:05:38.782890 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 5231 win 25352
    16:05:38.782896 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 6679:8127(1448) ack 1 win 26064
    16:05:38.782898 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 8127:9575(1448) ack 1 win 26064
    16:05:38.782901 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 6679 win 25352
    16:05:38.782904 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 9575:11023(1448) ack 1 win 26064
    16:05:38.783183 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 8127 win 25352
    16:05:38.783188 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 11023:12471(1448) ack 1 win 26064
    16:05:38.783191 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 9575 win 25352
    16:05:38.783193 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 12471:13919(1448) ack 1 win 26064
    16:05:38.783196 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 11023 win 25352
    16:05:38.783199 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 13919:15367(1448) ack 1 win 26064
    16:05:38.783201 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 15367:16815(1448) ack 1 win 26064
    16:05:38.783502 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 12471 win 25352
    16:05:38.783506 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 16815:18263(1448) ack 1 win 26064
    16:05:38.783509 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 13919 win 25352
    16:05:38.783512 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 18263:19711(1448) ack 1 win 26064
    16:05:38.783514 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 15367 win 25352
    16:05:38.783517 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 19711:21159(1448) ack 1 win 26064
    16:05:38.783519 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 16815 win 25352

This time tc2 sent its own data stream and tc1 acknowledged it. After a while tc1 started sending data as well, and in the end both directions worked normally.

Why can a socket with TCP_NODELAY set not recover after the network is restored?

Let's look at the implementation of the send system call (2.6.9 kernel), which leads to the tcp_sendmsg function:

[net/ipv4/tcp.c --> tcp_sendmsg]

    while (--iovlen >= 0) {
        int seglen = iov->iov_len;
        unsigned char __user *from = iov->iov_base;

        iov++;

        while (seglen > 0) {
            int copy;

            /* Last segment currently on the write queue. */
            skb = sk->sk_write_queue.prev;

            /* No unsent data is queued, or the tail segment is full. */
            if (!sk->sk_send_head ||
                (copy = mss_now - skb->len) <= 0) {

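The condition at the end of the excerpt above decides whether the bytes passed in by the caller can be appended to the last queued segment. As a rough user-space model of just that test (not kernel code; the struct and function names are hypothetical, and plain integers stand in for sk->sk_send_head, mss_now and skb->len):

```c
/* Result of the check for one pass of the inner loop. */
struct seg_decision {
    int new_segment;    /* 1: the branch guarded by the condition is taken */
    int room;           /* bytes that still fit in the tail segment, else 0 */
};

/*
 * has_unsent models "sk->sk_send_head != NULL",
 * tail_len   models "skb->len" of the last queued segment.
 */
struct seg_decision choose_segment(int has_unsent, int mss_now, int tail_len)
{
    struct seg_decision d = { 1, 0 };

    if (has_unsent) {
        int copy = mss_now - tail_len;      /* same arithmetic as the kernel test */
        if (copy > 0) {
            d.new_segment = 0;              /* tail has room: append to it */
            d.room = copy;
        }
    }
    return d;
}
```

With an MSS of 1448, for example, a queued unsent tail segment of 761 bytes leaves 687 bytes of room, while a full 1448-byte tail or an empty send queue takes the branch that begins a new segment.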
