TCP connection survival on the server backend

Source: Internet
Author: User


0. Background

The company's server backend is deployed at a customer site and serves the users' app. The network signal at that site is poor, and after the backend has been running for a while, users can no longer connect. Colleagues on site checked the system with netstat and reported a large number of TCP connections.

1. Problem Analysis

First, I deployed the service on the company's internal test server and stress-tested it with LoadRunner; it ran normally. The on-site colleagues then pointed out that the signal in that area is poor. Since the symptom is failed access, the likely cause is that the access process runs out of file descriptors (FDs), so accept fails. The reasoning: if a client cannot send a FIN to the server because of some abnormal condition (for example, its network drops), and the server has no keep-alive check, the server keeps the connection open (how long it survives has not been tested yet).
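The FD-exhaustion hypothesis can be checked on the affected host. The following is a minimal sketch, assuming Linux with /proc mounted; the function names fd_soft_limit and count_open_fds are hypothetical helpers and are not part of the original server. When a process reaches its descriptor limit, accept fails with errno set to EMFILE.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/resource.h>

/* Returns the soft limit on open descriptors for this process,
 * or -1 if getrlimit() fails. */
long fd_soft_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    return (long)rl.rlim_cur;
}

/* Counts the descriptors currently open in this process by listing
 * /proc/self/fd (Linux-specific). The count includes the descriptor
 * that opendir() itself holds while the directory is being read.
 * Returns -1 if /proc/self/fd cannot be opened. */
int count_open_fds(void)
{
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return -1;

    int n = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] != '.')   /* skip "." and ".." */
            ++n;
    }
    closedir(dir);
    return n;
}

/* Prints current usage against the limit; a count close to the limit
 * supports the exhaustion hypothesis. */
void print_fd_usage(void)
{
    printf("open fds: %d, soft limit: %ld\n",
           count_open_fds(), fd_soft_limit());
}
```

Calling print_fd_usage() periodically from the server, or after a failed accept, would show whether the number of open descriptors keeps growing while clients drop off the network.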

2. Experiment Test

For the test I wrote a simple echo server. It accepts a message (format: 2-byte packet length + packet content) and returns the packet content to the client unchanged.
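The message format can be illustrated from the client's side. The sketch below assumes the length header is big-endian (high byte first), which matches how the server decodes it with buf[0]*256 + buf[1]; encode_frame and decode_length are hypothetical helper names, not part of the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Builds one frame in out: a 2-byte big-endian length followed by the
 * payload. Returns the total frame size, or 0 if the payload does not
 * fit in a 16-bit length or in the output buffer. */
size_t encode_frame(const void *payload, size_t len,
                    unsigned char *out, size_t out_cap)
{
    assert(payload != NULL && out != NULL);
    if (len > 0xFFFF || out_cap < len + 2)
        return 0;

    out[0] = (unsigned char)(len >> 8);     /* high byte of the length */
    out[1] = (unsigned char)(len & 0xFF);   /* low byte of the length  */
    memcpy(out + 2, payload, len);
    return len + 2;
}

/* Decodes the 2-byte length header the same way the server does. */
unsigned decode_length(const unsigned char hdr[2])
{
    return ((unsigned)hdr[0] << 8) | hdr[1];
}
```

A test client would call encode_frame on its message and pass the resulting bytes to write on the connected socket; the server then reads the two header bytes, decodes the length, and reads that many payload bytes.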

#include <stdio.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/epoll.h>
#include <unistd.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>

int g_epfd;

/* Create a TCP socket listening on the given port on all interfaces. */
int InitServer( unsigned short port )
{
    int nServerFd = socket( AF_INET, SOCK_STREAM, 0 );
    if ( nServerFd < 0 )
    {
        perror("socket");
        exit(-1);
    }

    struct sockaddr_in addr;
    memset( &addr, 0, sizeof(addr) );

    addr.sin_family = AF_INET;
    addr.sin_port = htons( port );
    addr.sin_addr.s_addr = htonl( INADDR_ANY );

    if ( bind( nServerFd, (struct sockaddr *)&addr, sizeof(addr) ) < 0 )
    {
        perror("bind");
        exit(-1);
    }

    if ( listen( nServerFd, 128 ) < 0 )
    {
        perror("listen");
        exit(-1);
    }

    return nServerFd;
}

/* Register nFd for read and peer-close events. Level-triggered mode is
 * used because the sockets are blocking and each event leads to only one
 * accept() or one message read. */
int AddFd( int epfd, int nFd , int nOneShot)
{
    struct epoll_event event;
    memset( &event, 0, sizeof( event) );

    event.data.fd = nFd;
    event.events = EPOLLIN | EPOLLRDHUP;

    if ( nOneShot ) event.events |= EPOLLONESHOT;

    return epoll_ctl( epfd, EPOLL_CTL_ADD, nFd, &event );
}

/* Re-arm a one-shot descriptor after its event has been handled. */
int ResetOneShot( int epfd, int nFd )
{
    struct epoll_event event;
    memset( &event, 0, sizeof(event) );

    event.data.fd = nFd;
    event.events = EPOLLIN | EPOLLRDHUP | EPOLLONESHOT;

    return epoll_ctl( epfd, EPOLL_CTL_MOD, nFd, &event);
}

/* Read until nLen bytes have arrived. Returns the number of bytes read
 * (less than nLen if the peer closed the connection) or -1 on error. */
static int ReadFull( int nFd, unsigned char * buf, int nLen )
{
    int nTotal = 0;
    while ( nTotal < nLen )
    {
        int nRead = (int)read( nFd, buf + nTotal, nLen - nTotal );
        if ( nRead < 0 ) return -1;
        if ( nRead == 0 ) break;    /* peer closed the connection */
        nTotal += nRead;
    }
    return nTotal;
}

/* Thread entry: read one message from the client and echo it back. */
void * ReadFromClient( void * arg )
{
    int nClientFd = (int)(intptr_t)arg;
    unsigned char buf[1024];
    const int nBufSize = sizeof( buf );
    int nRead;
    int nDataLen;

    printf("ReadFromClient Enter\n");

    /* 2-byte length header, high byte first */
    nRead = ReadFull( nClientFd, buf, 2 );
    if ( nRead != 2 )
    {
        printf( nRead < 0 ? "Read Data Len error\n" : "Close By Peer\n" );
        close( nClientFd );
        return NULL;
    }

    nDataLen = buf[0]*256 + buf[1];
    printf("nDataLen [%d]\n", nDataLen);

    if ( nDataLen > nBufSize )
    {
        printf("Packet too large\n");
        close( nClientFd );
        return NULL;
    }

    /* packet content */
    if ( ReadFull( nClientFd, buf, nDataLen ) != nDataLen )
    {
        printf("Read Data error\n");
        close( nClientFd );
        return NULL;
    }
    printf("nTotal [%d]\n", nDataLen);

    sleep(5);    /* delay before replying */

    int nWrite = (int)write( nClientFd, buf, nDataLen );
    printf("nWrite[%d]\n", nWrite);

    printf("ResetOneShot [%d]\n", ResetOneShot(g_epfd, nClientFd));

    return NULL;
}

int main(int argc, char const *argv[])
{
    (void)argc;
    (void)argv;

    int i;
    int nClientFd;
    int nReadyNums;
    pthread_t tid;
    struct epoll_event events[1024];

    int nServerFd = InitServer( 7777 );

    int epfd = epoll_create( 1024 );
    if ( epfd < 0 )
    {
        perror("epoll_create");
        exit(-1);
    }

    g_epfd = epfd;

    if ( AddFd( epfd, nServerFd, 0 ) < 0 )
    {
        perror("AddFd");
        exit(-1);
    }

    while( 1 )
    {
        nReadyNums = epoll_wait( epfd, events, 1024, -1 );

        if ( nReadyNums < 0 )
        {
            perror("epoll_wait");
            exit(-1);
        }

        for ( i = 0; i < nReadyNums; ++i )
        {
            if ( events[i].data.fd == nServerFd )
            {
                nClientFd = accept( nServerFd, NULL, NULL );
                if ( nClientFd < 0 )
                {
                    perror("accept");
                    continue;
                }

                if ( AddFd( epfd, nClientFd, 1 ) < 0 )
                {
                    perror("AddFd");
                    close( nClientFd );
                }
            }
            else if ( events[i].events & EPOLLIN )
            {
                /* Read data from the client; a thread pool could be used here. */
                if ( pthread_create( &tid, NULL, ReadFromClient,
                                     (void *)(intptr_t)events[i].data.fd ) != 0 )
                {
                    printf("pthread_create error\n");
                    close( events[i].data.fd );
                }
                else
                {
                    /* The thread is never joined, so detach it. */
                    pthread_detach( tid );
                }
            }
            else if ( events[i].events & EPOLLRDHUP )
            {
                /* Closed by peer */
                printf("Close By Peer\n");
                close( events[i].data.fd );
            }
            else
            {
                printf("Something happened\n");
            }
        }
    }

    return 0;
}

 

 

 

Test content:

Note: the client IP is 192.168.10.108; the server IP and port are 192.168.10.110:7777.

 

A. The client sends a message to the server, and then the client's network is disconnected. (For this experiment I changed the program and commented out the write response so that the write cannot affect the result. The next experiments use write.)

After the client is disconnected from the network, use netstat to check whether the connection is still in the ESTABLISHED state on both the client and the server.


A. Experiment conclusion:

The server does not detect that the client has been disconnected; the connection stays in the ESTABLISHED state.

 

B. The client sends a message to the server, the network is then disconnected, the client program is closed, and the test is repeated.

In this test, the stale connection is detected only after the client program establishes a socket connection again.


B. Experiment conclusion:

The broken connection is not detected until the client program establishes a new socket connection.

 

C. The client sends a message to the server, and then the client's network is disconnected. (This experiment keeps the write response in order to observe the result of write.)

The write call succeeded.


C. Experiment conclusion:

A successful write does not show that the peer is still connected. write only copies the data into the kernel's send buffer and returns before the peer has acknowledged anything.

 

3. Solution

Temporary: enable the SO_KEEPALIVE socket option (via setsockopt) so that the kernel probes idle connections and detects clients that have gone away abnormally.

Subsequent improvement: use application-level heartbeat packets to detect dead long-lived connections.
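The heartbeat approach is not specified further in this article, so the following is only a sketch of the server-side bookkeeping it would need, assuming the client sends some packet (data or heartbeat) at a regular interval. The type conn_t and the functions conn_touch and conns_collect_expired are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* One tracked connection; fd == -1 marks an unused slot. */
typedef struct {
    int    fd;
    time_t last_seen;   /* time of the last packet (data or heartbeat) */
} conn_t;

/* Call whenever any packet, including a heartbeat, arrives. */
void conn_touch(conn_t *c, time_t now)
{
    assert(c != NULL);
    c->last_seen = now;
}

/* Scans the table for connections that have been silent for longer than
 * timeout_sec. Their descriptors are written to expired[] (at most cap
 * entries) and their slots are marked unused. The caller is expected to
 * close() each returned descriptor. Returns the number of entries written. */
size_t conns_collect_expired(conn_t *conns, size_t n, time_t now,
                             long timeout_sec, int *expired, size_t cap)
{
    size_t found = 0;
    for (size_t i = 0; i < n && found < cap; ++i) {
        if (conns[i].fd < 0)
            continue;                       /* unused slot */
        if ((long)(now - conns[i].last_seen) > timeout_sec) {
            expired[found++] = conns[i].fd;
            conns[i].fd = -1;
        }
    }
    return found;
}
```

In the echo server above, conn_touch would be called after each successful read, and conns_collect_expired could run on a timer (for example, by giving epoll_wait a timeout instead of -1). A timeout of two or three heartbeat intervals is a common choice so that a single lost packet does not drop a healthy client.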

Note: the SO_KEEPALIVE test will be added tomorrow. At home I only have one laptop, with Ubuntu installed directly and no virtual machine, so I cannot run the test there.
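Until that test is written up, here is a minimal sketch of what enabling keep-alive could look like on Linux. The function name enable_keepalive and the example timing values are assumptions; TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux-specific options, and without them the system-wide defaults apply (on Linux, typically 7200 seconds of idle time before the first probe).

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enables TCP keep-alive on fd and tunes its timing (Linux-specific).
 *   idle_sec:     idle time before the first probe is sent
 *   interval_sec: time between unanswered probes
 *   probe_count:  unanswered probes before the connection is dropped
 * Returns 0 on success, -1 if any setsockopt() call fails. */
int enable_keepalive(int fd, int idle_sec, int interval_sec, int probe_count)
{
    assert(idle_sec > 0 && interval_sec > 0 && probe_count > 0);

    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_sec, sizeof(idle_sec)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_sec, sizeof(interval_sec)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &probe_count, sizeof(probe_count)) < 0)
        return -1;
    return 0;
}
```

In the echo server, this would be called on nClientFd right after accept, for example enable_keepalive(nClientFd, 60, 10, 3). With those values a vanished client should be detected after roughly 60 + 3 * 10 = 90 seconds; the kernel then reports an error on the socket, which the server sees as a failed read or an epoll error event.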

 

4. Supplement

If anything here is wrong, or if you have suggestions, please say so directly; more discussion is welcome.

