A look into the curl error CURLE_COULDNT_CONNECT

Source: Internet
Author: User
Tags: socket, connect

Excerpt from: Storage System Research: Socket connect error (cannot assign requested address)

This problem came up recently while stress-testing an HTTP service with a client written with libcurl. The visible symptom was that the client failed to send HTTP requests; the root cause turned out to be that the client had too many sockets in the TIME_WAIT state, which exhausted the available local ports. The whole analysis is described below:

(1) First look at the source code that produces the error:
    /* get it! */
    res = curl_easy_perform(curl_handle);
    long http_code = 0;
    curl_easy_getinfo(curl_handle, CURLINFO_RESPONSE_CODE, &http_code);
    /* cleanup curl stuff */
    curl_easy_cleanup(curl_handle);
    /* The expected status code is missing from the source text; 200 is assumed here. */
    if (res != CURLE_OK || http_code != 200) {
        cout << uri << ", res = " << res << ", http_code = " << http_code << endl;
    }
    return (res == CURLE_OK && http_code == 200);

The error log is as follows:

    http://10.237.92.30:8746/thumbnail/jpeg/l820/appstore/b262b95f-95b8-4e0e-b4e0-edc3b76e3c81, res = 7, http_code = 0
    http://10.237.92.30:8746/thumbnail/jpeg/l820/appstore/a4c37951-d8b5-40ff-af27-4efcd1a58e71, res = 7, http_code = 0
    http://10.237.92.30:8746/thumbnail/jpeg/l820/appstore/abab08ff-75e1-40da-a113-053789e93686, res = 7, http_code = 0

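Looking up the libcurl error codes (in curl.h) shows that error code 7 is CURLE_COULDNT_CONNECT:

    CURLE_OK = 0,
    CURLE_UNSUPPORTED_PROTOCOL,    /* 1 */
    CURLE_FAILED_INIT,             /* 2 */
    CURLE_URL_MALFORMAT,           /* 3 */
    CURLE_NOT_BUILT_IN,            /* 4 - [was obsoleted in August 2007 for
                                      7.17.0, reused in April 2011 for 7.21.5] */
    CURLE_COULDNT_RESOLVE_PROXY,   /* 5 */
    CURLE_COULDNT_RESOLVE_HOST,    /* 6 */
    CURLE_COULDNT_CONNECT,         /* 7 */
    CURLE_FTP_WEIRD_SERVER_REPLY,  /* 8 */
    CURLE_REMOTE_ACCESS_DENIED,    /* 9 a service was denied by the server ... */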

(2) Why curl_easy_perform returns an error

The most direct approach is to trace the client with GDB. Doing so showed that the error is returned by connect, in the singleipconnect function in curl-7.28.1/lib/connect.c. A log statement was therefore added after connect to print errno; the code is as follows:

    if(!isconnected && (conn->socktype == SOCK_STREAM)) {
        rc = connect(sockfd, &addr.sa_addr, addr.addrlen);
        if(-1 == rc) {
            error = SOCKERRNO;
            /* added for debugging: log the errno returned by connect */
            printf("connect failed with errno = %d\n", error);
        }
        conn->connecttime = Curl_tvnow();
        if(conn->num_addr > 1)
            Curl_expire(data, conn->timeoutms_per_addr);
    }

Run the test program again to get the following output:

    connect failed with errno = 99
    http://127.0.0.1:8902/thumbnail/jpeg/l820/appstore/f8913ca1-ae5f-4fcc-abc5-cbe9ada1a67d, ret_code: 0, res: 7
    connect failed with errno = 99
    http://127.0.0.1:8902/thumbnail/jpeg/l820/appstore/3726a1e2-057e-402d-b347-61c5a5136cd9, ret_code: 0, res: 7
    connect failed with errno = 99
    http://127.0.0.1:8902/thumbnail/jpeg/l820/appstore/c19bad67-6b7d-4dc6-a17a-f74ea525c32a, ret_code: 0, res: 7
    connect failed with errno = 99
    http://127.0.0.1:8902/thumbnail/jpeg/l820/appstore/5d778568-d873-46a7-9651-ad8ac3810bf4, ret_code: 0, res: 7

The output shows errno = 99. In the kernel's include/asm-generic/errno.h, errno 99 is defined as EADDRNOTAVAIL, "Cannot assign requested address":

    #define EAFNOSUPPORT    97  /* Address family not supported by protocol */
    #define EADDRINUSE      98  /* Address already in use */
    #define EADDRNOTAVAIL   99  /* Cannot assign requested address */
    #define ENETDOWN        100 /* Network is down */
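
The same lookup can be done from user space without opening the kernel headers. The following is a minimal sketch, assuming a Linux system with the usual <errno.h>; errno_name and describe_errno are hypothetical helper names introduced here for illustration and are not part of libc or libcurl.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Return the symbolic name for a few socket-related errno values.
 * errno_name is a hypothetical helper, not a libc function. */
static const char *errno_name(int err)
{
    switch (err) {
    case EAFNOSUPPORT:  return "EAFNOSUPPORT";
    case EADDRINUSE:    return "EADDRINUSE";
    case EADDRNOTAVAIL: return "EADDRNOTAVAIL";
    case ENETDOWN:      return "ENETDOWN";
    default:            return "(other)";
    }
}

/* Format "errno = N (NAME): message" into buf using strerror().
 * Returns the value returned by snprintf(). */
static int describe_errno(int err, char *buf, size_t len)
{
    return snprintf(buf, len, "errno = %d (%s): %s",
                    err, errno_name(err), strerror(err));
}
```

Calling describe_errno(errno, buf, sizeof buf) right after a failed connect gives a log line that names the error instead of only printing the number.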

(3) The cause of errno = 99

To find out why the connect system call failed, we have to look at its implementation.

a) The connect system call

The connect system call is implemented in net/socket.c. The call stack for sys_connect is as follows:

    sys_connect --->
        sock->ops->connect            // inet_stream_connect
            sk->sk_prot->connect      // tcp_v4_connect

The main job of tcp_v4_connect is to perform the first step of the TCP three-way handshake: sending the server a connection request segment with SYN = 1 and a 32-bit sequence number. To send the SYN segment, TCP/IP requires a source IP address and a source port. Source address selection depends on routing and requires a routing table lookup, which is implemented in ip_route_connect. Source port selection is implemented in __inet_hash_connect, which returns -EADDRNOTAVAIL if it finds no available port. It is therefore very likely that the error returned by this function is what causes connect to fail.

b) __inet_hash_connect

The main purpose of this function is to select an available port. The main steps are as follows:

I. Call inet_get_local_port_range(&low, &high) to get the range of available ports:

    1. Call read_seqbegin(&sysctl_local_ports.lock) to begin the seqlock read section;
    2. Read the low and high bounds of the available ports:

*low = sysctl_local_ports.range[0];

*high = sysctl_local_ports.range[1];

II. For each port, perform the following steps:

    1. Look the port up in inet_hashinfo *hinfo. inet_hashinfo records the ports in use; every port in use has an entry in this hash table;
    2. Hash the port to get the list head (hash collisions are resolved by chaining);
    3. Traverse each entry in the linked list:

a) Check whether the entry is for the port we want to use. If it is, go to step b); if not, move on to the next entry;

b) The port is in use, so call check_established (__inet_check_established) to determine whether it can be reused (a port held by a TIME_WAIT socket can be reused when net.ipv4.tcp_tw_reuse = 1);

    4. If the port is not found in the linked list, it is not in use; call inet_bind_bucket_create to insert an entry into the hash table;

III. Return EADDRNOTAVAIL if no available port is found.
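
The search in steps I to III can be sketched in user space as follows. This is a simplified model and not the kernel code: struct port_table and pick_local_port are hypothetical names, a flat boolean array stands in for the kernel's hash table, and the TIME_WAIT reuse check is left out.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_PORT 65536

/* Hypothetical stand-in for the kernel's bind hash table:
 * in_use[p] is true when port p is occupied and cannot be reused. */
struct port_table {
    bool in_use[MAX_PORT];
};

/* Scan the inclusive range [low, high], starting at a caller-chosen
 * non-negative offset. Returns the chosen port and marks it as used,
 * or returns -EADDRNOTAVAIL when every port in the range is taken. */
static int pick_local_port(struct port_table *t, int low, int high, int offset)
{
    int remaining = (high - low) + 1;

    for (int i = 0; i < remaining; i++) {
        int port = low + (i + offset) % remaining;

        if (!t->in_use[port]) {
            t->in_use[port] = true;   /* analogous to creating a bucket entry */
            return port;
        }
    }
    return -EADDRNOTAVAIL;
}
```

Once every port in the range is marked as used, each further call returns -EADDRNOTAVAIL, which is the condition the client ran into.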

As the implementation shows, connect fails because every port in the available range is taken, so no port can be allocated. Running netstat confirms that a large number of sockets are in the TIME_WAIT state, and these sockets occupy the available ports.

    [root@test miuistorage-dev]# ... print key,"\t",state[key]}'
    TIME_WAIT 26837
    ESTABLISHED ...
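
The same count can be taken directly from /proc/net/tcp, where the fourth column holds the socket state in hexadecimal and 06 means TIME_WAIT. The following is a minimal sketch assuming the Linux /proc/net/tcp line layout; parse_tcp_state and count_time_wait are hypothetical helper names, and IPv6 sockets (listed in /proc/net/tcp6) are not counted.

```c
#include <stdio.h>

/* Numeric value of the TIME_WAIT state as shown in /proc/net/tcp. */
#define TCP_STATE_TIME_WAIT 0x06

/* Parse one data line of /proc/net/tcp and return its state field,
 * or -1 if the line does not match (for example the header line). */
static int parse_tcp_state(const char *line)
{
    unsigned int state;

    /* sl: local_addr:port remote_addr:port st ... */
    if (sscanf(line, " %*d: %*x:%*x %*x:%*x %x", &state) == 1)
        return (int)state;
    return -1;
}

/* Count the sockets in TIME_WAIT listed in a /proc/net/tcp-style file.
 * Returns -1 if the file cannot be opened. */
static long count_time_wait(const char *path)
{
    FILE *fp = fopen(path, "r");
    char line[512];
    long count = 0;

    if (!fp)
        return -1;
    while (fgets(line, sizeof line, fp))
        if (parse_tcp_state(line) == TCP_STATE_TIME_WAIT)
            count++;
    fclose(fp);
    return count;
}
```

Calling count_time_wait("/proc/net/tcp") periodically during the stress test shows how quickly TIME_WAIT sockets pile up.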

(4) Workarounds:

To resolve port exhaustion caused by sockets in TIME_WAIT, the following workarounds are available:

a) Modify the available port range

To view the current port range:

    root@guojun8-desktop:/linux-2.6. ... 61000

To modify the port range:

    root@guojun8-desktop:linux-2.6. # sysctl net.ipv4.ip_local_port_range="32768 62000"

This may not solve the underlying problem: with short-lived connections, even an enlarged port range will eventually fill up.
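
A rough calculation shows why a larger range only postpones the problem. The sketch below assumes that each closed connection keeps its local port in TIME_WAIT for 60 seconds (TCP_TIMEWAIT_LEN in the Linux kernel), that the client is the side that closes the connection, and that all connections go to the same server address and port; max_conn_rate is a hypothetical helper name.

```c
/* Upper bound on new outgoing connections per second to a single
 * destination before the local port range is exhausted, assuming each
 * closed connection holds its port for tw_seconds in TIME_WAIT. */
static int max_conn_rate(int low, int high, int tw_seconds)
{
    int ports = (high - low) + 1;   /* the port range is inclusive */

    return ports / tw_seconds;
}
```

With a range of 32768 to 61000 there are 28233 ports, so max_conn_rate(32768, 61000, 60) gives 470 connections per second. Raising the upper bound to 62000 only lifts this to 487, which a stress test using short-lived connections can easily exceed.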

b) Set net.ipv4.tcp_tw_recycle = 1

This parameter controls whether TIME_WAIT sockets are recycled quickly.

    root@guojun8-desktop:linux-2.6. ... 1

c) Set net.ipv4.tcp_tw_reuse = 1

This parameter controls whether a port held by a socket in the TIME_WAIT state can be reused.

    root@guojun8-desktop:linux-2.6. # ... 1

(5) More in-depth discussion: what sysctl does

You can use strace to trace the system calls that sysctl makes:

    root@guojun8-desktop:linux-2.6.34# strace sysctl net.ipv4.tcp_tw_recycle=1
    execve("/sbin/sysctl", ["sysctl", "net.ipv4.tcp_tw_recycle=1"], [/* vars */]) = 0
    brk(0) = 0x952f000
    .....
    open("/proc/sys/net/ipv4/tcp_tw_recycle", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
    fstat64(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb788e000
    write(3, "1\n", 2) = 2
    close(3) = 0
    munmap(0xb788e000, 4096) = 0
    fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 8), ...}) = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb788e000
    write(1, "net.ipv4.tcp_tw_recycle = 1\n", 28net.ipv4.tcp_tw_recycle = 1
    ) = 28
    exit_group(0) = ?

You can see that the program opens /proc/sys/net/ipv4/tcp_tw_recycle and writes 1 to it. But how does this setting take effect? The i_fop of files under /proc/sys is handled in the kernel and is set in proc_sys_make_inode: inode->i_fop = &proc_sys_file_operations. proc_sys_file_operations is defined as follows:

    struct file_operations proc_sys_file_operations = {
        .read  = proc_sys_read,
        .write = proc_sys_write,
    };

proc_sys_write updates the value behind the corresponding file, that is, the kernel variable in memory. Different files have different proc_handler functions; for tcp_tw_recycle the handler is proc_dointvec, which modifies the following variable:

    tcp_death_row.sysctl_tw_recycle

In the kernel, this variable indicates whether sockets in the TIME_WAIT state can be recycled quickly.
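
The open, write and close sequence seen in the strace output can be reproduced with a few lines of C. This is a minimal sketch and not the source of the sysctl tool; write_sysctl is a hypothetical helper name, and writing to a real file under /proc/sys requires root privileges.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write value followed by a newline to a sysctl file in a single
 * write() call, using the same open flags as in the strace output.
 * Returns 0 on success and -1 on failure. */
static int write_sysctl(const char *path, const char *value)
{
    char buf[64];
    int len = snprintf(buf, sizeof buf, "%s\n", value);
    int fd;
    ssize_t written;

    if (len < 0 || (size_t)len >= sizeof buf)
        return -1;                      /* value does not fit in buf */

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -1;

    written = write(fd, buf, (size_t)len);
    if (close(fd) != 0 || written != (ssize_t)len)
        return -1;
    return 0;
}
```

For example, write_sysctl("/proc/sys/net/ipv4/tcp_tw_recycle", "1") would issue the same write of "1\n" that sysctl performs, and the kernel would route it through proc_sys_write as described above.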
