Analysis of the cause of TCP connection build-up under Golang high concurrency


Background: Our service needs to issue GET requests at high frequency, so we wrapped Golang's net/http library. Open-source packages such as req and gorequest are themselves wrappers around net/http, so we stayed with the native library (req also has its own pitfalls when used improperly). Our scenario: multiple goroutines take tasks from a channel and issue concurrent GET requests, each with a timeout and a proxy. We knew that net/http has a built-in connection pool that reclaims and reuses connections automatically, yet we found the number of connections soaring to more than 10,000.
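For context, the request pattern is roughly the worker-pool sketch below: a fixed number of goroutines read URLs from a channel and fetch them concurrently. This is only an illustration of the setup; the channel name, the worker count, and the bare http.Get call are assumptions, not our actual wrapper.

package main

import (
    "net/http"
    "sync"
)

// worker drains URLs from the tasks channel and fetches each one.
func worker(tasks <-chan string, wg *sync.WaitGroup) {
    defer wg.Done()
    for u := range tasks {
        resp, err := http.Get(u) // stand-in for our wrapped GET with timeout and proxy
        if err != nil {
            continue
        }
        resp.Body.Close() // always close the body so the connection can be reused
    }
}

func main() {
    tasks := make(chan string, 100)
    var wg sync.WaitGroup
    for i := 0; i < 100; i++ { // 100 concurrent workers
        wg.Add(1)
        go worker(tasks, &wg)
    }
    tasks <- "http://example.com/"
    close(tasks)
    wg.Wait()
}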

Our first version of the code was written in Python and did not suffer from the connection explosion. It wrapped the requests library roughly as follows:

def fetch(self, url, body, method, proxies=None, header=None):
    res = None
    timeout = 4
    self.error = ""
    stream_flag = False
    if not header:
        header = {}
    if not proxies:
        proxies = {}
    try:
        self.set_extra(header)
        res = self.session.request(method, url, data=body, headers=header,
                                   timeout=timeout, proxies=proxies)
        # TODO: move self.error handling into the logger
    except requests.exceptions.Timeout:
        self.error = "fetch failed!!! url:{0} except: connect timeout".format(url)
    except requests.exceptions.TooManyRedirects:
        self.error = "fetch failed!!! url:{0} except: redirect more than 3 times".format(url)
    except requests.exceptions.ConnectionError:
        self.error = "fetch failed!!! url:{0} except: connect error".format(url)
    except socket.timeout:
        self.error = "fetch failed!!! url:{0} except: recv timeout".format(url)
    except:
        self.error = "fetch failed!!! url:{0} except: {1}".format(url, traceback.format_exc())
    if res is not None and self.error == "":
        self.logger.info("url:%s, body:%s, method:%s, header:%s, proxy:%s, request success!",
                         url, str(body)[:], method, header, proxies)
        self.logger.info("url:%s, resp_header:%s, sock_ip:%s, response success!",
                         url, res.headers, self.get_sock_ip(res))
    else:
        self.logger.warning("url:%s, body:%s, method:%s, header:%s, proxy:%s, error:%s, request failed!",
                            url, str(body)[:], method, header, proxies, self.error)
    return res

After switching to Golang, we chose net/http. According to the net/http documentation, the most basic requests such as GET and POST can be issued like this:

resp, err := http.Get("http://example.com/")
resp, err := http.Post("http://example.com/upload", "image/jpeg", &buf)
resp, err := http.PostForm("http://example.com/form", url.Values{"key": {"Value"}, "id": {"123"}})

We needed to add a timeout, set a proxy, and set request headers, and the official recommendation for that is to use an http.Client, as follows:

client := &http.Client{
    CheckRedirect: redirectPolicyFunc,
    Timeout:       time.Duration(10) * time.Second, // set the timeout
}
client.Transport = &http.Transport{Proxy: http.ProxyURL(proxyUrl)} // set the proxy IP
resp, err := client.Get("http://example.com")

req, err := http.NewRequest("GET", "http://example.com", nil)
req.Header.Add("If-None-Match", `W/"wyzzy"`) // set headers
resp, err := client.Do(req)

The official documentation points out that a Client should be created once and reused, and that it is safe for concurrent use by multiple goroutines, so a single shared client can be used to send requests from many goroutines.
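A minimal sketch of what the documentation describes, with one package-level client shared by many goroutines (the URL and the goroutine count are placeholders for illustration):

package main

import (
    "net/http"
    "sync"
    "time"
)

// a single client for the whole program; safe for concurrent use
var client = &http.Client{Timeout: 10 * time.Second}

func main() {
    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            resp, err := client.Get("http://example.com/")
            if err != nil {
                return
            }
            resp.Body.Close() // close the body so the connection can go back to the pool
        }()
    }
    wg.Wait()
}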

Based on the official documentation and our business scenario, we wrote the following code:

var client *http.Client

// initialize the global client
func init() {
    client = &http.Client{
        Timeout: time.Duration(10) * time.Second,
    }
}

type HttpClient struct {
}

// Fetch is called concurrently from multiple goroutines
func (this *HttpClient) Fetch(dstUrl string, method string, proxyHost string, header map[string]string) (*http.Response, error) {
    // build the request
    req, _ := http.NewRequest(method, dstUrl, nil)
    // add headers
    for k, v := range header {
        req.Header.Add(k, v)
    }
    // set the proxy IP
    proxy := "http://" + proxyHost
    proxyUrl, _ := url.Parse(proxy)
    client.Transport = &http.Transport{Proxy: http.ProxyURL(proxyUrl)}
    resp, err := client.Do(req)
    return resp, err
}

We ran a pool of 100 workers calling Fetch(), so the number of established connections should in theory stay around 100. But under load testing I found the established connections climbing toward 10,000. Does net/http's connection pool simply not work? More likely, we were using it incorrectly.

Concurrent requests through Python's requests library had no such problem, so where does this one come from? If you are familiar with how Golang's net/http library works internally, the culprit is obvious: it is the Transport above. Each Transport maintains its own connection pool, and our code creates a brand-new Transport on every call, so new connections are opened continuously instead of being reused.

Let's look at the Transport data structure:

type Transport struct {
    idleMu     sync.Mutex
    wantIdle   bool // user has requested to close all idle conns
    idleConn   map[connectMethodKey][]*persistConn
    idleConnCh map[connectMethodKey]chan *persistConn

    reqMu       sync.Mutex
    reqCanceler map[*Request]func()

    altMu    sync.RWMutex
    altProto map[string]RoundTripper // nil or map of URI scheme => RoundTripper

    // Dial obtains a TCP connection, i.e. a net.Conn
    Dial func(network, addr string) (net.Conn, error)
}

The two maps in this structure key the pooled connections by connection parameters (the connectMethodKey covers the scheme, the target host, and the proxy), mapping each combination to its idle persistent connections. Clearly this structure should be global, exactly like the client. So, to keep the connection pool from being bypassed, stop creating a new Transport every time!
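To make that concrete, here is a minimal sketch in which the Transport is created once, alongside the client, so its idle-connection pool is actually reused. Purely for illustration it assumes a single fixed proxy address; the address and the MaxIdleConnsPerHost value are assumptions, not taken from our real code:

package main

import (
    "net/http"
    "net/url"
    "time"
)

var client *http.Client

func init() {
    // illustrative fixed proxy address
    proxyUrl, _ := url.Parse("http://127.0.0.1:8080")
    client = &http.Client{
        Timeout: 10 * time.Second,
        Transport: &http.Transport{
            Proxy:               http.ProxyURL(proxyUrl),
            MaxIdleConnsPerHost: 100, // keep enough idle connections for 100 workers
        },
    }
}

func main() {
    resp, err := client.Get("http://example.com/") // all requests now share one pooled Transport
    if err == nil {
        resp.Body.Close()
    }
}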

The reason we kept creating new Transports was to set the proxy for each request. Since we cannot keep doing that, how do we achieve the same goal? If you understand how the proxy works, the fix is actually simple: send the request directly to the proxy's IP and carry the original domain name in the Host header. The code is as follows:

var client *http.Client

func init() {
    client = &http.Client{}
}

type HttpClient struct {
}

func NewHttpClient() *HttpClient {
    httpClient := HttpClient{}
    return &httpClient
}

// ReplaceUrl swaps the domain in srcUrl for the proxy IP, keeping the path
func (this *HttpClient) ReplaceUrl(srcUrl string, ip string) string {
    httpPrefix := "http://"
    parsedUrl, err := url.Parse(srcUrl)
    if err != nil {
        return ""
    }
    return httpPrefix + ip + parsedUrl.Path
}

// downLoadFile drains the response body by writing it to /dev/null
func (this *HttpClient) downLoadFile(resp *http.Response) error {
    // err: write /dev/null: bad file descriptor
    out, err := os.OpenFile("/dev/null", os.O_RDWR|os.O_CREATE|os.O_APPEND, 0666)
    defer out.Close()
    _, err = io.Copy(out, resp.Body)
    return err
}

func (this *HttpClient) Fetch(dstUrl string, method string, proxyHost string, header map[string]string, preload bool, timeout int64) (*http.Response, error) {
    // proxyHost: rewrite the URL in the request to point at the proxy IP
    newUrl := this.ReplaceUrl(dstUrl, proxyHost)
    req, _ := http.NewRequest(method, newUrl, nil)
    for k, v := range header {
        req.Header.Add(k, v)
    }
    client.Timeout = time.Duration(timeout) * time.Second
    resp, err := client.Do(req)
    return resp, err
}

With the domain name carried in the Host header, the number of established TCP connections immediately dropped to roughly the size of the worker pool, and the problem was resolved.
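As an aside, if genuinely different proxies per request are required, net/http also supports that without creating a new Transport each time: the Transport's Proxy field is a function of the request, so one shared Transport can pick a proxy per request, for example from the request's context. A rough sketch of that alternative, which we did not use above (the context key and the fetchVia helper are made up for illustration):

package main

import (
    "context"
    "net/http"
    "net/url"
)

// proxyKey is a context key used to attach a per-request proxy URL.
type proxyKey struct{}

var sharedTransport = &http.Transport{
    Proxy: func(req *http.Request) (*url.URL, error) {
        if p, ok := req.Context().Value(proxyKey{}).(*url.URL); ok {
            return p, nil // use the proxy attached to this request
        }
        return nil, nil // no proxy for this request
    },
}

var sharedClient = &http.Client{Transport: sharedTransport}

// fetchVia sends a GET through the given proxy while reusing the shared Transport.
func fetchVia(dstUrl string, proxy *url.URL) (*http.Response, error) {
    req, err := http.NewRequest("GET", dstUrl, nil)
    if err != nil {
        return nil, err
    }
    req = req.WithContext(context.WithValue(req.Context(), proxyKey{}, proxy))
    return sharedClient.Do(req)
}

func main() {
    proxy, _ := url.Parse("http://127.0.0.1:8080") // illustrative proxy address
    if resp, err := fetchVia("http://example.com/", proxy); err == nil {
        resp.Body.Close()
    }
}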
