HTTP protocol implementation in Golang

Source: Internet
Author: User

While writing a crawler I ran into a socket leak. A quick search on Baidu showed that I had left out Response.Body.Close(), so the connections were never properly closed and were not reclaimed by the GC either. Here is what the documentation says:

Callers should close resp.Body when done reading from it. If resp.Body is not closed, the Client's underlying RoundTripper (typically Transport) may not be able to re-use a persistent TCP connection to the server for a subsequent "keep-alive" request.
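
As a minimal sketch of the pattern the documentation asks for, the helper below reads a response and always closes its body. The names fetch and fetchFromLocalServer are made up for this example, and the httptest server is only a local stand-in for whatever site the crawler visits.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetch issues a GET and returns the body as a string.
// The deferred Close is the call that was missing in the crawler.
func fetch(client *http.Client, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

// fetchFromLocalServer starts a throwaway local server (an assumption of
// this example) and fetches one page from it.
func fetchFromLocalServer() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello")
	}))
	defer srv.Close()
	return fetch(srv.Client(), srv.URL)
}

func main() {
	body, err := fetchFromLocalServer()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```

Putting the Close in a defer right after the error check means it still runs when reading the body fails halfway.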

Fixing the problem was simple, but it made me want to see how a simple HTTP request is actually implemented in the source code.

    • Do function (entry point, including Post and Get)
    • send function
    • Transport.RoundTrip function
    • Transport.altProto
    • Transport.connectMethod
    • Transport.getConn function
    • Transport.getIdleConn function
    • Transport.dialConn function
    • persistConn struct
    • persistConn.roundTrip function
    • Idle connection part of the Transport struct
    • Transport.dial function
    • persistConn.readLoop function

Do function (including Post, Get)

First we build a request with NewRequest; it holds the URL we are requesting and, for a POST, the request body. The doFollowingRedirects function is then called, but to keep things simple we do not expand on it here and look directly at the case without redirects, where the request is simply passed on down through the Client.send function.

send function

Client.send is a wrapper around the send function: it copies the cookies in the client's cookie Jar into the request, and stores the cookies returned in the response back into the client's cookie Jar.
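
The cookie handling can be observed from the outside with a Jar-equipped client. This is only a sketch: the cookie name "session", its value, and the local httptest server are assumptions made for the example. The handler reports which cookie it received on each request.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/cookiejar"
	"net/http/httptest"
)

// cookiesSeen sends two requests with the same client and returns the
// "session" cookie value the server saw on the first and second request.
func cookiesSeen() (first, second string, err error) {
	seen := make(chan string, 2)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		value := "(none)"
		if c, err := r.Cookie("session"); err == nil {
			value = c.Value
		}
		seen <- value
		// Sent back in the response; the client's Jar should store it.
		http.SetCookie(w, &http.Cookie{Name: "session", Value: "abc123", Path: "/"})
	}))
	defer srv.Close()

	jar, err := cookiejar.New(nil)
	if err != nil {
		return "", "", err
	}
	client := &http.Client{Jar: jar}

	for i := 0; i < 2; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return "", "", err
		}
		resp.Body.Close()
	}
	first = <-seen
	second = <-seen
	return first, second, nil
}

func main() {
	first, second, err := cookiesSeen()
	fmt.Println(first, second, err)
}
```

The first request carries no cookie; the second one carries the value that the first response stored in the Jar.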

func send(ireq *Request, rt RoundTripper, deadline time.Time) (*Response, error)

When Client.send calls send, the client's Transport is passed in as the rt parameter; if the client has none, the default DefaultTransport from transport.go is used.

send then does a little work of its own: it checks for an incomplete request and calls setRequestCancel (this only takes effect when a Timeout is set; the timer is stopped once reading the response body ends, and if the request has already been canceled by then, an error is returned).
The RoundTrip function of rt is then called to obtain the response.
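
The effect of the timeout can be seen with a deliberately slow handler. In this sketch the 100 ms value, the blocking handler, and the function name timedOut are all assumptions chosen for the example; the client uses the default transport because no Transport is set.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"time"
)

// timedOut reports whether a request to a handler that never answers in
// time fails with a timeout error once Client.Timeout is set.
func timedOut() bool {
	release := make(chan struct{})
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Block until the test is over or the client has gone away.
		select {
		case <-release:
		case <-r.Context().Done():
		}
	}))
	defer srv.Close()
	defer close(release) // runs before srv.Close so the handler can return

	client := &http.Client{Timeout: 100 * time.Millisecond}
	resp, err := client.Get(srv.URL)
	if err == nil {
		resp.Body.Close()
		return false
	}
	// Client errors are wrapped in *url.Error, which exposes Timeout().
	var uerr *url.Error
	return errors.As(err, &uerr) && uerr.Timeout()
}

func main() {
	fmt.Println("timed out:", timedOut())
}
```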

Transport.RoundTrip function

First it checks that the request is complete and looks in altProto for a RoundTrip implementation registered for the request's scheme. It then enters a for loop: it builds a connectMethod variable, calls Transport.getConn to obtain a TCP connection, and calls persistConn.roundTrip to write the request to that connection, which completes the send. If the send fails, checkTransportResend is called to try to resend the request.

Transport.altProto

At first I did not understand what this was for; finding the RegisterProtocol function later made it clear. Transport is a reusable struct that can in fact handle requests for different protocols, and different protocols need different implementations, for example ftp or file. RegisterProtocol lets us register an implementation for each such protocol, so that before sending a request the Transport can decide which RoundTrip to use.
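
For the file case the standard library already ships a RoundTripper, http.NewFileTransport, so registration can be sketched as follows. The temporary directory, the file name hello.txt and its contents are assumptions made for the example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
)

// readViaFileScheme registers a RoundTripper for the "file" scheme and
// then reads a file from disk through an ordinary http.Client.
func readViaFileScheme() (string, error) {
	dir, err := os.MkdirTemp("", "altproto")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "hello.txt")
	if err := os.WriteFile(path, []byte("hi from disk"), 0o644); err != nil {
		return "", err
	}

	tr := &http.Transport{}
	// Requests whose URL scheme is "file" are now handed to this RoundTripper.
	tr.RegisterProtocol("file", http.NewFileTransport(http.Dir(dir)))
	client := &http.Client{Transport: tr}

	resp, err := client.Get("file:///hello.txt")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	data, err := io.ReadAll(resp.Body)
	return string(data), err
}

func main() {
	fmt.Println(readViaFileScheme())
}
```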

Transport.connectMethod

This struct holds the proxy address, the protocol (HTTP or HTTPS), and the destination address. The connectMethod type deserves attention because it is critical: it is a parameter of many functions, and its key form, connectMethodKey, is the key of several maps in Transport. The two structs hold the same information, but the types of their fields differ (the proxy in connectMethodKey is a string, while the proxy in connectMethod is a *url.URL).
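
Why a separate string-only key type is useful as a map key can be shown with a simplified model. The two types below only imitate the idea; their fields and the key method are assumptions of this sketch and not a copy of the net/http source.

```go
package main

import (
	"fmt"
	"net/url"
)

// connectMethod is a simplified stand-in: it keeps the proxy as a parsed URL.
type connectMethod struct {
	proxyURL     *url.URL // nil means no proxy
	targetScheme string   // "http" or "https"
	targetAddr   string   // host:port
}

// connectMethodKey holds the same information as plain strings, so two
// keys compare equal whenever their contents are equal.
type connectMethodKey struct {
	proxy, scheme, addr string
}

// key flattens a connectMethod into its map-key form.
func (cm connectMethod) key() connectMethodKey {
	proxy := ""
	if cm.proxyURL != nil {
		proxy = cm.proxyURL.String()
	}
	return connectMethodKey{proxy, cm.targetScheme, cm.targetAddr}
}

// compare builds two connectMethods with identical contents but separately
// parsed proxy URLs, and reports whether the structs and their keys match.
func compare() (sameStruct, sameKey bool, err error) {
	p1, err := url.Parse("http://proxy.example:3128")
	if err != nil {
		return false, false, err
	}
	p2, err := url.Parse("http://proxy.example:3128")
	if err != nil {
		return false, false, err
	}
	a := connectMethod{p1, "http", "example.com:80"}
	b := connectMethod{p2, "http", "example.com:80"}
	// a == b compares the *url.URL pointers, not what they point to.
	return a == b, a.key() == b.key(), nil
}

func main() {
	fmt.Println(compare())
}
```

Comparing the structs directly compares pointer identity, so a map keyed by the pointer-carrying struct would treat the two as different entries, while the string key groups them together.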

Transport.getConn function

getIdleConn is called first to look for a usable idle connection; if there is one, it is returned directly. If not, dialConn is started in a new goroutine (with the go keyword, so asynchronously) and sends its result back to getConn over a channel, while getConn waits in a select block. The more complicated part of the whole function is handling the different outcomes. For example, if the request is canceled or times out while the connection has still not come back, the function calls handlePendingDial to deal with the connection once it does arrive: it is either put in the idle queue or closed. Likewise, if an idle connection turns up before the connection we dialed has come back, the idle one is used, and handlePendingDial waits for the dialed connection and puts it into the idle pool.
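
The race between a dial in progress and an idle connection can be modeled with plain channels. This is a toy model under stated assumptions: the conn type, the channel-based idle pool and the function signatures are invented for the sketch, and only the select logic mirrors the description above.

```go
package main

import (
	"errors"
	"fmt"
)

type conn struct{ id string } // hypothetical stand-in for a connection

type dialResult struct {
	c   *conn
	err error
}

// getConn returns an idle connection if one is available, otherwise it
// dials in the background and takes whichever shows up first.
func getConn(idle chan *conn, dial func() (*conn, error), cancel <-chan struct{}) (*conn, error) {
	select {
	case c := <-idle: // fast path: an idle connection already exists
		return c, nil
	default:
	}

	dialc := make(chan dialResult, 1) // buffered: the dialer never blocks
	go func() {
		c, err := dial()
		dialc <- dialResult{c, err}
	}()

	// handlePendingDial parks a late dial result in the idle pool.
	handlePendingDial := func() {
		go func() {
			if r := <-dialc; r.err == nil {
				select {
				case idle <- r.c:
				default: // pool full: a real transport would close the conn
				}
			}
		}()
	}

	select {
	case r := <-dialc:
		return r.c, r.err
	case c := <-idle:
		handlePendingDial()
		return c, nil
	case <-cancel:
		handlePendingDial()
		return nil, errors.New("request canceled while waiting for connection")
	}
}

// idleWinsRace makes an idle connection appear while the dial is still
// blocked, then lets the dial finish and reads the parked connection.
func idleWinsRace() (used, parked string) {
	idle := make(chan *conn, 1)
	started := make(chan struct{})
	release := make(chan struct{})
	dial := func() (*conn, error) {
		close(started)
		<-release
		return &conn{id: "dialed"}, nil
	}
	go func() {
		<-started
		idle <- &conn{id: "idle"}
	}()

	c, _ := getConn(idle, dial, nil)
	close(release)
	p := <-idle
	return c.id, p.id
}

func main() {
	fmt.Println(idleWinsRace())
}
```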

Transport.getIdleConn function

Transport keeps two maps for idle connections. This function searches idleConn: if there are several connections for the key, one of them is taken off the slice and returned; if there are none, it returns nil.

Transport.dialConn function

A variable of type persistConn is created first, then the scheme is checked. If it is https (TLS), the conn is created through the DialTLS function when one is set, and a proxy gets additional handling; we do not explain those paths here. For plain HTTP, Transport.dial is used to obtain the conn. Looking only at the plain HTTP path, more than 80 lines inside the function are skipped entirely. The read and write buffers of the persistConn are then created and stored in the struct, and the persistConn's read and write functions (readLoop and writeLoop) are started asynchronously.

persistConn struct

The comments in the source are already very thorough, so I will simply carry them over.

    // persistConn wraps a connection, usually a persistent one
    // (but could be used for non-keep-alive requests as well)
    type persistConn struct {
        // alt optionally specifies the TLS NextProto RoundTripper.
        // This is used for HTTP/2 today and future protocol laters.
        // If it's non-nil, the rest of the fields are unused.
        alt RoundTripper

        t        *Transport
        cacheKey connectMethodKey
        conn     net.Conn
        tlsState *tls.ConnectionState
        br       *bufio.Reader       // from conn
        sawEOF   bool                // whether we've seen EOF from conn; owned by readLoop
        bw       *bufio.Writer       // to conn
        reqch    chan requestAndChan // written by roundTrip; read by readLoop
        writech  chan writeRequest   // written by roundTrip; read by writeLoop
        closech  chan struct{}       // closed when conn closed
        isProxy  bool
        // writeErrCh passes the request write error (usually nil)
        // from the writeLoop goroutine to the readLoop which passes
        // it off to the res.Body reader, which then uses it to decide
        // whether or not a connection can be reused. Issue 7569.
        writeErrCh chan error

        lk                   sync.Mutex // guards following fields
        numExpectedResponses int
        closed               error // set non-nil when conn is closed, before closech is closed
        broken               bool  // an error has happened on this connection; marked broken so it's not reused.
        canceled             bool  // whether this conn was broken due a CancelRequest
        reused               bool  // whether conn has had successful request/response and is being reused.
        // mutateHeaderFunc is an optional func to modify extra
        // headers on each outbound request before it's written. (the
        // original Request given to RoundTrip is not modified)
        mutateHeaderFunc func(Header)
    }

persistConn.roundTrip function

replaceReqCanceler is called first to check whether the request has already been canceled; if so, the persistConn is handed to putOrCloseIdleConn.
Go adds some default headers when it sends an HTTP request, and the request implements an extraHeaders method for this. In other words, this is the step where the HTTP headers are actually completed, including Accept-Encoding (gzip), Range, and Connection (close). The request is then written to writech.
As already mentioned in the persistConn struct, the receiver of writech is writeLoop: it writes the request to the buffer, calls Flush, and returns any err through a channel. Next, roundTrip writes a requestAndChan to reqch, whose receiver is readLoop. The function then blocks in a select on several channels, listening for write errors, a server timeout, the connection being closed (or the request canceled), and the response sent back by readLoop. Once a response comes back, the result is checked and, if there is no problem, the response is returned.
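
The interplay of roundTrip, writeLoop and readLoop can be imitated with a line-based toy protocol. Everything here is a simplified assumption: the struct reuses the field names from above but carries strings instead of HTTP messages, and net.Pipe with an echo goroutine stands in for the server.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"net"
	"strings"
)

type writeRequest struct { // what roundTrip hands to writeLoop
	line string
	errc chan error
}
type requestAndChan struct{ respc chan string } // what roundTrip hands to readLoop

type persistConn struct {
	br      *bufio.Reader
	bw      *bufio.Writer
	reqch   chan requestAndChan
	writech chan writeRequest
	closech chan struct{}
}

// newPersistConn wraps c in buffers and starts both loops, as dialConn does.
func newPersistConn(c net.Conn) *persistConn {
	pc := &persistConn{br: bufio.NewReader(c), bw: bufio.NewWriter(c),
		reqch: make(chan requestAndChan, 1), writech: make(chan writeRequest, 1),
		closech: make(chan struct{})}
	go pc.readLoop()
	go pc.writeLoop()
	return pc
}

func (pc *persistConn) writeLoop() {
	for {
		select {
		case wr := <-pc.writech:
			_, err := pc.bw.WriteString(wr.line + "\n")
			if err == nil {
				err = pc.bw.Flush()
			}
			wr.errc <- err // report the write result back to roundTrip
		case <-pc.closech:
			return
		}
	}
}

func (pc *persistConn) readLoop() {
	defer close(pc.closech) // tells everyone the connection is gone
	for {
		if _, err := pc.br.Peek(1); err != nil { // blocks until data or error
			return
		}
		rc := <-pc.reqch // the request this response belongs to
		line, err := pc.br.ReadString('\n')
		if err != nil {
			return
		}
		rc.respc <- strings.TrimSpace(line)
	}
}

func (pc *persistConn) roundTrip(line string) (string, error) {
	errc := make(chan error, 1)
	pc.writech <- writeRequest{line, errc}
	respc := make(chan string, 1)
	pc.reqch <- requestAndChan{respc}
	for {
		select {
		case err := <-errc:
			if err != nil {
				return "", err
			}
			errc = nil // write succeeded; keep waiting for the response
		case resp := <-respc:
			return resp, nil
		case <-pc.closech:
			return "", errors.New("connection closed")
		}
	}
}

// echoOnce sends one line to an in-memory echo server and returns the reply.
func echoOnce() (string, error) {
	client, server := net.Pipe()
	defer client.Close()
	go func() {
		defer server.Close()
		r := bufio.NewReader(server)
		for {
			line, err := r.ReadString('\n')
			if err != nil {
				return
			}
			fmt.Fprintf(server, "echo: %s", line)
		}
	}()
	return newPersistConn(client).roundTrip("GET /")
}

func main() { fmt.Println(echoOnce()) }
```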

Idle connection part of the Transport struct

idleConn   map[connectMethodKey][]*persistConn
idleConnCh map[connectMethodKey]chan *persistConn

The first map, idleConn, indexes a slice of persistConn by connectMethodKey. As you would expect, if we set the maximum number of idle connections per host to 5, the slice we get for a given key holds at most 5 idle connections.
idleConnCh indexes a channel that carries a persistConn; such a channel is created whenever someone is waiting for a connection. When tryPutIdleConn is called, it first tries to send the idle connection it was given into that channel. If the send succeeds, it returns; otherwise the entry is deleted from idleConnCh and the connection is put into idleConn.

Transport.dial function

The dial function calls the Dial func(network, addr string) (net.Conn, error) field of the Transport struct. If you have not set this field, net.Dial is used by default. This is where the underlying network call is made.

persistConn.readLoop function

First, a close function is registered with defer; it closes the conn and closes closech in the persistConn to signal that the conn has been closed. The function then enters a loop. Peek(1) is used first to detect whether an I/O error has occurred. It then reads a requestAndChan value from the persistConn.reqch channel; this value is used to match the request and carries several channels for communication. Next, persistConn.readResponse() is called to read the response. After some fault-tolerance checks and wrapping of the response body, the response is sent back over the channel, and the loop finally blocks in a select until the persistConn is closed, the request is canceled, or the body is closed; that event decides whether the loop exits or continues. This is where the problem caused by not calling Response.Body.Close() originally came from.

Implementation of persistConn.readResponse;
Implementation of ReadResponse;

Summary

This was the first time I went to the source code to solve a problem, and the problem was resolved quickly. It shows that most problems are already explained in the source and its comments. To be honest, I found much of it hard to follow, and having written all this I realize it is not particularly reader-friendly; it is more of a simplified translation of the source. My level is limited, so mistakes are inevitable. I hope that any expert who reads this will point them out, and questions and discussion are welcome too.
