Writing an HTTP request tool: a worked example

Source: Internet
Author: User

In HTTP, the client sends a request to the server. After receiving the request, the server generates a response and returns it to the client.
The HTTP protocol defines this communication in the following four respects:
1. Request and response formats
Request format:

HTTP request line
(Request) Header
Empty line
Optional Message Body

Note: The request line and each header line must end with <CR><LF> (a carriage return followed by a line feed). The empty line must contain only <CR><LF>, with no other whitespace. In HTTP/1.1, every request header except Host is optional.

Example:

GET / HTTP/1.1
Host: gpcuster.cnblogs.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
If-Modified-Since: Mon, 25 May 2009 03:19:18 GMT
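
As a minimal sketch of the request layout described above, the helper below assembles a request message from its parts. The name buildRequest and its parameters are assumptions made for illustration; they are not part of the code discussed later in this article.

```cpp
#include <map>
#include <string>

// Hypothetical helper: assemble an HTTP/1.1 request message.
// Every line ends with CRLF, and an empty CRLF line separates
// the headers from the optional body.
std::string buildRequest(const std::string& method,
                         const std::string& path,
                         const std::map<std::string, std::string>& headers,
                         const std::string& body = "")
{
    std::string msg = method + " " + path + " HTTP/1.1\r\n";  // request line
    for (const auto& h : headers)
        msg += h.first + ": " + h.second + "\r\n";            // one header per line
    msg += "\r\n";                                            // empty line
    msg += body;                                              // optional message body
    return msg;
}
```

Calling buildRequest("GET", "/", {{"Host", "gpcuster.cnblogs.com"}}) yields the request line, the Host header, and the empty line, each terminated by CRLF.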

Response format:

HTTP status line
Response Header
Empty line
Optional Message Body

Example:

HTTP/1.1 200 OK
Cache-Control: private, max-age=30
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Expires: Mon, 25 May 2009 03:20:33 GMT
Last-Modified: Mon, 25 May 2009 03:20:03 GMT
Vary: Accept-Encoding
Server: Microsoft-IIS/7.0
X-AspNet-Version: 2.0.50727
X-Powered-By: ASP.NET
Date: Mon, 25 May 2009 03:20:02 GMT
Content-Length: 12173

For more information about the message body, see RFC 2616. For a brief introduction to HTTP headers, see quick reference to HTTP headers.
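
To make the header layout concrete, here is a small sketch that splits a block of header lines into name/value pairs. The function parseHeaderLines is a hypothetical name; the sketch assumes the input starts just after the status line and, for simplicity, treats header names as case-sensitive, which a real client should not do.

```cpp
#include <map>
#include <string>

// Hypothetical helper: parse "Name: value" lines (each ending in CRLF)
// into a map, stopping at the empty line that ends the header block.
std::map<std::string, std::string> parseHeaderLines(const std::string& text)
{
    std::map<std::string, std::string> headers;
    std::string::size_type start = 0;
    while (start < text.size()) {
        std::string::size_type end = text.find("\r\n", start);
        if (end == std::string::npos || end == start)
            break;                               // incomplete line or empty line
        std::string line = text.substr(start, end - start);
        std::string::size_type colon = line.find(':');
        if (colon != std::string::npos) {
            // Skip the whitespace that follows the colon.
            std::string::size_type v = line.find_first_not_of(" \t", colon + 1);
            headers[line.substr(0, colon)] =
                (v == std::string::npos) ? "" : line.substr(v);
        }
        start = end + 2;                         // move past the CRLF
    }
    return headers;
}
```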

2. Connection Establishment Method

HTTP supports two ways of establishing connections: non-persistent connections and persistent connections (HTTP/1.1 uses persistent connections by default).

1) non-persistent connection

Let's walk through the steps for transferring a web page from the server to the client over non-persistent connections. Assume that the page consists of a base HTML file and 10 JPEG images, all stored on the same server host, and that the URL of the base HTML file is gpcuster.cnblogs.com/somepath/index.html.

The steps are as follows:

1. The HTTP client initiates a TCP connection to the HTTP server at gpcuster.cnblogs.com. The HTTP server listens on the default port 80 for connection requests from HTTP clients.

2. The HTTP client sends an HTTP request message through the local socket associated with the TCP connection. The message contains the path name /somepath/index.html.

3. The HTTP server receives the request message through its local socket associated with the TCP connection, retrieves the object /somepath/index.html from memory or disk on the server host, and sends a response message containing the object through the same socket.

4. The HTTP server tells TCP to close the TCP connection (TCP actually terminates the connection only after the client has received the response message).

5. The HTTP client receives the response message through the same socket, and the TCP connection is then terminated. The message indicates that the encapsulated object is an HTML file. The client extracts the file, parses it, and finds references to 10 JPEG objects.

6. Repeat steps 1-4 for each referenced JPEG object.

These steps are said to use non-persistent connections because the server closes the TCP connection after sending each object; no connection persists long enough to carry another object. Each TCP connection carries exactly one request message and one response message. In the example above, every time a user requests the web page, 11 TCP connections are created.

2) persistent connection

Non-persistent connections have several disadvantages. First, the client must establish and maintain a new connection for every requested object. For each connection, TCP must allocate buffers and maintain TCP state variables on both the client and the server. This places a heavy burden on a web server that may be serving requests from hundreds of different clients at the same time. Second, each object incurs a response delay of two RTTs: one RTT to establish the TCP connection and one RTT to request and receive the object. Finally, each object suffers from TCP slow start, because every TCP connection begins in the slow-start phase. Using parallel TCP connections can partly offset the RTT and slow-start delays.

With persistent connections, the server leaves the TCP connection open after sending a response. Subsequent requests and responses between the same client and server can be sent over that connection. The entire web page (in the example above, a base HTML file and 10 images) can be sent over a single persistent TCP connection; even multiple web pages stored on the same server can be sent over it. Typically, the HTTP server closes a connection after it has been idle for a certain period, and that period is usually configurable.

Persistent connections come in two versions: without pipelining and with pipelining. Without pipelining, the client sends a new request only after receiving the response to the previous one. In this case, each object referenced by the web page (the 10 images in the example) incurs one RTT of delay to request and receive the object. This improves on the two RTTs per object of non-persistent connections, but pipelining can reduce the response delay further. Another disadvantage of the non-pipelined version is that after the server sends an object it must wait for the next request, which cannot arrive immediately; during that time the connection sits idle and server resources are wasted.

HTTP/1.1 makes persistent connections the default and also allows pipelining. With pipelining, the HTTP client sends a request as soon as it encounters a reference, so it can send the requests for the referenced objects back to back. The server, on receiving these requests, sends the objects one after another. If all requests and all responses are sent back to back, all referenced objects together incur only one RTT of delay (rather than one RTT per referenced object, as in the non-pipelined version). In addition, a pipelined persistent connection spends less time idle while the server waits for requests. Compared with non-persistent connections, persistent connections (with or without pipelining) save one RTT of response delay per object and also suffer less slow-start delay. The reason is that, since every object uses the same TCP connection, the server does not have to send subsequent objects at the initial slow rate after sending the first one; it can start sending the next object at the rate reached while sending the first.

In HTTP/1.0, a new TCP connection is created for each request/response pair. From HTTP/1.1 on, the connection used for the first request can be reused, and persistent connections are supported by default. If the client or server does not support persistent connections, it adds Connection: close to the HTTP header. If it does, it sets the header to Connection: keep-alive.
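
The delay comparison above can be put into numbers with a deliberately rough model. The three functions below are hypothetical and rest on simplifying assumptions: transmission time is ignored, no parallel connections are used, and the page is one base HTML file plus a given number of referenced objects.

```cpp
// Rough model of page load time, measured in RTTs.
// `referenced` is the number of objects the base HTML file refers to.

// Non-persistent: every object costs one RTT to connect and one to fetch.
int rttNonPersistent(int referenced)
{
    return 2 * (1 + referenced);
}

// Persistent without pipelining: one RTT to connect, then one per object.
int rttPersistentNoPipelining(int referenced)
{
    return 1 + (1 + referenced);
}

// Persistent with pipelining: one RTT to connect, one for the base file,
// and one more shared by all referenced objects.
int rttPersistentPipelining(int referenced)
{
    return 1 + 1 + (referenced > 0 ? 1 : 0);
}
```

For the example page with 10 images, this model gives 22, 12, and 3 RTTs respectively.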

The sections above briefly describe how HTTP requests work. With that background, we can write a simple HTTP client, which must take care of the following:

1) Short (non-persistent) or long (persistent) connections.
2) Parsing and constructing headers.
3) Parsing and constructing the body.
4) The different ways of parsing chunked and Content-Length bodies.
5) The different HTTP methods.
....

The following code generates the request from an instantiated TC_HttpRequest and supports the POST, GET, and OPTIONS methods.

string TC_HttpRequest::encode()
{
//    assert(_requestType == REQUEST_GET || _requestType == REQUEST_POST || !_originRequest.empty());

    ostringstream os;

    if(_requestType == REQUEST_GET)
    {
        encode(REQUEST_GET, os);
    }
    else if(_requestType == REQUEST_POST)
    {
        setContentLength(_content.length());
        encode(REQUEST_POST, os);
        os << _content;
    }
    else if(_requestType == REQUEST_OPTIONS)
    {
        encode(REQUEST_OPTIONS, os);
    }

    return os.str();
}

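The headers and the body are separated by an empty line, so two \r\n sequences appear in a row: one ending the last header and one for the empty line.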

void TC_HttpRequest::encode(int iRequestType, ostream &os)
{
    os << requestType2str(iRequestType) << " " << _httpURL.getRequest() << " HTTP/1.1\r\n";
    os << genHeader();
    os << "\r\n";
}

To emit the headers, iterate over all header keys and terminate each header line with \r\n.

string TC_Http::genHeader() const
{
    ostringstream sHttpHeader;

    for(http_header_type::const_iterator it = _headers.begin(); it != _headers.end(); ++it)
    {
        if(it->second != "")
        {
            sHttpHeader << it->first << ": " << it->second << "\r\n";
        }
    }

    return sHttpHeader.str();
}

The following function performs the request and parses the response. It sends the constructed HTTP request to the HTTP server's port over a TCP socket, then receives the returned data into a buffer in a loop until the client has the complete response or the server closes the connection.

int TC_HttpRequest::doRequest(TC_HttpResponse &stHttpRsp, int iTimeout)
{
    // only the short-connection mode is supported
    setConnection("close");

    string sSendBuffer = encode();

    string sHost;
    uint32_t iPort;

    getHostPort(sHost, iPort);

    TC_TCPClient tcpClient;
    tcpClient.init(sHost, iPort, iTimeout);

    int iRet = tcpClient.send(sSendBuffer.c_str(), sSendBuffer.length());
    if(iRet != TC_ClientSocket::EM_SUCCESS)
    {
        return iRet;
    }

    stHttpRsp.reset();

    string sBuffer;

    char *sTmpBuffer = new char[10240];
    size_t iRecvLen  = 10240;

    while(true)
    {
        iRecvLen = 10240;

        iRet = tcpClient.recv(sTmpBuffer, iRecvLen);

        if(iRet == TC_ClientSocket::EM_SUCCESS)
            sBuffer.append(sTmpBuffer, iRecvLen);

        switch(iRet)
        {
        case TC_ClientSocket::EM_SUCCESS:
            if(stHttpRsp.incrementDecode(sBuffer))
            {
                delete []sTmpBuffer;
                return TC_ClientSocket::EM_SUCCESS;
            }
            continue;
        case TC_ClientSocket::EM_CLOSE:
            delete []sTmpBuffer;
            stHttpRsp.incrementDecode(sBuffer);
            return TC_ClientSocket::EM_SUCCESS;
        default:
            delete []sTmpBuffer;
            return iRet;
        }
    }

    assert(true);

    return 0;
}

Receiving happens in two stages. First the header is received and parsed; the header then determines how the body is received and parsed. If the parser returns false, the HTTP response is not yet complete and the loop keeps receiving.

        case TC_ClientSocket::EM_SUCCESS:
            if(stHttpRsp.incrementDecode(sBuffer))
            {
                delete []sTmpBuffer;
                return TC_ClientSocket::EM_SUCCESS;
            }
            continue;
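
The shape of this receive loop can be tried out without a network. The sketch below is an illustration only: FakeSocket is a made-up stand-in that hands out pre-recorded pieces of a response, and receiveUntilComplete is a hypothetical function that accumulates data until a caller-supplied predicate reports a complete response or the peer closes.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a socket: hands out pre-recorded pieces, then reports "closed".
// Purely illustrative; a real client would call recv() on a TCP socket.
class FakeSocket {
public:
    explicit FakeSocket(std::vector<std::string> pieces) : pieces_(std::move(pieces)) {}

    // Returns false once the peer has "closed" the connection.
    bool recv(std::string& out) {
        if (next_ >= pieces_.size()) return false;
        out = pieces_[next_++];
        return true;
    }

private:
    std::vector<std::string> pieces_;
    std::size_t next_ = 0;
};

// Append received data to `buffer` until `isComplete(buffer)` is true
// or the peer closes. Returns the number of reads that delivered data.
int receiveUntilComplete(FakeSocket& sock, std::string& buffer,
                         const std::function<bool(const std::string&)>& isComplete)
{
    int reads = 0;
    std::string piece;
    while (sock.recv(piece)) {
        buffer += piece;
        ++reads;
        if (isComplete(buffer)) break;   // the parser says the response is whole
    }
    return reads;
}
```

The predicate plays the role that incrementDecode plays in the original loop: it is called after every successful read and decides whether to stop.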

Each time data is received successfully, the accumulated buffer is handed to the response object for parsing:

bool TC_HttpResponse::incrementDecode(string &sBuffer)
{
    // parse the header
    if(_headLength == 0)
    {
        string::size_type pos = sBuffer.find("\r\n\r\n");

        if(pos == string::npos)
        {
            return false;
        }

        parseResponseHeader(sBuffer.c_str());

        if(_status == 204)
        {
            return false;
        }

        http_header_type::const_iterator it = _headers.find("Content-Length");
        if(it != _headers.end())
        {
            _iTmpContentLength = getContentLength();
        }
        else
        {
            // Content-Length not specified: receive until the server closes the connection
            _iTmpContentLength = -1;
        }

        _headLength = pos + 4;

        sBuffer = sBuffer.substr(_headLength);

        // a redirect is treated as success
        if((_status == 301 || _status == 302) && !getHeader("Location").empty())
        {
            return true;
        }

        // is the body chunk-encoded?
        _bIsChunked = (getHeader("Transfer-Encoding") == "chunked");

        // remove Transfer-Encoding from the headers
        eraseHeader("Transfer-Encoding");
    }

    if(_bIsChunked)
    {
        while(true)
        {
            string::size_type pos = sBuffer.find("\r\n");
            if(pos == string::npos)
                return false;

            // find the size of the current chunk
            string sChunkSize = sBuffer.substr(0, pos);
            int iChunkSize    = strtol(sChunkSize.c_str(), NULL, 16);

            if(iChunkSize <= 0) break;      // all chunks have been received

            if(sBuffer.length() >= pos + 2 + (size_t)iChunkSize + 2)   // a complete chunk has arrived
            {
                // take the content of one chunk
                _content += sBuffer.substr(pos + 2, iChunkSize);

                // remove that chunk from the buffer
                sBuffer   = sBuffer.substr(pos + 2 + iChunkSize + 2);
            }
            else
            {
                // the chunk is not complete yet
                return false;
            }

            setContentLength(getContent().length());
        }

        sBuffer = "";

        if(_iTmpContentLength == 0 || _iTmpContentLength == (size_t)-1)
        {
            setContentLength(getContent().length());
        }

        return true;
    }
    else
    {
        if(_iTmpContentLength == 0)
        {
            _content += sBuffer;
            sBuffer   = "";

            // fill in Content-Length automatically
            setContentLength(getContent().length());

            return true;
        }
        else if(_iTmpContentLength == (size_t)-1)
        {
            _content += sBuffer;
            sBuffer   = "";

            // fill in Content-Length automatically
            setContentLength(getContent().length());

            return false;
        }
        else
        {
            // short-connection mode: receive until the body reaches the declared length
            _content += sBuffer;
            sBuffer   = "";

            size_t iNowLength = getContent().length();

            // fewer bytes received than Content-Length declares: keep receiving
            if(_iTmpContentLength > iNowLength)
                return false;

            return true;
        }
    }

    return true;
}

The check if(_headLength == 0) tells whether the header has already been parsed. If it has not, the parser looks for the empty line that marks the end of the header and returns false until it arrives:

        string::size_type pos = sBuffer.find("\r\n\r\n");

        if(pos == string::npos)
        {
            return false;
        }
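
The same idea can be shown in isolation. In the sketch below, splitHeader is a hypothetical helper that waits for the empty line and then separates the header block from whatever body bytes have already arrived.

```cpp
#include <string>

// Hypothetical helper: if `buffer` holds a complete header block, copy it
// (without the terminating empty line) into `header`, leave only the body
// bytes in `buffer`, and return true. Otherwise leave both untouched.
bool splitHeader(std::string& buffer, std::string& header)
{
    const std::string terminator = "\r\n\r\n";
    std::string::size_type pos = buffer.find(terminator);
    if (pos == std::string::npos)
        return false;                             // header not complete yet
    header = buffer.substr(0, pos);
    buffer.erase(0, pos + terminator.size());     // skip the 4 terminator bytes
    return true;
}
```

The 4 skipped bytes correspond to the pos + 4 offset used for _headLength in the original code.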

Once the header has been received in full, it is parsed. A 2xx status code indicates that the request succeeded. An HTTP 204 (No Content) response means the request succeeded but no data is returned, so the browser does not need to refresh the page or navigate to a new one.

        parseResponseHeader(sBuffer.c_str());

        if(_status == 204)
        {
            return false;
        }

void TC_HttpResponse::parseResponseHeader(const char* szBuffer)
{
    const char **ppChar = &szBuffer;

    _headerLine = TC_Common::trim(getLine(ppChar));

    string::size_type pos = _headerLine.find(' ');

    if(pos != string::npos)
    {
        _version    = _headerLine.substr(0, pos);

        string left = TC_Common::trim(_headerLine.substr(pos));

        string::size_type pos1 = left.find(' ');

        if(pos1 != string::npos)
        {
            // the status code ends at pos1, the first space in left
            _status  = TC_Common::strto<int>(left.substr(0, pos1));
            _about   = TC_Common::trim(left.substr(pos1 + 1));
        }
        else
        {
            _status  = TC_Common::strto<int>(left);
            _about   = "";
        }

        parseHeader(*ppChar, _headers);
        return;
    }
    else
    {
        _version = _headerLine;
        _status  = 0;
        _about   = "";
    }

//    throw TC_HttpResponse_Exception("[TC_HttpResponse_Exception::parseResponeHeader] http response format error : " + _headerLine);
}
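
For comparison, here is a standalone sketch of status-line parsing that uses only the standard library. It is not the method used above: StatusLine and parseStatusLine are hypothetical names, and the parsing relies on std::istringstream instead of manual substr calls.

```cpp
#include <sstream>
#include <string>

// Illustrative type, not part of the original code.
struct StatusLine {
    std::string version;      // e.g. "HTTP/1.1"
    int         status = 0;   // e.g. 200; stays 0 if the line is malformed
    std::string reason;       // e.g. "OK"; may be empty
};

StatusLine parseStatusLine(const std::string& line)
{
    StatusLine result;
    std::istringstream in(line);
    in >> result.version;
    if (!(in >> result.status)) {
        result.status = 0;    // no numeric status code present
        return result;
    }
    // The rest of the line, minus leading whitespace, is the reason phrase.
    std::getline(in >> std::ws, result.reason);
    // Drop a trailing CR in case the caller split the input on '\n' only.
    if (!result.reason.empty() && result.reason.back() == '\r')
        result.reason.pop_back();
    return result;
}
```

As in the original function, a line without a valid status code leaves the status at 0 so that the caller can detect the malformed response.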

Next, check the Content-Length of the HTTP response. If the header is present, the length of the body is known. If it is not, the client must keep receiving until the server closes the connection.

        http_header_type::const_iterator it = _headers.find("Content-Length");
        if(it != _headers.end())
        {
            _iTmpContentLength = getContentLength();
        }
        else
        {
            // Content-Length not specified: receive until the server closes the connection
            _iTmpContentLength = -1;
        }

        _headLength = pos + 4;

        // extract the body
        sBuffer = sBuffer.substr(_headLength);

        // a redirect is treated as success
        if((_status == 301 || _status == 302) && !getHeader("Location").empty())
        {
            return true;
        }

        // is the body chunk-encoded?
        _bIsChunked = (getHeader("Transfer-Encoding") == "chunked");

        // remove Transfer-Encoding from the headers
        eraseHeader("Transfer-Encoding");

Next, the data is received in a loop. If the body is not chunked, there are three cases: 0 (there is no body, so receiving can stop), -1 (keep receiving until the server closes the connection), and a definite length (receiving can stop once that many bytes have arrived):

    {
        if(_iTmpContentLength == 0)
        {
            _content += sBuffer;   // save everything received as content
            sBuffer   = "";        // clear the receive buffer

            // fill in Content-Length automatically
            setContentLength(getContent().length());

            return true;
        }
        else if(_iTmpContentLength == (size_t)-1)
        {
            _content += sBuffer;
            sBuffer   = "";

            // fill in Content-Length automatically
            setContentLength(getContent().length());

            return false;
        }
        // there is an explicit Content-Length
        else
        {
            // short-connection mode: receive until the body reaches the declared length
            _content += sBuffer;
            sBuffer   = "";

            size_t iNowLength = getContent().length();

            // fewer bytes received than Content-Length declares: keep receiving
            if(_iTmpContentLength > iNowLength)
                return false;

            // at least Content-Length bytes have arrived: receiving can stop
            return true;
        }
    }
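
The three cases can be condensed into a single decision function. The sketch below is a simplification under stated assumptions: bodyComplete and kUnknownLength are hypothetical names, and the peerClosed flag stands in for the EM_CLOSE result that the receive loop reports.

```cpp
#include <cstddef>

// Sentinel meaning "no Content-Length header was present", mirroring
// the (size_t)-1 value used in the code above.
const std::size_t kUnknownLength = static_cast<std::size_t>(-1);

// Returns true when the body can be considered complete.
// `peerClosed` says whether the server has already closed the connection.
bool bodyComplete(std::size_t declaredLength, std::size_t receivedLength, bool peerClosed)
{
    if (declaredLength == 0)
        return true;                              // no body expected
    if (declaredLength == kUnknownLength)
        return peerClosed;                        // read until the server closes
    return receivedLength >= declaredLength;      // read until enough bytes arrive
}
```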

Chunked transfer encoding is more complex:

        while(true)
        {
            string::size_type pos = sBuffer.find("\r\n");
            if(pos == string::npos)
                return false;

            // find the size of the current chunk
            string sChunkSize = sBuffer.substr(0, pos);
            int iChunkSize    = strtol(sChunkSize.c_str(), NULL, 16);

            if(iChunkSize <= 0) break;      // all chunks have been received

            if(sBuffer.length() >= pos + 2 + (size_t)iChunkSize + 2)   // a complete chunk has arrived
            {
                // take the content of one chunk
                _content += sBuffer.substr(pos + 2, iChunkSize);

                // remove that chunk from the buffer
                sBuffer   = sBuffer.substr(pos + 2 + iChunkSize + 2);
            }
            else
            {
                // the chunk is not complete yet
                return false;
            }

            setContentLength(getContent().length());
        }

        sBuffer = "";

        if(_iTmpContentLength == 0 || _iTmpContentLength == (size_t)-1)
        {
            setContentLength(getContent().length());
        }

        return true;

Each chunk begins with a line that gives the chunk size in hexadecimal, terminated by \r\n. A chunk size of 0 means there are no more chunks; otherwise a chunk follows. If the current chunk is not yet complete, the parser returns and waits for more data. If the current chunk is complete, its data is appended to the content and the chunk is removed from the buffer.
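
A self-contained version of this loop makes it easy to experiment with partial input. The function decodeChunks below is a hypothetical sketch, not the original method: it assumes well-formed input, treats an unparsable size line as the end of the body (as the code above does), and ignores any trailer headers after the last chunk.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical incremental decoder for chunked bodies.
// Moves every complete chunk from the front of `buffer` into `content`.
// Returns true once the terminating zero-size chunk has been seen,
// false if more data is needed.
bool decodeChunks(std::string& buffer, std::string& content)
{
    while (true) {
        std::string::size_type eol = buffer.find("\r\n");
        if (eol == std::string::npos)
            return false;                              // size line incomplete

        // The size line is hexadecimal; strtol stops at the first
        // character that is not a hex digit.
        long size = std::strtol(buffer.substr(0, eol).c_str(), nullptr, 16);
        if (size <= 0)
            return true;                               // last chunk reached

        std::string::size_type len  = static_cast<std::string::size_type>(size);
        std::string::size_type need = eol + 2 + len + 2;   // size line, data, CRLF
        if (buffer.size() < need)
            return false;                              // chunk data incomplete

        content.append(buffer, eol + 2, len);          // copy the chunk data
        buffer.erase(0, need);                         // drop the consumed chunk
    }
}
```

Feeding the function "4\r\nWiki\r\n5\r\npedi" first and the remaining bytes "a\r\n0\r\n\r\n" afterwards shows the incremental behavior: the first call extracts "Wiki" and asks for more data, and the second call completes the body.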
