http://www.cnblogs.com/linjiqin/p/3560152.html (reprint)
7 steps to a complete HTTP request
A complete HTTP exchange between a Web browser and a Web server goes through the following 7 steps:
1. Establishing a TCP connection
Before any HTTP work begins, the Web browser first establishes a connection to the Web server over the network. This connection is made with TCP, which together with IP forms the basis of the Internet; the two are known as the TCP/IP protocol family, which is why the Internet is also called a TCP/IP network. HTTP is an application-layer protocol that sits above TCP. By the layering rules, a higher-layer protocol can only run once the lower-layer connection exists, so the TCP connection must be established first. The default TCP port for HTTP is 80.
2. Web browser sends request command to Web server
Once the TCP connection is established, the Web browser sends a request command to the Web server. For example: GET /sample/hello.jsp HTTP/1.1.
3. Web browser sends request header information
After the browser sends its request command, it sends further information to the Web server in the form of headers, and then sends a blank line to tell the server that it has finished sending the headers.
4. The Web server responds
After the client sends its request, the server sends a response back to the client. The first part of the response is the protocol version and the response status code, for example HTTP/1.1 200 OK.
5. The Web server sends the response header information
Just as the client sends information about itself along with the request, the server sends information about itself and about the requested document along with the response.
6. The Web server sends data to the browser
After the Web server has sent the header information to the browser, it sends a blank line to mark the end of the headers, and then sends the actual data requested by the user in the format described by the Content-Type response header.
7. The Web server closes the TCP connection
In general, once the Web server has sent the requested data to the browser, it closes the TCP connection. However, if the browser or the server includes this line in its headers:
Connection: keep-alive
the TCP connection stays open after the response is sent, so the browser can keep sending requests over the same connection. Reusing a connection saves the time needed to set up a new one for each request and also saves network bandwidth. (In HTTP/1.1, connections are persistent by default.)
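The seven steps above can be sketched with a raw socket. This is only a minimal illustration, not how a real browser is implemented: `HelloHandler`, `fetch`, and `demo` are hypothetical names, and the server is a throwaway local one from Python's standard library so that the sketch does not depend on any real site.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny stand-in server (an assumption for this demo only)."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)                      # step 4: status line
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()                           # step 5 ends with a blank line
        self.wfile.write(body)                       # step 6: the data itself

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(host, port, path):
    # Step 1: establish the TCP connection.
    with socket.create_connection((host, port), timeout=5) as sock:
        # Step 2: request command. Step 3: headers, ended by a blank line.
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        # Steps 4-6: read everything until the server closes the
        # connection (step 7, since we asked for Connection: close).
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    return status_line, header_lines, body


def demo():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_port, "/sample/hello.jsp")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the status line, the list of response header lines, and the body bytes. Sending `Connection: keep-alive` instead would leave the socket open for further requests, at the cost of having to use Content-Length to know where each response ends.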
TCP establishes a reliable connection through a three-way handshake, which this article does not cover in detail. The focus here is on what data the client sends in its packets when it initiates an HTTP request, and what the Web server sends back once it has received the request.
The packets below were captured with the Fiddler tool while a mobile app talked to a Web server.
The HTTP request goes to http://xg.mediportal.com.cn/health/sms/verify/telephone
When the browser makes a request to the Web server, it passes the server a block of data, which is the request message.
An HTTP request message consists of 3 parts:
1. Request method (GET/POST), URI, protocol/version
2. Request header
3. Request body
Take the following request as an example:
POST http://xg.mediportal.com.cn/health/sms/verify/telephone HTTP/1.1
User-Agent: dgrouppatient/1.052701.230/dalvik/2.1.0 (Linux; U; Android 5.1.1; KIW-AL10 Build/HONORKIW-AL10)
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Host: xg.mediportal.com.cn
Connection: keep-alive
Accept-Encoding: gzip
Content-Length: 33

telephone=15527177736&usertype=1&
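To make the three parts concrete, the captured request can be split programmatically. The following is a minimal sketch, assuming the message uses CRLF line endings as on the wire; `parse_request` is a hypothetical helper and the User-Agent header is left out for brevity.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP request into request line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the blank line ends the headers
    request_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    method, uri, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers, body


RAW_REQUEST = (
    b"POST http://xg.mediportal.com.cn/health/sms/verify/telephone HTTP/1.1\r\n"
    b"Content-Type: application/x-www-form-urlencoded; charset=utf-8\r\n"
    b"Host: xg.mediportal.com.cn\r\n"
    b"Connection: keep-alive\r\n"
    b"Accept-Encoding: gzip\r\n"
    b"Content-Length: 33\r\n"
    b"\r\n"
    b"telephone=15527177736&usertype=1&"
)

method, uri, version, headers, body = parse_request(RAW_REQUEST)
print(method, version, headers["host"], len(body))
```

Note that the Content-Length value matches the number of bytes in the body, including the trailing "&".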
(1) Request method, URI, protocol/version
The first line of the request has the form "method URI protocol/version":
POST http://xg.mediportal.com.cn/health/sms/verify/telephone HTTP/1.1
In the line above, "POST" is the request method, "http://xg.mediportal.com.cn/health/sms/verify/telephone" is the URI, and "HTTP/1.1" is the protocol and its version.
HTTP requests can use several request methods, as defined by the HTTP standard. For example, HTTP/1.1 supports the following 7 request methods: GET, POST, HEAD, OPTIONS, PUT, DELETE, and TRACE. (The standard also defines CONNECT, which is used for tunneling through proxies.)
GET | Requests the resource identified by the Request-URI
POST | Submits new data to the resource identified by the Request-URI
HEAD | Requests only the response headers for the resource identified by the Request-URI
OPTIONS | Queries the capabilities of the server, or the options and requirements associated with a resource
PUT | Asks the server to store a resource and use the Request-URI as its identifier
DELETE | Asks the server to delete the resource identified by the Request-URI
TRACE | Asks the server to echo back the request it received; mainly used for testing or diagnostics
In Internet applications, the most commonly used methods are GET and POST. Finally, the protocol version states which version of HTTP is used for the exchange.
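As a small illustration of how the method ends up in a request, Python's standard `urllib.request.Request` picks GET or POST depending on whether a body is supplied, and accepts an explicit `method` argument for the others. The URL below is a placeholder chosen for this sketch, and no request is actually sent.

```python
from urllib.request import Request

URL = "http://example.com/sample/hello.jsp"  # placeholder, not from the capture

get_req = Request(URL)                                   # no body: GET
post_req = Request(URL, data=b"telephone=15527177736")   # body present: POST
head_req = Request(URL, method="HEAD")                   # explicit method

print(get_req.get_method(), post_req.get_method(), head_req.get_method())
```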
(2) Request header
The request header carries useful information about the client environment and the request body. For example, it can declare the language the browser uses, the length of the request body, and so on.
User-Agent: dgrouppatient/1.052701.230/dalvik/2.1.0 (Linux; U; Android 5.1.1; KIW-AL10 Build/HONORKIW-AL10) // client environment from which the user sent the request
Content-Type: application/x-www-form-urlencoded; charset=utf-8 // default format of submitted form data
Host: xg.mediportal.com.cn // Internet host and port number of the requested resource
Connection: keep-alive // persistent connection
Accept-Encoding: gzip // content encodings the client can decode
Content-Length: 33 // length of the request body
Content-Type | The MIME type of the message body that follows, written as Content-Type: [type]/[subtype]; parameter. It is also a very important part of the returned message: the most common value there is text/html, which means the returned content is text in HTML format. In principle, the browser decides how to display the returned message body according to Content-Type.
Host | Specifies the Internet host and port number of the requested resource, taken from the original URL; it must represent the location of the origin server or gateway named in the requested URL. An HTTP/1.1 request must include the Host header field, otherwise the server returns a 400 status code.
Accept | MIME types the browser can accept
Accept-Charset | Character sets the browser can accept
Accept-Encoding | Content encodings the browser can decode, such as gzip. A servlet can return a gzip-encoded HTML page to a browser that supports gzip. In many cases this can reduce download time by a factor of 5 to 10.
Accept-Language | The language the browser prefers, used when the server can provide more than one language version
Authorization | Authorization credentials, usually sent in answer to a WWW-Authenticate header from the server
Connection | Indicates whether a persistent connection is required. If the servlet sees the value "keep-alive" here, or sees that the request uses HTTP/1.1 (where connections are persistent by default), it can take advantage of the persistent connection, which significantly reduces download time when the page contains multiple elements (such as applets and pictures). To do this, the servlet needs to send a Content-Length header in the response; the simplest way is to write the content to a ByteArrayOutputStream first and compute its size before actually sending the content.
Content-Length | Length of the request message body
Cookie | One of the most important request headers
From | Email address of the sender of the request; used only by some special Web clients, not by browsers
If-Modified-Since | The server returns the requested content only if it has been modified after the specified date; otherwise it returns a 304 "Not Modified" response
Pragma | A value of "no-cache" means the server must return a fresh document, even if it is a proxy server that has a local copy of the page
Referer | The URL of the page from which the user reached the currently requested page
User-Agent | Browser type; useful when the content returned by the servlet depends on the browser type
UA-Pixels, UA-Color, UA-OS, UA-CPU | Non-standard request headers sent by some versions of Internet Explorer to indicate screen size, color depth, operating system, and CPU type
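The If-Modified-Since behaviour described in the table can be sketched as a small decision function. This is an illustrative sketch, not any particular server's implementation: `conditional_status` is a hypothetical name, and treating an unparseable date as "header absent" is an assumption of this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(last_modified, if_modified_since):
    """Return 200 (send the content) or 304 (Not Modified)."""
    if if_modified_since is None:
        return 200  # unconditional request
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # assumption: ignore a date we cannot parse
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)  # HTTP dates are in GMT
    # Send the content only if it changed after the client's copy.
    return 200 if last_modified > since else 304


modified = datetime(1994, 12, 1, 12, 0, 0, tzinfo=timezone.utc)
print(conditional_status(modified, "Thu, 01 Dec 1994 16:00:00 GMT"))
print(conditional_status(modified, "Wed, 30 Nov 1994 16:00:00 GMT"))
```

In the first call the client's copy is newer than the last modification, so the server can answer 304 with no body; in the second the resource changed afterwards, so the full content is returned with 200.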
Common MIME types include:
- application/xhtml+xml: XHTML format
- application/xml: XML data format
- application/atom+xml: Atom XML feed format
- application/json: JSON data format
- application/pdf: PDF format
- application/msword: Word document format
- application/octet-stream: binary stream data (for example, ordinary file downloads)
- application/x-www-form-urlencoded: the default enctype of <form enctype="">; the form data is encoded as key/value pairs and sent to the server (the default format for submitted form data)
Another common media type is used when uploading files:
- multipart/form-data: required when a form needs to upload a file
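A Content-Type value has the shape [type]/[subtype]; parameter, and taking it apart is straightforward. The helper below, `parse_content_type`, is a hypothetical name for a deliberately simple sketch; it does not handle every quoting rule the standard allows.

```python
def parse_content_type(value):
    """Split 'type/subtype; name=value' into its parts."""
    media_type, *params = [part.strip() for part in value.split(";")]
    main_type, _, subtype = media_type.lower().partition("/")
    parameters = {}
    for param in params:
        if "=" in param:
            name, _, val = param.partition("=")
            # Parameter names are case-insensitive; values may be quoted.
            parameters[name.strip().lower()] = val.strip().strip('"')
    return main_type, subtype, parameters


print(parse_content_type("application/x-www-form-urlencoded; charset=utf-8"))
print(parse_content_type("text/html"))
```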
(3) Request body
A blank line separates the request header from the request body. This line is very important: it marks the end of the request header, and what follows is the request body. The request body can contain the query string information submitted by the client:
telephone=15527177736&usertype=1&
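A body in this format can be produced and read back with Python's standard `urllib.parse` module. This is a sketch of the encoding only; the field names come from the capture above, and the app evidently appends a trailing "&" that `urlencode` does not produce, which is why the length here is 32 rather than 33.

```python
from urllib.parse import parse_qsl, urlencode

fields = {"telephone": "15527177736", "usertype": "1"}

# Encode the form fields as key=value pairs joined by "&".
body = urlencode(fields).encode("ascii")
request_headers = {
    "Content-Type": "application/x-www-form-urlencoded; charset=utf-8",
    "Content-Length": str(len(body)),  # length in bytes of the body
}

# The server side reverses the process; empty fields such as the one
# after a trailing "&" are dropped by default.
decoded = dict(parse_qsl("telephone=15527177736&usertype=1&"))
print(body, request_headers["Content-Length"], decoded)
```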
HTTP response format
An HTTP response is similar to an HTTP request and also consists of 3 parts:
1. Status line
2. Response header
3. Response body
HTTP/1.1 200 OK // status line
Server: nginx
Date: Tue, 02:09:24 GMT
Content-Type: application/json;charset=utf-8
Connection: keep-alive
Vary: Accept-Encoding
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: x-requested-with,access_token,access-token,content-type,multipart/form-data, application/x-www-form-urlencoded
Access-Control-Allow-Methods: GET,POST,OPTIONS
Content-Length: 49

{"ResultCode": 1, "resultmsg": "Phone number not registered"} // body
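The same three-part split can be applied to a response. The sketch below assumes the status line of the captured response was "HTTP/1.1 200 OK" and uses a shortened copy of the headers; `parse_response` is a hypothetical helper, and the JSON body is the translated text shown above rather than the original bytes.

```python
import json


def parse_response(raw: bytes):
    """Split a raw HTTP response into status line parts, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: nginx\r\n"
    b"Content-Type: application/json;charset=utf-8\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
    b'{"ResultCode": 1, "resultmsg": "Phone number not registered"}'
)

version, code, reason, headers, body = parse_response(RAW_RESPONSE)
# Content-Type says the body is JSON in UTF-8, so decode it accordingly.
payload = json.loads(body.decode("utf-8"))
print(version, code, reason, payload["resultmsg"])
```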
(1) Status line
The status line consists of the protocol version, a numeric status code, and the corresponding status description, with the elements separated by spaces.
Status code:
The status code is a 3-digit number that indicates whether the request was understood and satisfied.
Status description:
The status description gives a short textual explanation of the status code.
The first digit of the status code defines the class of the response; the last two digits have no classification role.
The first digit has five possible values:
- 1xx: Informational - the request has been received and processing continues.
- 2xx: Success - the request has been successfully received, understood, and accepted.
- 3xx: Redirection - further action must be taken to complete the request.
- 4xx: Client error - the request contains a syntax error or cannot be fulfilled.
- 5xx: Server error - the server failed to fulfill a valid request.
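Because only the first digit carries the class, classifying a status code takes a single integer division. A minimal sketch, with `status_class` as a hypothetical helper name:

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Map a 3-digit status code to its class via the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


print(status_class(200), status_class(304), status_class(404), status_class(503))
```

A client can rely on this even for codes it does not recognise: an unknown 4xx code can still be handled as a generic client error.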
Status code | Status description | Meaning
200 | OK | The client request succeeded
400 | Bad Request | The client request has a syntax error and cannot be understood by the server
401 | Unauthorized | The request is not authorized. This status code must be used together with the WWW-Authenticate header field
403 | Forbidden | The server received the request but refuses to serve it. The server usually gives the reason in the response body
404 | Not Found | The requested resource does not exist, for example because a wrong URL was entered
500 | Internal Server Error | The server hit an unexpected error and could not complete the client's request
503 | Service Unavailable | The server cannot process the client's request at the moment; it may recover after a while
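For reference, Python's standard library ships these code and description pairs in `http.HTTPStatus`, so the status descriptions in the table can be looked up rather than typed by hand. A short sketch:

```python
from http import HTTPStatus

# Look up the standard status description for each code in the table.
phrases = {code: HTTPStatus(code).phrase
           for code in (200, 400, 401, 403, 404, 500, 503)}

for code, phrase in phrases.items():
    print(code, phrase)
```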
(2) Response header
The response header may include:
Location:
The Location response header field redirects the recipient to a new location. For example, if the page the client requested no longer exists at its original location, the server can send a redirect response whose Location header carries the page's new address, so that the client accesses the resource at the new location. When we use a redirect statement in a JSP, the response that the server sends back to the client contains a Location response header field.
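On the client side, following a redirect means resolving the Location value against the URL that was originally requested, since the value may be absolute or relative. A minimal sketch using the standard `urllib.parse.urljoin`; `resolve_redirect` and the example.com URLs are placeholders for illustration.

```python
from urllib.parse import urljoin


def resolve_redirect(request_url, location):
    """Return the absolute URL the client should request next."""
    return urljoin(request_url, location)


old = "http://example.com/old/page.html"
print(resolve_redirect(old, "/new/page.html"))
print(resolve_redirect(old, "http://www.example.org/page.html"))
```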
Server:
The Server response header field contains information about the software the server uses to process the request. It is the counterpart of the User-Agent request header field: Server describes the server-side software, while User-Agent describes the client software (browser) and operating system. Here is an example of the Server response header field: Server: Apache-Coyote/1.1
WWW-Authenticate:
The WWW-Authenticate response header field must be included in a 401 (Unauthorized) response message. It is related to the Authorization request header field mentioned earlier: when the client receives a 401 response, it decides whether to ask the server to authenticate it. If so, it can send a request that contains the Authorization header field. Here is an example of the WWW-Authenticate response header field: WWW-Authenticate: Basic realm="Basic Auth test!"
From this response header field we can tell that the server applies the Basic authentication mechanism to the resource we requested.
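With the Basic mechanism, the Authorization header the client sends back is simply "Basic " followed by the Base64 encoding of "username:password". The sketch below shows that construction; `basic_authorization` is a hypothetical helper, and the credentials are the well-known sample pair from the Basic authentication specification, not anything from this capture.

```python
import base64


def basic_authorization(username, password):
    """Build the value of an Authorization header for Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


print("Authorization:", basic_authorization("Aladdin", "open sesame"))
```

Base64 is an encoding, not encryption, so Basic authentication only protects the password when the connection itself is encrypted.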
Content-Encoding:
The Content-Encoding entity header field acts as a modifier of the media type. Its value indicates which additional content encoding has been applied to the entity body, so the corresponding decoding mechanism must be used to obtain the media type referenced by the Content-Type header field. Content-Encoding is mainly used to record how the document was compressed, for example: Content-Encoding: gzip. If an entity body is stored in an encoded form, it must be decoded before use.
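The round trip behind Content-Encoding: gzip can be sketched with Python's standard `gzip` module. The HTML below is made-up, highly repetitive sample data, so the size reduction it shows is not representative of real pages.

```python
import gzip

# Made-up sample page; repetition makes it compress very well.
html = ("<html><body>" + "<p>hello</p>" * 200 + "</body></html>").encode("utf-8")

# Server side: compress the entity body and label it.
compressed = gzip.compress(html)
response_headers = {
    "Content-Type": "text/html;charset=utf-8",  # type of the decoded body
    "Content-Encoding": "gzip",                 # encoding applied on top
    "Content-Length": str(len(compressed)),     # length of what is sent
}

# Client side: undo the content encoding before using the body.
restored = gzip.decompress(compressed)
print(len(html), len(compressed), restored == html)
```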
Content-Language:
The Content-Language entity header field describes the natural language used by the resource. It lets users identify and distinguish entities according to their preferred language. If the entity content is intended only for Danish readers, the header field can be set as follows: Content-Language: da.
If the Content-Language header field is not specified, the entity content is intended for readers of all languages.
Content-Length:
The Content-Length entity header field indicates the length of the entity body as a decimal number of bytes. The number itself is transmitted as decimal digits, each digit being one ASCII character that occupies one byte.
Note that this length covers only the entity body and does not include the entity headers.
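Since the length is counted in bytes rather than characters, the value depends on the character encoding once the body contains non-ASCII text. A small sketch, with `content_length` as a hypothetical helper:

```python
def content_length(body_text, charset="utf-8"):
    """Number of bytes the body occupies once encoded with the charset."""
    return len(body_text.encode(charset))


print(content_length("hello"))            # ASCII: one byte per character
print(content_length("你好"))              # UTF-8: three bytes per character here
print(content_length("你好", "gb2312"))    # GB2312: two bytes per character
```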
Content-Type:
The Content-Type entity header field indicates the media type of the entity body sent to the recipient. For example:
Content-Type: text/html;charset=ISO-8859-1
Content-Type: text/html;charset=GB2312
Last-Modified:
The Last-Modified entity header field indicates the date and time at which the resource was last modified.
Expires:
The Expires entity header field gives the date and time at which the response expires. Typically, a proxy server or browser caches some pages. When the user visits these pages again, they are loaded directly from the cache and displayed, which shortens the response time and reduces the load on the server. To make the proxy server or browser refresh a page after some time, we can use the Expires entity header field to specify when the page expires. When the user visits the page again, if the date and time given by the Expires header field is earlier than (or the same as) the date and time given by the Date general header field, the proxy server or browser no longer uses the cached page and instead requests the updated page from the server. Note, however, that a page expiring does not mean the original resource on the server changed before or after that time.
The date and time in the Expires entity header field must use the RFC 1123 date format, for example:
Expires: Thu, Sep 2005 16:00:00 GMT
HTTP/1.1 clients and caches must treat invalid date formats, including the value 0, as already expired. So to stop the browser from caching a page, we can also use the Expires entity header field and set its value to 0, as follows (JSP): response.setDateHeader("Expires", 0);
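The expiry rule can be sketched with Python's standard `email.utils`, which reads and writes the RFC 1123 date format. `http_date` and `is_expired` are hypothetical helper names, and the dates are arbitrary sample values chosen for this sketch, not taken from the capture.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(moment):
    """Format an aware datetime as an RFC 1123 date in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def is_expired(expires_value, date_value):
    """True if Expires is invalid, or not later than the Date header."""
    try:
        expires = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        return True  # invalid formats, including "0", count as expired
    return expires <= parsedate_to_datetime(date_value)


date_header = http_date(datetime(1994, 12, 1, 16, 0, 0, tzinfo=timezone.utc))
print(date_header)
print(is_expired("Thu, 01 Dec 1994 17:00:00 GMT", date_header))  # still fresh
print(is_expired("0", date_header))                              # expired
```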