The HTTP Protocol in Detail


What is the HTTP protocol

A protocol is the set of rules that two computers in a communication network must follow when they communicate. The Hypertext Transfer Protocol (HTTP) is a communication protocol that allows Hypertext Markup Language (HTML) documents to be delivered from a Web server to the client's browser. The version currently in use is HTTP/1.1.

The HTTP protocol is stateless. HTTP/1.1 uses persistent connections: after sending a response, the Web server keeps the connection open for a period of time so that the same client and server can continue to exchange subsequent HTTP request and response messages over it. HTTP/1.1 has two modes of operation, non-pipelined and pipelined. Non-pipelined: the client waits until it has received the previous response before sending the next request. Pipelined: the client may send a new request before it has received the response to the previous one.

Web server, browser, proxy server

When we open a browser, type a URL into the address bar, and press Enter, a page appears. What happens behind the scenes?

After we enter the URL, the browser sends a request to the Web server. The Web server receives the request, processes it, generates the corresponding response, and sends it back to the browser. The browser then parses the HTML in the response and renders the page we see.

Our request may also pass through a proxy server before it finally reaches the Web server.

A proxy server is a relay point for network traffic. What is it used for?

1. Improving access speed: most proxy servers have a caching function.

2. Bypassing access restrictions (for example, getting around a firewall).

3. Hiding the client's identity.

URLs in detail

A URL (Uniform Resource Locator) describes the location of a resource on a network. Its basic format is:

scheme://host[:port#]/path/.../[?query-string][#anchor]
    • scheme: the underlying protocol to use (for example HTTP, HTTPS, FTP)
    • host: the IP address or domain name of the HTTP server
    • port#: the port of the HTTP server. The default is 80, in which case the port number can be omitted. Any other port must be given explicitly, for example http://www.cnblogs.com:8080/
    • path: the path to the resource
    • query-string: the data sent to the HTTP server
    • anchor: a named position (anchor) within the page

An example of a URL

http://www.mywebsite.com/sj/test/test.aspx?name=sviergn&x=true#stuff
scheme:        http
host:          www.mywebsite.com
path:          /sj/test/test.aspx
query string:  name=sviergn&x=true
anchor:        stuff
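
The same breakdown can be reproduced programmatically. The following is a minimal sketch using Python's standard urllib.parse module (Python is our choice for illustration, not something the original article uses). Note that urlsplit reports the port as None when the URL does not specify one, so falling back to 80 here is our own assumption for an http URL.

```python
from urllib.parse import urlsplit, parse_qs

# The example URL from the text above.
url = "http://www.mywebsite.com/sj/test/test.aspx?name=sviergn&x=true#stuff"
parts = urlsplit(url)

# Fall back to 80 when no port is given (assumption: plain http).
port = parts.port or 80

# parse_qs turns the query string into a dict of lists of values.
query_params = parse_qs(parts.query)

print("scheme:      ", parts.scheme)
print("host:        ", parts.hostname)
print("port:        ", port)
print("path:        ", parts.path)
print("query string:", parts.query)
print("anchor:      ", parts.fragment)
```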

The HTTP protocol is stateless

HTTP is stateless: a request from a client has no relationship to that client's previous request, and the HTTP server does not know that the two requests came from the same client. To solve this problem, Web applications introduced the cookie mechanism to maintain state.

Opening a web page requires the browser to send many requests
    1. When you enter the URL http://www.cnblogs.com in the browser, the browser sends a request for the HTML of http://www.cnblogs.com. The server sends the response back to the browser.
    2. The browser parses the HTML in the response and discovers that it references many other files, such as images, CSS files, and JS files.
    3. The browser automatically sends further requests to fetch each image, CSS file, and JS file.
    4. Once all the files have been downloaded successfully, the Web page is displayed.
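
Step 2 above, discovering the referenced files, can be sketched with Python's standard html.parser module. This is only an illustration under our own assumptions: the sample HTML and its file paths are hypothetical, and a real browser recognizes many more kinds of references than the three tags handled here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ResourceCollector(HTMLParser):
    """Collects the URLs of images, scripts, and stylesheets in a page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(attrs["src"])
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(attrs["href"])

# Hypothetical page content; the paths are made up for the example.
SAMPLE_HTML = """<html><head>
<link rel="stylesheet" href="/css/site.css">
<script src="/js/app.js"></script>
</head><body><img src="/images/logo.png"></body></html>"""

collector = ResourceCollector()
collector.feed(SAMPLE_HTML)

# Each of these URLs would need its own HTTP request (step 3).
base_url = "http://www.cnblogs.com/"
follow_up_urls = [urljoin(base_url, path) for path in collector.resources]
for resource_url in follow_up_urls:
    print(resource_url)
```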
Structure of the HTTP message

A request message has three parts: the first is the request line, the second is the request headers, and the third is the body (entity body). A blank line separates the headers from the body.

In the request line, Method is the request method, such as "GET" or "POST"; Path-to-resource is the requested resource; and HTTP/version-number is the version of the HTTP protocol. When the "GET" method is used, the body is empty. For example, opening the cnblogs home page sends the request below.

GET http://www.cnblogs.com/ HTTP/1.1
Host: www.cnblogs.com
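
To make the three-part structure concrete, here is a minimal sketch that assembles a request message by hand and splits it back into its parts. The helper names build_request and parse_request are our own, not part of any library, and the parser handles only the simple case shown (no repeated or folded headers).

```python
def build_request(method, target, headers, body=b""):
    """Assemble request line + headers + blank line + body."""
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"   # the blank line ends the headers
    return head.encode("ascii") + body

def parse_request(raw):
    """Split a raw request into (method, target, version, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, target, version, headers, body

raw = build_request("GET", "http://www.cnblogs.com/", {"Host": "www.cnblogs.com"})
method, target, version, headers, body = parse_request(raw)
print(raw.decode("ascii"))
```

Because this is a GET request, the body that comes back from parse_request is empty.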

Abstract descriptions are hard to grasp and remember; seeing the real thing makes them concrete. So let's use Fiddler to look at an actual request and response. Open Fiddler, capture a cnblogs login request, and analyze its structure: the complete request message is shown under the Inspectors tab in Raw mode.

Now let's look at the response message, whose structure is basically the same as that of the request message. It also has three parts: the first is the status line (response line), the second is the response headers, and the third is the body. A blank line again separates the headers from the body.

In the status line, HTTP/version-number is the version of the HTTP protocol, and it is followed by the status code and the status message; see the [Status code] section below for details. If we capture the response for the cnblogs home page with Fiddler, the complete response message can be seen under the Inspectors tab in Raw mode.
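
As a rough illustration of that structure, the sketch below splits a raw response into status line, headers, and body. The sample response is hypothetical (it is not a capture from cnblogs), and parse_response is our own helper that ignores complications such as repeated headers.

```python
# Hypothetical response; the body is exactly 13 bytes long.
SAMPLE_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>hello</p>\n"
)

def parse_response(raw):
    """Split a raw response into (version, code, message, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")   # blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, message = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # names are case-insensitive
    return version, int(code), message, headers, body

version, code, message, headers, body = parse_response(SAMPLE_RESPONSE)
print(version, code, message)
print(headers)
print(body)
```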

The difference between the GET and POST methods

The HTTP protocol defines a number of methods for interacting with the server. The four most basic are GET, POST, PUT, and DELETE. A URL describes a resource on a network, and GET, POST, PUT, and DELETE in HTTP roughly correspond to the four operations of querying, modifying, adding, and deleting that resource. The most common are GET and POST. GET is typically used to retrieve or query resource information, and POST is typically used to update resource information. Let's look at the differences between GET and POST:

    • Data submitted with GET is appended to the URL: a ? separates the URL from the data, and the parameters are joined with &, for example EditPosts.aspx?name=test1&id=123456. The POST method puts the submitted data in the body of the HTTP message.
    • The size of data submitted with GET is limited (because browsers limit the length of a URL), whereas the POST method itself places no limit on the size of the submitted data.
    • On the server side (as in ASP.NET), values submitted with GET are read through Request.QueryString, and values submitted with POST are read through Request.Form.
    • Submitting data with GET can cause security problems. On a login page, for example, when the data is submitted with GET the user name and password appear in the URL. If the page can be cached, or if someone else has access to the machine, the account and password can be recovered from the browser history.
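
The first difference in the list, where the data travels, can be sketched with Python's standard urllib. Nothing is sent over the network here; the requests are only constructed and inspected. The host www.example.com is a placeholder of ours, and the parameters are the ones from the example above.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"name": "test1", "id": "123456"}
encoded = urlencode(params)   # parameters joined with "&"

# GET: the data is appended to the URL after "?"; there is no body.
get_request = Request("http://www.example.com/EditPosts.aspx?" + encoded)

# POST: the same data goes into the message body instead.
post_request = Request(
    "http://www.example.com/EditPosts.aspx",
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(get_request.get_method(), get_request.full_url, get_request.data)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

urllib.request chooses the method from the presence of data: a Request without data is a GET, and one with data is a POST.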
Status code

The first line of the response message is called the status line. It consists of the HTTP protocol version number, the status code, and the status message. The status code tells the HTTP client whether the HTTP server produced the expected response. HTTP/1.1 defines five classes of status code. A status code consists of three digits, and the first digit defines the class of the response:

    • 1XX Informational: the request was received and processing continues
    • 2XX Success: the request was successfully received, understood, and accepted
    • 3XX Redirection: further action is required to complete the request
    • 4XX Client Error: the request has a syntax error or cannot be fulfilled
    • 5XX Server Error: the server failed to fulfill a valid request
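
Since the class depends only on the first digit, it can be computed with an integer division. The sketch below does that and looks up the standard reason phrase in Python's http.HTTPStatus enum; the function names status_class and describe are our own.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_class(code):
    """Return the class of a status code based on its first digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")

def describe(code):
    """Return e.g. '404 Not Found (Client Error)'."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:          # code not registered in HTTPStatus
        phrase = "Unknown"
    return f"{code} {phrase} ({status_class(code)})"

for code in (200, 302, 304, 404, 503):
    print(describe(code))
```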
Common Status Codes

The most common status code is 200 OK, which indicates that the request completed successfully and the requested resource was sent back to the client. Opening the cnblogs home page, for example, returns 200.

302 Found. A redirect: the new URL is returned in the Location header of the response, and the browser automatically sends a new request to that URL. For example, when http://www.google.com is entered in IE, the HTTP server returns 302; IE reads the new URL from the Location header of the response and sends another request.

304 Not Modified. The previously fetched document is cached and can still be used. For example, when the cnblogs home page is opened, many of the responses have status code 304.

  Tip: If you don't want to use the local cache, you can force the page to refresh with Ctrl+F5.

    • 400 Bad Request: the client request has a syntax error and cannot be understood by the server
    • 403 Forbidden: the server received the request but refuses to serve it

404 Not Found. The requested resource does not exist (the URL was typed incorrectly). For example, entering a wrong URL in IE, such as http://www.cnblogs.com/tesdf.aspx

    • 500 Internal Server Error: an unexpected error occurred on the server
    • 503 Service Unavailable: the server is currently unable to process client requests and may recover after some time
HTTP Request Header

With Fiddler you can easily view the request headers: click the Inspectors tab -> Request tab -> Headers.

There are many headers and they are hard to remember, so we group them the same way Fiddler does, which makes them clearer and easier to remember.

Cache header Field

If-Modified-Since. Purpose: sends the last modification time of the page cached by the browser to the server, which compares it with the last modification time of the actual file on the server. If the times are the same, the server returns 304 and the client uses its local cached file directly. If the times differ, the server returns 200 and the new file contents. On receiving them, the client discards the old file, caches the new one, and displays it in the browser.

For example: If-Modified-Since: Thu, 09:07:57 GMT


If-None-Match. Purpose: If-None-Match works together with ETag. The server adds ETag information to the HTTP response. When the user requests the resource again, the If-None-Match header (carrying the ETag value) is added to the HTTP request. If the server verifies that the ETag of the resource has not changed (the resource has not been updated), it returns a 304 status telling the client to use its local cached file. Otherwise it returns a 200 status together with the new resource and ETag. Using this mechanism improves the performance of a site. For example: If-None-Match: "03f2b33c0bfcc1:0"
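
The decision the server makes for these two headers can be sketched as a small function. This is a simplified illustration, not how any particular server implements it: conditional_status is our own name, the date is hypothetical, the ETag comparison is a plain string match, and we treat a cached copy as valid when the file is not newer than the time the client sent (the text above describes the equal-time case).

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def conditional_status(request_headers, etag, last_modified):
    """Return 304 if the client's cached copy is still valid, otherwise 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        return 304 if if_none_match == etag else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        cached_time = parsedate_to_datetime(if_modified_since)
        return 304 if last_modified <= cached_time else 200

    return 200   # unconditional request: always send the resource

# Server-side state of the resource (hypothetical modification time).
current_etag = '"03f2b33c0bfcc1:0"'
last_modified = datetime(2015, 1, 1, 9, 7, 57, tzinfo=timezone.utc)
http_date = format_datetime(last_modified, usegmt=True)

unchanged_by_etag = conditional_status({"If-None-Match": current_etag}, current_etag, last_modified)
changed_by_etag = conditional_status({"If-None-Match": '"old-etag"'}, current_etag, last_modified)
unchanged_by_date = conditional_status({"If-Modified-Since": http_date}, current_etag, last_modified)
first_visit = conditional_status({}, current_etag, last_modified)
print(unchanged_by_etag, changed_by_etag, unchanged_by_date, first_visit)
```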

Pragma. Purpose: prevents the page from being cached. In HTTP/1.1 it behaves exactly like Cache-Control: no-cache. Pragma has only one usage, for example: Pragma: no-cache. Note: HTTP/1.0 implemented only Pragma: no-cache and did not implement Cache-Control.

Cache-Control. Purpose: a very important header that specifies the caching rules that requests and responses follow. The directives have the following meanings:

    • Cache-Control: public: the response may be cached by any cache
    • Cache-Control: private: the content may be cached only in a private cache
    • Cache-Control: no-cache: the cached content must be revalidated with the server before it is reused (to forbid caching altogether, use no-store)
Client header Field

Accept. Purpose: the media types the browser can accept. For example, Accept: text/html means the browser can accept a response of type text/html, that is, what we usually call an HTML document. If the server cannot return data of type text/html, it should return a 406 (Not Acceptable) error. The wildcard * stands for any type. For example, Accept: */* means the browser can handle all types (this is what browsers generally send to the server).

Accept-Encoding. Purpose: the browser declares the encodings it accepts, which usually means the compression methods: whether compression is supported and which methods (gzip, deflate) are supported. Note that this is not a character encoding. For example: Accept-Encoding: gzip, deflate

Accept-Language. Purpose: the browser declares the languages it accepts. Note the difference between a language and a character set: Chinese is a language, and Chinese has several character sets, such as Big5, GB2312, and GBK. For example: Accept-Language: en-us

User-Agent. Purpose: tells the HTTP server the name and version of the operating system and browser the client is using. For example: User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; CIBA; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; InfoPath.2; .NET4.0E)

Accept-Charset. Purpose: the browser declares the character sets it accepts, that is, character sets and encodings such as those mentioned above, for example gb2312 and utf-8 (when we say charset we usually include the corresponding character encoding scheme).

Cookie/login header Field

Cookie. Purpose: one of the most important headers; it sends the value of the cookie to the HTTP server.

Entity header Field

Content-Length. Purpose: the length of the data sent to the HTTP server. For example: Content-Length: 38

Content-Type. Purpose: the media type of the data sent in the request body. For example: Content-Type: application/x-www-form-urlencoded

Miscellaneous header Field

Referer. Purpose: provides context for the request by telling the server which link the user came from. For example, if my home page links to a friend's site, his server can use the HTTP Referer to count how many users visit his site each day by clicking the link on my home page.

For example: Referer: http://translate.google.cn/?hl=zh-cn&tab=wt

Transport header Field

Connection. For example, Connection: keep-alive means that when a Web page is opened, the TCP connection used to carry HTTP data between the client and the server is not closed; if the client accesses a page on that server again, it continues to use the established connection. Connection: close means that once a request is complete, the TCP connection used to carry HTTP data between the client and the server is closed, and a new TCP connection must be established when the client sends another request.

Host (this header is required when sending a request). Purpose: specifies the Internet host and port number of the requested resource, usually extracted from the HTTP URL. For example, if we enter http://www.guet.edu.cn/index.html in the browser, the request message the browser sends contains the Host request header as follows: Host: www.guet.edu.cn. The default port number 80 is used here. If a port number is specified, the header becomes: Host: www.guet.edu.cn:port-number
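
Deriving the Host value from a URL can be sketched as follows. The function host_header is our own helper, the port 8080 in the second call is a made-up value, and treating 443 as the default for https is a detail we add beyond what the text covers.

```python
from urllib.parse import urlsplit

def host_header(url):
    """Return the value of the Host header for the given URL."""
    parts = urlsplit(url)
    default_port = {"http": 80, "https": 443}.get(parts.scheme)
    if parts.port is None or parts.port == default_port:
        return parts.hostname                  # default port is omitted
    return f"{parts.hostname}:{parts.port}"    # explicit non-default port

default_host = host_header("http://www.guet.edu.cn/index.html")
explicit_host = host_header("http://www.guet.edu.cn:8080/index.html")
print("Host:", default_host)
print("Host:", explicit_host)
```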

HTTP Response Header

Fiddler can also be used to view the response headers: click the Inspectors tab -> Response tab -> Headers.

Cache header Field

Date. Purpose: the exact date and time at which the message was generated. For example: Date: Sat, 11:35:14 GMT

Expires. Purpose: the browser may use its local cache until the specified expiration time. For example: Expires: Tue, 2022 11:35:14 GMT

Vary. Purpose: tells caches which request headers the response depends on. For example: Vary: Accept-Encoding

Cookie/login header Field

P3P. Purpose: used to set cookies across domains, which solves the problem of an iframe being unable to access cookies across domains. For example: P3P: CP=CURa ADMa DEVa PSAo PSDo OUR BUS UNI PUR INT DEM STA PRE COM NAV OTC NOI DSP COR

Set-Cookie. Purpose: a very important header, used to send cookies to the client browser. Each cookie that is written generates one Set-Cookie header. For example: Set-Cookie: sc=4c31523a; path=/; domain=.acookie.taobao.com
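
The round trip from Set-Cookie in a response to Cookie in the next request can be sketched with Python's standard http.cookies module. This is a simplification under our own assumptions: a real browser also checks the domain, path, and expiry before sending a cookie back, which this sketch does not do.

```python
from http.cookies import SimpleCookie

# Value of the Set-Cookie header from the example above.
set_cookie_value = "sc=4c31523a; path=/; Domain=.acookie.taobao.com"

jar = SimpleCookie()
jar.load(set_cookie_value)     # parse the name, value, and attributes

morsel = jar["sc"]
print("value: ", morsel.value)
print("path:  ", morsel["path"])
print("domain:", morsel["domain"])

# On a later request, only name=value pairs are sent back in the Cookie header.
cookie_header = "; ".join(f"{name}={m.value}" for name, m in jar.items())
print("Cookie:", cookie_header)
```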

Entity header Field

ETag. Purpose: used together with If-None-Match (see the If-None-Match example above). For example: ETag: "03f2b33c0bfcc1:0"

Last-Modified. Purpose: indicates the date and time at which the resource was last modified (see the If-Modified-Since example above). For example: Last-Modified: Wed, Dec 09:09:10 GMT

Content-Type. Purpose: the Web server tells the browser the type and character set of the object in the response. For example:
Content-Type: text/html; charset=utf-8
Content-Type: text/html; charset=gb2312
Content-Type: image/jpeg
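
A client has to split a Content-Type value into the media type and the character set before it can decode the body. One way to sketch this in Python is to borrow the header parsing of the standard email.message.Message class; parse_content_type is our own helper name, and the short Chinese sample body is made up.

```python
from email.message import Message

def parse_content_type(value):
    """Return (media_type, charset) for a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_content_charset()

html_utf8 = parse_content_type("text/html; charset=utf-8")
html_gb2312 = parse_content_type("text/html;charset=gb2312")
jpeg = parse_content_type("image/jpeg")      # no charset for binary data
print(html_utf8, html_gb2312, jpeg)

# The charset tells the client how to turn the body bytes into text.
body = "你好".encode("gb2312")
text = body.decode(html_gb2312[1])
print(text)
```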

Content-Length. Indicates the length of the entity body, expressed as a decimal number of bytes. When Content-Length is used for a download, the server must first buffer all the data so that it knows the length, and then sends all of it to the client in one go. For example: Content-Length: 19847

Content-Encoding. The Web server indicates which compression method (gzip, deflate) it used to compress the object in the response. For example: Content-Encoding: gzip
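
The effect of Content-Encoding: gzip can be sketched with Python's standard gzip module. The body below is a made-up, repetitive page so that the compression is visible; the headers are held in a plain dict purely for illustration.

```python
import gzip

# Hypothetical response body with a lot of repetition.
body = b"<html>" + b"hello world " * 200 + b"</html>"

# Server side: compress the body and describe it in the headers.
compressed = gzip.compress(body)
response_headers = {
    "Content-Encoding": "gzip",
    "Content-Length": str(len(compressed)),   # length of the compressed bytes
}

# Client side: undo the encoding named in Content-Encoding.
if response_headers.get("Content-Encoding") == "gzip":
    restored = gzip.decompress(compressed)
else:
    restored = compressed

print(len(body), "bytes before,", len(compressed), "bytes after compression")
```

Note that Content-Length describes the bytes actually transferred, that is, the compressed data.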

Content-Language. Purpose: the Web server tells the browser the language of the object in the response. For example: Content-Language: da

Miscellaneous header Field

Server. Purpose: identifies the software of the HTTP server. For example: Server: Microsoft-IIS/7.5

X-AspNet-Version. Purpose: if the Web site was developed with ASP.NET, this header gives the ASP.NET version. For example: X-AspNet-Version: 4.0.30319

X-Powered-By. Purpose: indicates which technology the Web site was developed with. For example: X-Powered-By: ASP.NET

Transport header Field

Connection. For example, Connection: keep-alive means that when a Web page is opened, the TCP connection used to carry HTTP data between the client and the server is not closed; if the client accesses a page on that server again, it continues to use the established connection. Connection: close means that once a request is complete, the TCP connection used to carry HTTP data between the client and the server is closed, and a new TCP connection must be established when the client sends another request.

Location Header Field

Location. Purpose: used to redirect to a new location; it contains the new URL. For an example, see the 302 status code above.

The difference between HTTP being stateless and Connection: keep-alive

Stateless means that the protocol has no memory of previous transactions: the server does not know the state of the client. Put another way, opening one Web page on a server has no connection to the page you previously opened on that same server. HTTP is a stateless but connection-oriented protocol. Being stateless does not mean that HTTP cannot keep a TCP connection open, and it certainly does not mean that HTTP uses UDP (which is connectionless). Since HTTP/1.1, keep-alive is enabled by default, which preserves the connection: when a Web page is opened, the TCP connection used to carry HTTP data between the client and the server is not closed, and if the client accesses a page on that server again, it continues to use the established connection. Keep-alive does not keep the connection open forever; it has a hold time that can be configured in the server software (such as Apache).
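
Connection reuse can be observed with a small local experiment. The sketch below starts a throwaway HTTP/1.1 server on 127.0.0.1 using Python's standard http.server, sends two requests over one http.client connection, and records the client's local TCP port after each one. The Handler class and the paths /first and /second are our own inventions for the demonstration. If the port is the same both times, both requests travelled over the same TCP connection, even though the server kept no state linking them.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # allows the connection to stay open

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        # Content-Length lets the client know where the response ends
        # without the server having to close the connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass   # keep the demonstration quiet

server = HTTPServer(("127.0.0.1", 0), Handler)   # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
statuses, local_ports = [], []
for path in ("/first", "/second"):
    conn.request("GET", path)
    response = conn.getresponse()
    response.read()                 # finish this response before the next request
    statuses.append(response.status)
    local_ports.append(conn.sock.getsockname()[1])

conn.close()
server.shutdown()
server.server_close()
print(statuses, local_ports)
```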

Statement: This article is reprinted from Tank's blog, http://www.cnblogs.com/TankXiao/
