[Reprint] HTTP protocol details

Source: Internet
Author: User

http://www.cnblogs.com/TankXiao/archive/2012/02/13/2342672.html

 

Web development technologies are fiercely competitive today: ASP.NET, PHP, JSP, Perl, Ajax, and so on. However Web technology develops in the future, it is important to understand the basic protocol that Web applications use to communicate, because it lets us understand how Web applications work internally. This article explains the HTTP protocol in detail with examples. I hope you will be patient, and that it helps with your development or testing work. You can use the Fiddler tool to capture HTTP requests and HTTP responses easily. For more information about how to use Fiddler, see my blog post [Fiddler tutorial].

 

Reading directory

  1. What is HTTP?
  2. Web server, browser, Proxy Server
  3. URL details
  4. The HTTP protocol is stateless.
  5. HTTP message structure
  6. Difference between GET and POST methods
  7. Status Code
  8. HTTP Request Header
  9. HTTP Response Header
  10. The difference between the stateless HTTP protocol and Connection: keep-alive

What is HTTP?

A protocol is a set of rules that two computers must follow when they communicate over a network. The Hypertext Transfer Protocol (HTTP) is a communication protocol that allows Hypertext Markup Language (HTML) documents to be transferred from a Web server to the client's browser.

 

Currently, HTTP/1.1 is used.

Web server, browser, Proxy Server

When we open a browser and enter a URL in the address bar, the webpage appears. How does this work?

After we enter the URL, the browser sends a request to the Web server. The Web server receives the request, processes it, generates the corresponding response, and sends it back to the browser. The browser then parses the HTML in the response so that we can see the webpage.

 

 

 

Our request may also pass through a proxy server before it reaches the Web server.

 

The proxy server is a relay station for network information. What functions does it provide?

1. Improve access speed. Most proxy servers provide a cache.

2. Bypass access restrictions, for example to get past a firewall.

3. Hide the client's identity.

 

URL details

A URL (Uniform Resource Locator) describes the address of a resource on a network. The basic format is as follows:

scheme://host[:port#]/path/.../[;url-params][?query-string][#anchor]

scheme: the protocol used (for example, http, https, or ftp)

host: the IP address or domain name of the HTTP server

port#: the default port of an HTTP server is 80, in which case the port number can be omitted. If another port is used, it must be specified, for example http://www.cnblogs.com:8080/

path: the path of the resource on the server

url-params: parameters attached to the path

query-string: the data sent to the HTTP server

anchor: the anchor within the page

 

URL example

http://www.mywebsite.com/sj/test;id=8079?name=sviergn&x=true#stuff

Scheme: http
host: www.mywebsite.com
path: /sj/test
URL params: id=8079
Query String: name=sviergn&x=true
Anchor: stuff
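
The same breakdown can be reproduced with Python's standard urllib.parse module. This is only a small sketch for illustration; the variable names are my own and are not part of the original example.

```python
from urllib.parse import parse_qs, urlparse

url = "http://www.mywebsite.com/sj/test;id=8079?name=sviergn&x=true#stuff"
parts = urlparse(url)

print(parts.scheme)    # scheme
print(parts.hostname)  # host
print(parts.path)      # path
print(parts.params)    # URL params (the part after ";")
print(parts.query)     # query string (the part after "?")
print(parts.fragment)  # anchor (the part after "#")

# parse_qs turns the query string into a dict of lists,
# because the same name may appear more than once.
query = parse_qs(parts.query)
print(query)
```

Note that urlparse only splits off the ";" parameters of the last path segment.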

 

The HTTP protocol is stateless.

The HTTP protocol is stateless: each request from a client is independent of its previous requests. The HTTP server does not know that two requests come from the same client. To solve this problem, Web applications use the cookie mechanism to maintain state.
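
To make the idea concrete, here is a rough sketch of how a server might use a cookie to link requests together. Everything in it is hypothetical: the handle_request function, the in-memory sessions dictionary, and the cookie name sid are assumptions for illustration, not how any particular server works.

```python
import uuid
from http.cookies import SimpleCookie

sessions = {}  # hypothetical in-memory store: session id -> visit count

def handle_request(headers):
    """Return (response_headers, visit_count) for one request."""
    cookie = SimpleCookie(headers.get("Cookie", ""))
    sid = cookie["sid"].value if "sid" in cookie else None
    response_headers = {}
    if sid not in sessions:
        # Unknown client: issue a new session id through Set-Cookie.
        sid = uuid.uuid4().hex
        sessions[sid] = 0
        response_headers["Set-Cookie"] = f"sid={sid}; Path=/"
    sessions[sid] += 1
    return response_headers, sessions[sid]

# The first request carries no cookie, so the server cannot recognise the client.
first_headers, first_count = handle_request({})

# The client sends the cookie back, which links the second request to the first.
sid_pair = first_headers["Set-Cookie"].split(";")[0]
second_headers, second_count = handle_request({"Cookie": sid_pair})

# A request without the cookie looks like a brand-new client again.
_, third_count = handle_request({})

print(first_count, second_count, third_count)
```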

 

HTTP message structure

First, let's look at the structure of the request message. The request message has three parts: the first part is the request line, the second part is the HTTP headers, and the third part is the body. An empty line separates the headers from the body. The structure is as follows:

  Method Path-to-Resource HTTP/Version-number
  Header-Name: value
  (empty line)
  Optional request body

Method in the first line is the request method, for example "POST" or "GET". Path-to-Resource is the requested resource, and HTTP/Version-number is the HTTP protocol version.

When the "GET" method is used, the body is empty.

For example, the request for opening the blog homepage is as follows:

GET http://www.cnblogs.com/ HTTP/1.1
Host: www.cnblogs.com
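
As a minimal sketch of this three-part structure, the following Python splits a raw request into its request line, headers, and body. The parse_request helper is my own illustration and skips many details that a real HTTP parser has to handle (repeated headers, validation, and so on).

```python
def parse_request(raw: bytes):
    """Split a raw HTTP request into request line, headers, and body."""
    # The empty line (CRLF CRLF) marks the end of the headers.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body

raw_request = (
    b"GET http://www.cnblogs.com/ HTTP/1.1\r\n"
    b"Host: www.cnblogs.com\r\n"
    b"\r\n"
)
method, target, version, headers, body = parse_request(raw_request)
print(method, target, version)
print(headers)
print(body)  # empty, because this is a GET request
```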

We used Fiddler to capture a login request to Blog Park (cnblogs) and analyze its structure. On the Inspectors tab, Raw mode displays the complete request message.

 

Now let's look at the structure of the response message, which is basically the same as that of the request message. It also has three parts: the first part is the status line, the second part is the response headers, and the third part is the body. An empty line separates the headers from the body. The structure is as follows:

  HTTP/Version-number Status-code Message
  Header-Name: value
  (empty line)
  Optional response body

HTTP/Version-number is the HTTP protocol version. For details about Status-code and Message, see the [Status Code] section.

We used Fiddler to capture the response for the blog home page and analyze its structure. On the Inspectors tab, Raw mode displays the complete response message.

 

 

Difference between GET and POST methods

The HTTP protocol defines many methods for interacting with the server. The four basic ones are GET, POST, PUT, and DELETE. A URL describes a resource on a network, and the GET, POST, PUT, and DELETE operations in HTTP correspond roughly to querying, modifying, adding, and deleting that resource. The most common are GET and POST: GET is generally used to obtain or query resource information, while POST is generally used to update resource information.

Let's look at the differences between GET and POST.

1. Data submitted by GET is appended to the URL: a ? separates the URL from the data, and parameters are joined with &, for example EditPosts.aspx?name=test1&id=123456. The POST method places the submitted data in the body of the HTTP message.

2. The size of data submitted by GET is limited, because browsers limit the URL length. HTTP itself places no such limit on data submitted by POST, although a server may be configured with one.

3. On the server (in ASP.NET), Request.QueryString is used to read values submitted by GET, while Request.Form is used to read values submitted by POST.

4. Submitting data by GET can cause security problems. For example, if a login page submits data by GET, the user name and password appear in the URL. If the page is cached, or other people can use the same computer, they can obtain the user's account and password from the browser history.
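
Point 1 can be illustrated with Python's standard library. The sketch below only builds the two request objects and does not send anything over the network; the host www.example.com is a placeholder I chose for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

form = {"name": "test1", "id": "123456"}
encoded = urlencode(form)

# GET: the data travels in the URL, after the "?".
get_request = Request("http://www.example.com/EditPosts.aspx?" + encoded)

# POST: the same data travels in the message body instead.
post_request = Request(
    "http://www.example.com/EditPosts.aspx",
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(get_request.get_method(), get_request.full_url, get_request.data)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

In both cases the data is encoded the same way (name=value pairs joined with &); only its position in the message differs.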

 

Status Code

The first line of a response message is called the status line. It consists of three parts: the HTTP protocol version, the status code, and the status message.

The status code tells the HTTP client whether the HTTP server produced the expected response.

HTTP/1.1 defines five classes of status codes. Each status code has three digits, and the first digit defines the class of the response.

1xx Informational: the request has been received and processing continues

2xx Success: the request has been successfully received, understood, and accepted

3xx Redirection: further action is required to complete the request

4xx Client Error: the request contains a syntax error or cannot be fulfilled

5xx Server Error: the server failed to fulfill a valid request
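
The status line and the five classes can be sketched in a few lines of Python. The helper names parse_status_line and category are my own; http.HTTPStatus is part of the standard library and is used here only to look up a standard message.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def parse_status_line(line: str):
    """Split a status line into version, status code, and message."""
    version, code, message = line.split(" ", 2)
    return version, int(code), message

def category(code: int) -> str:
    """Classify a status code by its first digit."""
    return CATEGORIES[code // 100]

version, code, message = parse_status_line("HTTP/1.1 404 Not Found")
print(version, code, message, category(code))
print(HTTPStatus(304).phrase)
```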

 

Here are some common status codes.

200 OK

The most common status code is 200, which indicates that the request completed successfully and the requested resource is sent back to the client.

For example, opening the blog home page returns 200.

 

302 Found

Redirection. The new URL is returned in the Location header of the response, and the browser sends a new request to that URL.

For example, enter http://www.google.com in IE. The HTTP server returns 302, IE reads the new URL from the Location header of the response, and sends a new request.

 

304 Not Modified

Indicates that the document has already been cached and the cached copy can still be used.

For example, when I open the blog home page, many of the responses have status code 304.

Tip: if you do not want to use the local cache, press Ctrl+F5 to force the page to refresh.

 

400 Bad Request

The client request has a syntax error and cannot be understood by the server.

403 Forbidden

The server received the request but refuses to serve it.

404 Not Found

The requested resource does not exist (the URL is incorrect).

For example, enter a wrong URL in IE: http://www.cnblogs.com/tesdf.aspx

 

500 Internal Server Error

An unexpected error occurred on the server.

503 Service Unavailable

The server cannot process the client's request at the moment and may recover after a period of time.

 

HTTP Request Header

Fiddler makes it easy to view the request headers. Click the Inspectors tab > Request tab > Headers.

There are many headers and they are hard to remember, so we group them the same way Fiddler does, which is clear and easy to remember.

Cache header fields

If-Modified-Since

Purpose: sends the last modification time of the page cached by the browser to the server. The server compares it with the last modification time of the actual file on the server. If the times are the same, the server returns 304 and the client uses its local cached file. If they differ, the server returns 200 with the new file content; the client discards the old file, caches the new one, and displays it in the browser.

Example: If-Modified-Since: Thu, 09 Feb 2012 09:07:57 GMT
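
The server-side comparison can be sketched as follows. This is a simplified illustration, not a real server: check_if_modified_since is a hypothetical helper, and as an assumption it also answers 304 when the file on the server is older than the cached time, not only when the two times are equal.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

def check_if_modified_since(header_value, last_modified):
    """Return 304 if the cached copy is still current, otherwise 200."""
    if header_value:
        cached_time = parsedate_to_datetime(header_value)
        if last_modified <= cached_time:
            return 304
    return 200

# Last modification time of the file on the server (made-up value
# matching the header example above).
last_modified = datetime(2012, 2, 9, 9, 7, 57, tzinfo=timezone.utc)

# HTTP dates use this fixed GMT format.
header = format_datetime(last_modified, usegmt=True)
print(header)

print(check_if_modified_since(header, last_modified))                       # file unchanged
print(check_if_modified_since(header, last_modified + timedelta(hours=1)))  # file changed later
print(check_if_modified_since(None, last_modified))                         # nothing cached yet
```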


 

If-None-Match

Purpose: If-None-Match works together with ETag. The server adds ETag information to the HTTP response. When the user requests the resource again, the browser adds an If-None-Match header (containing the ETag value) to the HTTP request. If the server verifies that the ETag of the resource has not changed (the resource has not been updated), it returns status 304 to tell the client to use its local cached file. Otherwise, it returns status 200 with the new resource and its ETag. This mechanism improves website performance.

Example: If-None-Match: "03f2b33c0bfc0:0"
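
Here is a rough sketch of the ETag round trip. The make_etag and respond functions are hypothetical; in particular, deriving the tag from a hash of the content is only one possible choice, and real servers generate ETags in different ways.

```python
import hashlib

def make_etag(body: bytes) -> str:
    # Assumption: derive the tag from a hash of the content.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body: bytes, if_none_match=None):
    """Return (status, headers, body) for a request for this resource."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The client's cached copy is still current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body

page = b"<html>version 1</html>"

# First visit: no If-None-Match, so the full page and its ETag are returned.
status1, headers1, _ = respond(page)

# Second visit: the browser sends the ETag back and the resource is unchanged.
status2, _, body2 = respond(page, headers1["ETag"])

# The resource changes on the server, so the old ETag no longer matches.
status3, headers3, _ = respond(b"<html>version 2</html>", headers1["ETag"])

print(status1, status2, status3)
```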


 

Pragma

Purpose: prevents the page from being cached. In HTTP/1.1, it has the same effect as Cache-Control: no-cache.

Pragma has only one usage, for example: Pragma: no-cache

Note: HTTP/1.0 implements only Pragma: no-cache and does not implement Cache-Control.

 

Cache-Control

Purpose: this is a very important header. It specifies the caching rules that requests and responses must follow. The directives have the following meanings:

Cache-Control: public means the response can be cached by any cache

Cache-Control: private means the content may only be stored in a private cache

Cache-Control: no-cache means a cached copy must not be used without first revalidating it with the server (to forbid storing the content at all, use no-store)

There are other directives as well; refer to other documentation for their meaning.

 

Client header fields

Accept

Purpose: the media types that the browser can accept.

For example, Accept: text/html means the browser accepts the type text/html, which is the usual HTML document.

If the server cannot return text/html data, it should return a 406 error (Not Acceptable).

The wildcard * stands for any type.

For example, Accept: */* means the browser can handle all types. (This is generally what the browser sends to the server.)

 

Accept-Encoding

Purpose: the browser declares the encodings it can accept. This usually means the compression methods: whether compression is supported and which methods (gzip, deflate) are supported. (Note: this is not the character encoding.)

Example: Accept-Encoding: gzip, deflate

 

Accept-Language

Purpose: the browser declares the languages it accepts.

The difference between a language and a character set: Chinese is a language, and Chinese has several character sets, such as big5, gb2312, and gbk.

Example: Accept-Language: en-us

 

User-Agent

Purpose: tells the HTTP server the name and version of the operating system and browser used by the client.

When we log on to a forum, we often see a welcome message that lists the name and version of our operating system and browser, which surprises many people. In fact, the server application obtains this information from the User-Agent request header, which lets the client tell the server about its operating system, browser, and other attributes.

Example: User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; CBA; .NET CLR 2.0.50727; .NET CLR 3.0.20.6.2152; .NET CLR 3.5.30729; .NET4.0C; InfoPath.2; .NET4.0E)

 

Accept-Charset

Purpose: the browser declares the character sets it accepts, that is, the character sets and character encodings mentioned above, such as gb2312 and utf-8. (When we say charset, we generally include the corresponding character encoding scheme.)

For example:

 

Cookie/login header fields

Cookie

Purpose: sends the cookie values to the HTTP server. This is one of the most important headers.

Entity header fields

Content-Length

Purpose: the length of the data sent to the HTTP server.

Example: Content-Length: 38

 

Content-Type

Purpose: the media type of the data sent to the HTTP server.

Example: Content-Type: application/x-www-form-urlencoded

 

Miscellaneous header fields

Referer

Purpose: provides the server with the context of the request by telling it which link the user came from. For example, if my home page links to a friend's website, his server can use the HTTP Referer header to count how many users visit his website each day by clicking the link on my home page.

Example: Referer: http://translate.google.cn/?hl=ZH-CN&tab=WT

Transport header fields

Connection

For example, Connection: keep-alive means that after a webpage is opened, the TCP connection used to transmit HTTP data between the client and the server is not closed. If the client accesses pages on this server again, it reuses the established connection.

For example, Connection: close means that after a request completes, the TCP connection used to transmit HTTP data between the client and the server is closed. When the client sends another request, a new TCP connection must be established.

 

Host (this header field is required when a request is sent)

Purpose: specifies the Internet host and port number of the requested resource. It is usually extracted from the HTTP URL.

For example, we enter http://www.guet.edu.cn/index.html in the browser.

The request message sent by the browser then contains the Host request header, as follows:

Host: www.guet.edu.cn

The default port number 80 is used here. If a port number is specified, the header becomes: Host: www.guet.edu.cn:port-number
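
The way the Host value is derived from the URL can be sketched like this. The host_header function is a hypothetical helper for illustration; it does not handle every case (for example, IPv6 literal addresses).

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def host_header(url: str) -> str:
    """Build the Host header value for a URL, omitting the default port."""
    parts = urlsplit(url)
    if parts.port is None or parts.port == DEFAULT_PORTS.get(parts.scheme):
        return parts.hostname
    return f"{parts.hostname}:{parts.port}"

print(host_header("http://www.guet.edu.cn/index.html"))       # default port, omitted
print(host_header("http://www.guet.edu.cn:8080/index.html"))  # explicit port is kept
```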

 

HTTP Response Header

Fiddler also makes it easy to view the response headers. Click the Inspectors tab > Response tab > Headers.

We again group the headers the same way Fiddler does, which is clear and easy to remember.

Cache header fields

Date

Purpose: the date and time at which the message was generated.

Example: Date: Sat, 11 Feb 2012 11:35:14 GMT

 

Expires

Purpose: the browser uses its local cache until the specified expiration time.

Example: Expires: Tue, 08 Feb 2022 11:35:14 GMT

 

Vary

Purpose:

Example: Vary: Accept-Encoding

 

Cookie/login header fields

P3P

Purpose: allows cookies to be set across domains, which solves the problem of an iframe accessing cookies across domains.

Example: P3P: CP=CURa ADMa DEVa PSAo PSDo OUR BUS UNI PUR INT DEM STA PRE COM NAV OTC NOI DSP COR

 

Set-Cookie

Purpose: a very important header, used to send a cookie to the client browser. Each cookie that is written generates one Set-Cookie header.

Example: Set-Cookie: SC=4c31523a; Path=/; domain=.acookie.taobao.com

 

Entity header fields

ETag

Purpose: used together with If-None-Match. (See the If-None-Match example in the request header section.)

Example: ETag: "03f2b33c0bfcc0:0"

 

Last-Modified

Purpose: the date and time at which the resource was last modified. (See the If-Modified-Since example in the request header section.)

Example: Last-Modified: Wed, 21 Dec 2011 09:09:10 GMT

 

Content-Type

Purpose: the Web server tells the browser the type and character set of the object in the response.

For example:

Content-Type: text/html; charset=UTF-8

Content-Type: text/html; charset=gb2312

Content-Type: image/jpeg
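
As a small sketch of how a client might use this header, the following Python separates the media type from the charset and uses the charset to decode a body. The parse_content_type helper and the sample text are my own assumptions; email.message.Message is the standard-library class that parses header parameters.

```python
from email.message import Message

def parse_content_type(value: str):
    """Return (media type, charset or None) for a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_content_charset()

html_type, html_charset = parse_content_type("text/html; charset=gb2312")
image_type, image_charset = parse_content_type("image/jpeg")
print(html_type, html_charset)
print(image_type, image_charset)  # images have no charset

# The charset tells the client how to turn the body bytes into text.
body = "中文".encode("gb2312")
text = body.decode(html_charset)
print(text)
```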

 

Content-Length

Purpose: the length of the entity body, as a decimal number of bytes. To send a Content-Length header, the server must have the complete data available (buffered) in advance so that it knows the length before sending the data to the client.

Example: Content-Length: 19847

 

Content-Encoding

Purpose: the Web server indicates which compression method (gzip, deflate) it used to compress the object in the response.

Example: Content-Encoding: gzip
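
The following sketch shows the idea with Python's gzip module: the server compresses the body and labels it, and the client checks the header before decompressing. The sample body and the decode_body helper are made up for illustration.

```python
import gzip

# Server side: compress the body and describe it in the headers.
body = b"<html>" + b"hello " * 200 + b"</html>"
compressed = gzip.compress(body)
response_headers = {
    "Content-Encoding": "gzip",
    "Content-Length": str(len(compressed)),  # length of the compressed data
}

# Client side: undo the compression named in Content-Encoding.
def decode_body(headers, payload):
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(payload)
    return payload

print(len(body), len(compressed))
print(decode_body(response_headers, compressed) == body)
```

Note that Content-Length describes the compressed data actually sent, not the original size.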

 

Content-Language

Purpose: the Web server tells the browser the language of the object in the response.

Example: Content-Language: da

 

Miscellaneous header fields

Server

Purpose: identifies the software of the HTTP server.

Example: Server: Microsoft-IIS/7.5

 

X-AspNet-Version

Purpose: if the website is developed with ASP.NET, this header indicates the ASP.NET version.

Example: X-AspNet-Version: 4.0.30319

X-Powered-By

Purpose: indicates the technology used to develop the website.

Example: X-Powered-By: ASP.NET

Transport header fields

Connection

For example, Connection: keep-alive means that after a webpage is opened, the TCP connection used to transmit HTTP data between the client and the server is not closed. If the client accesses pages on this server again, it reuses the established connection.

For example, Connection: close means that after a request completes, the TCP connection used to transmit HTTP data between the client and the server is closed. When the client sends another request, a new TCP connection must be established.

Location header fields

Location

Purpose: used for redirection to a new location; it contains the new URL.

For an example, see the 302 status code example.

 

The difference between the stateless HTTP protocol and Connection: keep-alive

Stateless means that the protocol has no memory of previous transactions, and the server does not know the state of the client. In other words, opening a webpage on a server has no relationship to the webpages you opened on that server before.

HTTP is a stateless, connection-oriented protocol. Stateless does not mean that HTTP cannot keep a TCP connection open, nor does it mean that HTTP uses UDP (which is connectionless).

Starting with HTTP/1.1, keep-alive is enabled by default to keep the connection open. Put simply, after a webpage is opened, the TCP connection used to transmit HTTP data between the client and the server is not closed. If the client accesses pages on this server again, it reuses the established connection.
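
Connection reuse can be observed with a small local experiment. The sketch below starts a throwaway HTTP/1.1 server on the local machine (the Handler class and the client_ports list are my own, for illustration only) and sends two requests over one client connection. The server records the client's TCP port for each request; if both requests arrive from the same port, they travelled over the same TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

client_ports = []  # TCP source port the server saw for each request

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allows persistent connections

    def do_GET(self):
        client_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        # With keep-alive, the client needs Content-Length to know
        # where this response ends and the next one begins.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
for _ in range(2):
    conn.request("GET", "/")
    conn.getresponse().read()  # drain the body before reusing the connection
conn.close()

server.shutdown()
server.server_close()
print(client_ports)  # two entries with the same port number
```

Even though the connection is reused, the server still treats the two requests as unrelated, which is exactly the difference between keep-alive and state.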

 

 
