HTTP Proxy: the HTTP protocol in detail

Source: Internet
Author: User
Tags: send, cookies, time and date

2014-01-03 23:36 Source: Proxy IP Resource Network

Web development technologies abound today: ASP, PHP, JSP, Perl, AJAX, and so on. However these technologies evolve in the future, it is important to understand the basic protocol that web programs use to communicate, because it shows us how web applications work internally. This article explains the HTTP protocol in detail with examples. It is fairly long, so please be patient; we hope it helps with your development or testing work. The examples rely on the Fiddler tool, which makes it very easy to capture HTTP requests and HTTP responses.

  Contents

    1. What is the HTTP protocol
    2. Web server, browser, proxy server
    3. URL in detail
    4. The HTTP protocol is stateless
    5. Structure of the HTTP message
    6. The difference between the GET and POST methods
    7. Status code
    8. HTTP Request Header
    9. HTTP Response Header
    10. The difference between HTTP being stateless and Connection: keep-alive

What is the HTTP protocol

A protocol is the set of rules that two computers must follow when they communicate over a computer network. The Hypertext Transfer Protocol (HTTP) is the communication protocol used to transfer Hypertext Markup Language (HTML) documents, and other resources, from a web server to a client's browser.

The version in common use at the time of writing is HTTP/1.1.

Web server, browser, proxy server

When we open a browser and enter a URL in the address bar, a web page appears. How does this work?

After we enter the URL, the browser sends a request to the web server. The web server receives the request, processes it, generates the corresponding response, and sends it back to the browser. The browser then parses the HTML in the response, and we see the web page.

Our request may also pass through a proxy server before it finally arrives at the web server.

A proxy server is a relay point for network traffic. What is it used for?

1. Faster access: most proxy servers have a caching function.

2. Bypassing access restrictions, for example to reach sites that are blocked on the local network.

3. Hiding the client's identity.

URL in detail

A URL (Uniform Resource Locator) describes the address of a resource on a network. Its basic format is as follows

scheme://host[:port#]/path/.../[;url-params][?query-string][#anchor]

scheme: the underlying protocol (for example: http, https, ftp)

host: the IP address or domain name of the HTTP server

port#: the port of the HTTP server. The default is 80, in which case the port number can be omitted. If a different port is used, it must be specified, for example http://www.cnblogs.com:8080/

path: the path to the resource

url-params: optional parameters attached to the path after a semicolon

query-string: the data sent to the HTTP server

anchor: the anchor, a position within the page

An example URL

http://www.mywebsite.com/sj/test;id=8079?name=sviergn&x=true#stuff

scheme: http

host: www.mywebsite.com

path: /sj/test

url-params: id=8079

query-string: name=sviergn&x=true

anchor: stuff
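
The same breakdown can be reproduced with Python's standard urllib.parse module. This is only a sketch: the dictionary keys are the names used in this article, not the library's own field names, and the fallback to port 80 is an assumption that applies to plain http URLs.

```python
from urllib.parse import urlparse, parse_qs

url = "http://www.mywebsite.com/sj/test;id=8079?name=sviergn&x=true#stuff"
parts = urlparse(url)

# Map the standard-library field names onto the names used in this article.
components = {
    "scheme": parts.scheme,
    "host": parts.hostname,
    "port": parts.port or 80,      # no port in the URL, so fall back to HTTP's default
    "path": parts.path,
    "url-params": parts.params,
    "query-string": parts.query,
    "anchor": parts.fragment,
}
query = parse_qs(parts.query)      # query string decoded into a dict of lists

for name, value in components.items():
    print(f"{name}: {value}")
print(query)
```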

The HTTP protocol is stateless

The HTTP protocol is stateless: there is no relationship between a client's current request and its previous one, and the HTTP server does not know that the two requests came from the same client. To solve this problem, web applications introduced the cookie mechanism to maintain state.
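
The idea can be sketched without any network code. In the following toy example, the handle_request function, the sessions dictionary, and the cookie name sid are all hypothetical; a real web framework handles this for you. It only illustrates how a cookie lets the server link two otherwise unrelated requests.

```python
from http.cookies import SimpleCookie
import uuid

sessions = {}  # hypothetical in-memory store: session id -> visit count

def handle_request(request_headers):
    """Return (response_headers, body) for one request, using a cookie to recognise the client."""
    cookie = SimpleCookie(request_headers.get("Cookie", ""))
    sid = cookie["sid"].value if "sid" in cookie else None
    response_headers = {}
    if sid not in sessions:
        # Unknown client: create a session and ask the browser to store its id.
        sid = uuid.uuid4().hex
        sessions[sid] = 0
        response_headers["Set-Cookie"] = f"sid={sid}; Path=/"
    sessions[sid] += 1
    return response_headers, f"visit number {sessions[sid]}"

# The first request carries no cookie, so the server cannot tell who is calling.
headers1, body1 = handle_request({})
# The browser echoes the cookie back on the next request, which restores the state.
cookie_value = headers1["Set-Cookie"].split(";")[0]
headers2, body2 = handle_request({"Cookie": cookie_value})
# A request without the cookie looks like a brand-new client again.
headers3, body3 = handle_request({})
print(body1, body2, body3, sep="\n")
```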

Structure of the HTTP message

First, look at the structure of a request message. It has three parts: the first is the request line, the second is the HTTP headers, and the third is the body. A blank line separates the headers from the body. The structure is as follows

Method Path-to-resource HTTP/Version-number
Header-name: header-value
(more headers)
(blank line)
Body

Method in the first line is the request method, such as "POST" or "GET"; Path-to-resource is the requested resource; and HTTP/Version-number is the version number of the HTTP protocol

When the "GET" method is used, the body is normally empty

For example, the request sent when we open the cnblogs (Blog Garden) home page is as follows

GET http://www.cnblogs.com/ HTTP/1.1

Host: www.cnblogs.com

We can use Fiddler to capture a login request to the blog site and then analyze its structure. The Raw view under the Inspectors tab shows the complete request message.
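
The three-part structure can also be taken apart in code. The parse_request function below is a simplified sketch written for this article, not a complete HTTP parser, and the POST request (path /login, host example.com) is a made-up example.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP request into its request line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")         # the blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)   # the request line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, target, version, headers, body

raw_get = b"GET http://www.cnblogs.com/ HTTP/1.1\r\nHost: www.cnblogs.com\r\n\r\n"
method, target, version, headers, body = parse_request(raw_get)
print(method, target, version, headers, body)

# A hypothetical POST shows the third part (the body) being non-empty.
raw_post = (b"POST /login HTTP/1.1\r\nHost: example.com\r\n"
            b"Content-Length: 9\r\n\r\nname=test")
post = parse_request(raw_post)
print(post)
```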

Now let's look at the structure of a response message, which is basically the same as that of a request message. It also has three parts: the first is the status line, the second is the response headers, and the third is the body. A blank line again separates the headers from the body. The structure is as follows

HTTP/Version-number Status-code Message
Header-name: header-value
(more headers)
(blank line)
Body

HTTP/Version-number is the version number of the HTTP protocol. For Status-code and Message, see the detailed explanation in the [Status code] section below.

We can use Fiddler to capture the response for the blog home page and then analyze its structure. The Raw view under the Inspectors tab shows the complete response message.

The difference between the GET and POST methods

The HTTP protocol defines a number of methods for interacting with the server. The four most basic are GET, POST, PUT, and DELETE. A URL describes a resource on a network, and GET, POST, PUT, and DELETE correspond roughly to four operations on that resource: query, update, create, and delete. The most common are GET and POST. GET is typically used to retrieve or query resource information, and POST is typically used to update resource information.

Let's look at the differences between GET and POST

1. Data submitted with GET is placed after the URL: a ? separates the URL from the data, and the parameters are joined with &, as in editposts.aspx?name=test1&id=123456. The POST method places the submitted data in the body of the HTTP message.

2. The amount of data submitted with GET is limited (because browsers limit the length of a URL), while the protocol itself places no limit on data submitted with POST, although servers may impose one.

3. On the server (in ASP.NET), data submitted with GET is read through Request.QueryString, while data submitted with POST is read through Request.Form.

4. Submitting data with GET raises security issues. For example, if a login page submits its data with GET, the user name and password appear in the URL. If the page can be cached, or other people can access the machine, they can obtain the user's account and password from the browser history.
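
The first difference is easy to see by building both kinds of request with Python's urllib. Nothing is sent over the network here; the host www.example.com is a placeholder, and the Request objects are only inspected.

```python
from urllib.parse import urlencode
from urllib.request import Request

data = {"name": "test1", "id": "123456"}
encoded = urlencode(data)   # parameters joined with "&"

# GET: the data travels in the URL, after a "?".
get_request = Request("http://www.example.com/editposts.aspx?" + encoded)

# POST: the same data travels in the message body instead.
post_request = Request("http://www.example.com/editposts.aspx",
                       data=encoded.encode("ascii"))

print(get_request.get_method(), get_request.full_url, get_request.data)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

Supplying a body is what makes urllib choose POST; without one it defaults to GET.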

Status code

The first line of the response message is called the status line. It consists of the HTTP protocol version number, the status code, and the status message.

The status code tells the HTTP client whether the HTTP server produced the expected response.

HTTP/1.1 defines five classes of status codes. A status code consists of three digits, and the first digit defines the class of the response

1XX Informational - the request was received and processing continues

2XX Success - the request was successfully received, understood, and accepted

3XX Redirection - further action is required to complete the request

4XX Client Error - the request contains a syntax error or cannot be fulfilled

5XX Server Error - the server failed to fulfill a valid request
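
Since the class is determined by the first digit, classifying a status line takes only a few lines. The parse_status_line helper and the CLASSES table below are illustrative names chosen for this sketch; http.HTTPStatus is part of Python's standard library.

```python
from http import HTTPStatus

CLASSES = {1: "Informational", 2: "Success", 3: "Redirection",
           4: "Client Error", 5: "Server Error"}

def parse_status_line(line: str):
    """Split a status line into (version, code, message, class name)."""
    version, code, message = line.split(" ", 2)
    code = int(code)
    return version, code, message, CLASSES[code // 100]   # first digit picks the class

for line in ("HTTP/1.1 200 OK", "HTTP/1.1 302 Found", "HTTP/1.1 404 Not Found"):
    print(parse_status_line(line))

# The standard library also knows the usual status messages.
print(HTTPStatus(304).phrase)
```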

Let's take a look at some common status codes

200 OK

The most common one is the success status code 200, which indicates that the request completed successfully and the requested resource was sent back to the client

For example, opening the cnblogs (Blog Garden) home page returns 200

302 Found

Redirect. The new URL is returned in the Location header of the response, and the browser sends a new request to that URL.

For example, enter http://www.google.com in IE. The HTTP server returns 302, and IE takes the new URL from the Location header of the response and sends another request.

304 Not Modified

The document has already been cached since the last request and the cached copy can still be used.

For example, when opening the blog home page, many of the responses have status code 304

Tip: If you don't want to use the local cache, you can force the page to refresh with Ctrl+F5

400 Bad Request

The client's request has a syntax error and cannot be understood by the server

403 Forbidden

The server received the request but refuses to serve it

404 Not Found

The requested resource does not exist (for example, the URL was typed incorrectly)

For example, enter a wrong URL in IE, such as http://www.cnblogs.com/tesdf.aspx

500 Internal Server Error

An unexpected error occurred on the server

503 Service Unavailable

The server is currently unable to process client requests and may recover after some time

HTTP Request Header

With Fiddler you can easily view the request headers: click the Inspectors tab, then the Request tab, then Headers.

There are many headers and they are hard to remember, so we group them the way Fiddler does, which makes them clearer and easier to remember.

  Cache header Field

If-Modified-Since

Role: The browser sends the last-modified time of its cached page to the server, and the server compares it with the last-modified time of the actual file on the server. If the times match, the server returns 304 and the client uses its locally cached file directly. If the times differ, the server returns 200 and the new file content; the client then discards the old file, caches the new one, and displays it in the browser.

For example: If-Modified-Since: Thu, 09:07:57 GMT
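
The server-side decision can be sketched as follows. The conditional_get function and the timestamps are hypothetical, chosen only for illustration. One deliberate difference from the description above: the sketch answers 304 whenever the file is not newer than the time the client sent, rather than only when the two times are identical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def conditional_get(request_headers, last_modified: datetime, content: bytes):
    """Return (status, headers, body) according to the If-Modified-Since header."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        return 304, headers, b""      # the cached copy is still current
    return 200, headers, content      # send the (new) content

# Hypothetical modification times, chosen only for this example.
modified = datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
later = datetime(2020, 1, 2, 12, 0, 0, tzinfo=timezone.utc)

first = conditional_get({}, modified, b"<html>v1</html>")
cached = {"If-Modified-Since": first[1]["Last-Modified"]}
second = conditional_get(cached, modified, b"<html>v1</html>")   # file unchanged
third = conditional_get(cached, later, b"<html>v2</html>")       # file was edited
print(first[0], second[0], third[0])
```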

If-None-Match

Role: If-None-Match works together with ETag. The server adds ETag information to the HTTP response. When the user requests the resource again, the browser adds an If-None-Match header (carrying the ETag value) to the HTTP request. If the server verifies that the ETag of the resource has not changed (the resource has not been updated), it returns status 304, which tells the client to use its locally cached file. Otherwise it returns status 200 with the new resource and its ETag. This mechanism can improve the performance of a website

Example: If-None-Match: "03f2b33c0bfcc1:0"
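
A sketch of the same exchange in code. The make_etag and respond_with_etag functions are hypothetical, and deriving the ETag from a hash of the content is an assumption made here: servers are free to compute ETags however they like, and the example value above was clearly produced differently.

```python
import hashlib

def make_etag(content: bytes) -> str:
    """Derive a quoted ETag from the content (hashing is an assumption of this sketch)."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'

def respond_with_etag(request_headers, content: bytes):
    """Return (status, headers, body), answering 304 when the client's ETag still matches."""
    etag = make_etag(content)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, content

first = respond_with_etag({}, b"<html>v1</html>")
revisit = {"If-None-Match": first[1]["ETag"]}
second = respond_with_etag(revisit, b"<html>v1</html>")   # resource unchanged
third = respond_with_etag(revisit, b"<html>v2</html>")    # resource updated
print(first[0], second[0], third[0])
```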

Pragma

Role: Prevents the page from being cached. In HTTP/1.1 it has the same effect as Cache-Control: no-cache

Pragma has only one usage, for example: Pragma: no-cache

Note: HTTP/1.0 implements only Pragma: no-cache; it does not implement Cache-Control

Cache-Control

Role: This is a very important header. It specifies the caching rules that requests and responses follow. The directives have the following meanings

Cache-Control: public - the response may be cached by any cache

Cache-Control: private - the content may be stored only in a private cache

Cache-Control: no-cache - a cached copy must be revalidated with the server before each use (to forbid caching entirely, use no-store)

There are other directives as well; please refer to other references for their meanings

  Client header Field

Accept

Role: The media types that the browser can accept.

For example: Accept: text/html means the browser can accept a server response of type text/html, which is what we usually call an HTML document.

If the server cannot return data of type text/html, it should return a 406 error (Not Acceptable)

The wildcard * stands for any type

For example, Accept: */* means the browser can handle all types (this is what browsers generally send to the server)

Accept-Encoding

Role: The browser declares the encodings it can accept, usually compression methods: whether compression is supported and which methods are supported (gzip, deflate). Note: this is not a character encoding.

Example: Accept-Encoding: gzip, deflate
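
How a server might act on this header can be sketched as follows. The encode_body function is a hypothetical helper, and it is simplified on purpose: it only checks whether gzip is listed and ignores quality values such as gzip;q=0. The Content-Encoding response header it sets is described in the response header section.

```python
import gzip

def encode_body(request_headers, body: bytes):
    """Compress the body with gzip only if the client's Accept-Encoding lists it."""
    accepted = [token.split(";")[0].strip().lower()
                for token in request_headers.get("Accept-Encoding", "").split(",")]
    if "gzip" in accepted:
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body   # client did not offer gzip: send the body as it is

page = b"<html>" + b"hello " * 200 + b"</html>"
gz_headers, gz_body = encode_body({"Accept-Encoding": "gzip, deflate"}, page)
plain_headers, plain_body = encode_body({}, page)
print(len(page), len(gz_body), gz_headers, plain_headers)
```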

Accept-Language

Role: The browser declares the languages it accepts.

The difference between a language and a character set: Chinese is a language, and Chinese has several character sets, such as Big5, GB2312, and GBK.

Example: Accept-Language: en-us

User-Agent

Role: Tells the HTTP server the name and version of the operating system and browser that the client is using.

When we visit a forum, we often see a welcome message that lists the name and version of our operating system and of the browser we are using, which many people find quite magical. In fact, the server application obtains this information from the User-Agent request header, which lets the client tell the server about its operating system, browser, and other properties.

For example: User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; CIBA; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; InfoPath.2; .NET4.0E)

Accept-Charset

Role: The browser declares the character sets it accepts, that is, character sets and encodings such as gb2312 and utf-8 (in common usage, "charset" also covers the corresponding character encoding scheme).

For example:

  Cookie/login header Field

Cookie

Role: One of the most important headers. It sends the value of the cookie to the HTTP server

  Entity header Field

Content-Length

Role: The length of the data sent to the HTTP server.

Example: Content-Length: 38

Content-Type

Role:

Example: Content-Type: application/x-www-form-urlencoded

  Miscellaneous header Field

Referer

Role: Provides context for the request by telling the server which page the request was linked from. For example, if my home page links to a friend's site, his server can use the HTTP Referer to count how many users reach his website each day by clicking the link on my page.

Example: Referer: http://translate.google.cn/?HL=ZH-CN&TAB=WT

  Transport header Field

Connection

Example: Connection: keep-alive means that after a web page is opened, the TCP connection used to carry HTTP data between the client and the server is not closed. If the client accesses the server again, it reuses the established connection

For example: Connection: close means that once a request completes, the TCP connection used to carry HTTP data between the client and the server is closed, and a new TCP connection must be established when the client sends another request.

Host (this header field is required when a request is sent)

Role: Specifies the Internet host and port number of the requested resource. It is usually extracted from the HTTP URL

For example, suppose we enter http://www.guet.edu.cn/index.html in the browser

The request message sent by the browser then contains the Host request header, as follows:

Host: www.guet.edu.cn

The default port number 80 is used here. If a port number is specified, the header becomes Host: www.guet.edu.cn:<port number>
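
Deriving the Host value from a URL can be sketched like this. The host_header function and the DEFAULT_PORTS table are names made up for the example; the rule it applies is simply to omit the port when it is the default for the scheme.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def host_header(url: str) -> str:
    """Build the Host header value for a URL, omitting the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is None or parts.port == DEFAULT_PORTS.get(parts.scheme):
        return parts.hostname
    return f"{parts.hostname}:{parts.port}"

print(host_header("http://www.guet.edu.cn/index.html"))
print(host_header("http://www.cnblogs.com:8080/"))
```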

HTTP Response Header

Fiddler can likewise be used to view the response headers: click the Inspectors tab, then the Response tab, then Headers.

We again group the headers the way Fiddler does, which makes them clearer and easier to remember.

  Cache header Field

Date

Role: The exact date and time at which the message was generated

Example: Date: Sat, 11:35:14 GMT

Expires

Role: The browser uses its local cache until the specified expiration time

For example: Expires: Tue, 2022 11:35:14 GMT

Vary

Role:

Example: Vary: Accept-Encoding

  Cookie/login header Field

P3P

Role: Used to set cookies across domains. It solves the problem of an iframe accessing cookies across domains

Example: P3P: CP=CURa ADMa DEVa PSAo PSDo OUR BUS UNI PUR INT DEM STA PRE COM NAV OTC NOI DSP COR

Set-Cookie

Role: A very important header, used to send cookies to the client browser. Each cookie that is written generates one Set-Cookie header.

For example: Set-Cookie: sc=4c31523a; path=/; domain=.acookie.taobao.com
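
Python's standard http.cookies module can take such a header value apart. The sketch below parses the example above and then builds the Cookie request header a browser would send back; real browsers also check the path and domain attributes before sending the cookie, which is not modelled here.

```python
from http.cookies import SimpleCookie

jar = SimpleCookie()
jar.load("sc=4c31523a; path=/; domain=.acookie.taobao.com")

morsel = jar["sc"]
print(morsel.key, morsel.value, morsel["path"], morsel["domain"])

# Only the name=value pair is sent back to the server on later requests.
cookie_header = f"Cookie: {morsel.key}={morsel.value}"
print(cookie_header)
```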

  Entity header Field

ETag

Role: Used together with If-None-Match. (See the If-None-Match example in the request header section)

For example: ETag: "03f2b33c0bfcc1:0"

Last-Modified

Role: Indicates the date and time at which the resource was last modified. (See the If-Modified-Since example in the request header section)

Example: Last-Modified: Wed, Dec 09:09:10 GMT

Content-type

Role: The Web server tells the browser the type and character set of the object it responds to.

For example:

Content-Type: text/html; charset=utf-8

Content-Type: text/html; charset=gb2312

Content-Type: image/jpeg
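
A client needs both parts of this header: the media type decides how the body is handled, and the charset decides how text is decoded. One way to split them, sketched here with the standard email.message module (the parse_content_type wrapper is a hypothetical helper):

```python
from email.message import Message

def parse_content_type(value: str):
    """Return (media type, charset or None) from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_content_charset()

for value in ("text/html; charset=utf-8", "text/html;charset=gb2312", "image/jpeg"):
    print(parse_content_type(value))
```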

Content-length

Role: Indicates the length of the entity body as a decimal number of bytes. When a response is sent with Content-Length, the server first buffers all of the data and then sends it to the client in one go.

Example: Content-Length: 19847

Content-Encoding

Role: The web server indicates which compression method (gzip, deflate) it used to compress the object in the response.

Example: Content-Encoding: gzip

Content-Language

Role: The web server tells the browser the language of the object in the response

Example: Content-Language: da

  Miscellaneous header Field

Server

Role: Indicates the software used by the HTTP server

Example: Server: Microsoft-IIS/7.5

X-AspNet-Version

Role: If the website is developed with ASP.NET, this header indicates the ASP.NET version.

Example: X-AspNet-Version: 4.0.30319

X-Powered-By

Role: Indicates what technology the site is developed with

Example: X-Powered-By: ASP.NET

  Transport header Field

Connection

Example: Connection: keep-alive means that after a web page is opened, the TCP connection used to carry HTTP data between the client and the server is not closed. If the client accesses the server again, it reuses the established connection

For example: Connection: close means that once a request completes, the TCP connection used to carry HTTP data between the client and the server is closed, and a new TCP connection must be established when the client sends another request.

  Location Header Field

Location

Role: Used to redirect to a new location. It contains the new URL

For an example, see the 302 status code above

The difference between HTTP being stateless and Connection: keep-alive

Stateless means that the protocol has no memory of earlier transactions: the server does not know what state the client is in. In other words, opening a web page on a server has no connection to the pages you previously opened on that server.

HTTP is a stateless, connection-oriented protocol. Stateless does not mean that HTTP cannot keep a TCP connection open, nor does it mean that HTTP uses the UDP protocol (which is connectionless).

Since HTTP/1.1, keep-alive is enabled by default to keep connections open. In short, after a web page is opened, the TCP connection used to carry HTTP data between the client and the server is not closed, and if the client accesses the server again it reuses the established connection.

Keep-alive does not hold the connection open forever. It has a timeout, which can be configured in server software such as Apache.
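
Connection reuse can be observed with a small local experiment. The sketch below starts a throwaway server on 127.0.0.1 using Python's standard http.server module; the Handler class and the fetch helper are written for this example only. The server replies with the client's TCP port number, so two requests that arrive over the same connection report the same port. Note that the server still keeps no state between the two requests: only the TCP connection is shared.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"    # HTTP/1.1 keeps connections open by default

    def do_GET(self):
        # Reply with the client's TCP port so the caller can tell connections apart.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass                         # keep the example quiet

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)   # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

def fetch(conn):
    conn.request("GET", "/")
    return conn.getresponse().read().decode("ascii")

# Two requests on one connection object: the same TCP connection is reused.
conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
reused = (fetch(conn), fetch(conn))

# Two connections open at the same time: each has its own client port.
conn_a = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn_b = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
separate = (fetch(conn_a), fetch(conn_b))

for c in (conn, conn_a, conn_b):
    c.close()
server.shutdown()
server.server_close()
print(reused, separate)
```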
