Overview of basic HTTP protocol knowledge

Source: Internet
Author: User

HTTP (HyperText Transfer Protocol) is the application-layer protocol at the core of Web applications. It is implemented by two programs, a client program and a server program, which run on different end systems and communicate by exchanging HTTP messages.

A Web page (also called a document) consists of objects. An object is simply a file, such as an HTML file or a JPEG image, that can be addressed by a single URL. Most Web pages consist of a base HTML file and several referenced objects. If a page contains an HTML file and five JPEG images, the page has six objects, five of which are referenced from the base HTML file by their URLs. The Web browser implements the client side of HTTP. The Web server implements the server side and stores the Web objects.

HTTP uses TCP as its transport-layer protocol. The HTTP client first initiates a TCP connection to the server. Once the connection is established, the browser and server processes access TCP through their socket interfaces. The client-side socket is the door between the client process and the TCP connection, and the server-side socket is the door between the server process and the TCP connection. The client sends HTTP request messages into its socket and receives HTTP response messages from it. Likewise, the server receives HTTP request messages from its socket and sends HTTP response messages into it. When the server sends a requested file to a client, it does not store any information about that client. If a particular client requests the same object twice within a few seconds, the server does not reply that it has just served the object; it sends the object again, as if it had completely forgotten what it did earlier. HTTP is therefore called a stateless protocol.
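The "socket as a door" idea and the stateless behavior can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `socket.socketpair()` stands in for a real TCP connection, and `OBJECTS`, `serve_once`, and `fetch` are hypothetical names that do not come from the original notes.

```python
import socket

# Hypothetical object store standing in for the files on a Web server.
OBJECTS = {"/index.html": b"<html><body>Hello</body></html>"}


def serve_once(conn):
    """Read one HTTP request from the socket and answer it.

    Nothing about the client is remembered once this function returns.
    """
    request = b""
    while b"\r\n\r\n" not in request:  # the headers end with an empty line
        chunk = conn.recv(1024)
        if not chunk:
            return
        request += chunk
    request_line = request.split(b"\r\n", 1)[0].decode("ascii")
    _method, path, _version = request_line.split(" ")
    body = OBJECTS.get(path)
    if body is None:
        conn.sendall(b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n")
    else:
        head = "HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)
        conn.sendall(head.encode("ascii") + body)


def fetch(path):
    """Send one GET through a fresh socket pair and return the raw response."""
    client, server = socket.socketpair()  # stand-in for a TCP connection
    with client, server:
        request = "GET %s HTTP/1.1\r\nHost: www.somesite.com\r\n\r\n" % path
        client.sendall(request.encode("ascii"))
        serve_once(server)
        server.shutdown(socket.SHUT_WR)  # tell the client the response is complete
        response = b""
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            response += chunk
    return response


first = fetch("/index.html")
second = fetch("/index.html")
print(first == second)  # the object is simply sent again
```

Both calls return the same bytes because the toy server keeps no record of the first request.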

Persistent connection: all requests from a client and all responses from the server are sent over the same TCP connection. Non-persistent connection: the client and server establish a new TCP connection for each request/response pair.

1. Non-persistent connections

The following steps transfer a Web page from the server to the client. Assume that the page consists of a base HTML file and 10 JPEG images, and that all 11 objects reside on the same server. The URL of the base HTML file is:

http://www.somesite.com/someresource/index.aspx

1) The HTTP client process initiates a TCP connection to www.somesite.com on port 80, which is the default port of HTTP. There is a socket associated with the connection on the client and the server respectively.

2) The HTTP client sends an HTTP request message to the server through its socket. The request message contains the path name /someresource/index.aspx.

3) The HTTP server process receives the request message through its socket, retrieves the object /someresource/index.aspx from its storage, encapsulates the object in an HTTP response message, and sends the response message to the client through its socket.

4) The HTTP server process tells TCP to close the TCP connection. (However, TCP does not actually terminate the connection until it knows that the client has received the response message intact.)

5) The HTTP client receives the response message, and the TCP connection terminates. The message encapsulates an HTML file. The client extracts the file from the response message, examines it, and finds references to the 10 JPEG images.

6) The first four steps are then repeated for each referenced JPEG object.

Each TCP connection established between the browser and the Web server involves a TCP three-way handshake, whose cost is measured in round-trip times (RTT): the time it takes a small packet to travel from the client to the server and back to the client. In the three-way handshake, the client sends a small TCP segment to the server, the server acknowledges and responds with a small TCP segment, and finally the client returns an acknowledgment to the server. The first two parts of the handshake take one RTT. The client then sends the HTTP request message into the TCP connection together with the third part of the handshake (the client's acknowledgment), and the response takes another RTT to arrive. The total response time for one object is therefore roughly two RTTs plus the time the server needs to transmit the file.

Persistent connection:

With persistent connections, the server leaves the TCP connection open after sending a response. Subsequent requests and responses between the same client and server can be sent over the same connection. An entire Web page (such as a base HTML file with 10 images) can be transferred over a single persistent TCP connection. If a connection has not been used for a certain time (a configurable timeout interval), the HTTP server closes it.
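To make the difference concrete, here is a rough back-of-the-envelope comparison in Python. It is a sketch under simplifying assumptions that are not in the original notes: objects are fetched one after another (no parallel connections, no pipelining), every object takes the same time to transmit, and TCP slow start is ignored. The function names and the numbers (100 ms RTT, 10 ms per object) are hypothetical.

```python
def non_persistent_time(num_objects, rtt, transmit):
    """Serial non-persistent connections.

    Each object costs one RTT for the TCP handshake, one RTT for the
    request/response exchange, and its own transmission time.
    """
    return num_objects * (2 * rtt + transmit)


def persistent_time(num_objects, rtt, transmit):
    """One persistent connection without pipelining.

    The handshake RTT is paid once; each object then costs one RTT
    plus its transmission time.
    """
    return rtt + num_objects * (rtt + transmit)


# Assumed values in milliseconds: base HTML file plus 10 JPEG images.
NUM_OBJECTS = 11
RTT_MS = 100
TRANSMIT_MS = 10

print(non_persistent_time(NUM_OBJECTS, RTT_MS, TRANSMIT_MS))  # 2310
print(persistent_time(NUM_OBJECTS, RTT_MS, TRANSMIT_MS))      # 1310
```

Under these assumptions the persistent connection saves one RTT per object after the first, which is where the 1000 ms difference comes from.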

2. HTTP message format

To examine the format of HTTP messages, I created a throwaway project in VS2010 and captured the messages with tcpTrace.

The content of WebForm1.aspx is as follows:

<div>
Test Page 1:
<asp:Button ID="test" runat="server" Text="Test" onclick="test_Click" />
<asp:Image ID="img" runat="server" ImageUrl="~/Tulips.jpg" />
</div>

When you request WebForm1.aspx, the request headers are as follows:

GET /WebForm1.aspx HTTP/1.1
Accept: application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, */*
Referer: http://localhost:8080/
Accept-Language: zh-CN
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.3)
Accept-Encoding: gzip, deflate
Host: localhost:8080
Connection: Keep-Alive

"GET /WebForm1.aspx HTTP/1.1" is called the request line. Its format is: method field, URL field, HTTP version field. The methods include GET, POST, HEAD, PUT, and DELETE.

Accept: the media types the client can accept.

Referer: the URL of the page that linked to the requested resource.

Accept-Language: the languages the client prefers.

User-Agent: identifies the browser that sent the request (Microsoft IE also reports Mozilla/4.0).

Accept-Encoding: the content encodings (compression methods) the client can handle.

Host: the host name and port number of the server being requested.

Connection: whether the connection should be closed or kept alive after the response.
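The structure described above (a request line followed by "Name: value" header lines) can be pulled apart with a few string operations. The following Python sketch is only an illustration: `parse_request` is a hypothetical helper, and the sample request is a shortened version of the capture above.

```python
def parse_request(raw):
    """Split a raw HTTP request head into (method, url, version, headers)."""
    head = raw.split("\r\n\r\n", 1)[0]  # everything before the empty line
    lines = head.split("\r\n")
    # Request line: method field, URL field, HTTP version field.
    method, url, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so "localhost:8080" stays intact.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers


RAW_REQUEST = (
    "GET /WebForm1.aspx HTTP/1.1\r\n"
    "Accept-Language: zh-CN\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "Host: localhost:8080\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)

method, url, version, headers = parse_request(RAW_REQUEST)
print(method, url, version)   # GET /WebForm1.aspx HTTP/1.1
print(headers["host"])        # localhost:8080
print(headers["connection"])  # Keep-Alive
```

Header names are lowercased before storing them because HTTP header names are case-insensitive.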

Response headers:

HTTP/1.1 200 OK
Server: ASP.NET Development Server/10.0.0.0
Date: Sat, 14 May 2011 06:43:17 GMT
X-AspNet-Version: 4.0.30319
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Content-Length: 812
Connection: Close

"HTTP/1.1 200 OK" means the request succeeded. Some common status codes include:

100 Continue: the initial part of the request has been received, and the client may continue with the request.

201 Created: a new resource has been created.

301 Moved Permanently: the requested object has been moved permanently. The new URL is given in the Location: header of the response message, and the client software automatically retrieves the object from the new URL.

302 Found (formerly Moved Temporarily): the requested URL has been moved temporarily.

400 Bad Request: the request contains a syntax error.

401 Unauthorized: the request lacks valid authentication credentials.

403 Forbidden: the server refuses to fulfill the request.

404 Not Found: the requested document was not found.

500 Internal Server Error: the server encountered an internal error.

503 Service Unavailable: the service is temporarily unavailable.
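The first digit of a status code gives its class. As a small illustration, the Python standard library's `http.HTTPStatus` enum can be used to look up the reason phrase for each code listed above. The `status_class` helper and its labels are hypothetical additions for this sketch.

```python
from http import HTTPStatus


def status_class(code):
    """Return the class of a status code based on its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes[code // 100]


for code in (100, 201, 301, 302, 400, 401, 403, 404, 500, 503):
    # HTTPStatus(code).phrase is the standard reason phrase for the code.
    print(code, HTTPStatus(code).phrase, "-", status_class(code))
```

Note that the library reports 302 as "Found", the name used in current HTTP specifications.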

Server: the name and version of the server software that generated the response.

Date: the date and time at which the server generated and sent the response.

Cache-Control: caching directives for the response.

Content-Type: the media type of the response body.

Content-Length: the number of bytes in the response body.
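A client typically reads these headers to decide how to handle the body. The sketch below, an illustration rather than a full HTTP parser, splits off the status line by hand and then hands the header block to Python's `email` package, which understands the same "Name: value" format. `parse_response_head` is a hypothetical helper, and the sample text is taken from the capture above.

```python
from email import message_from_string


def parse_response_head(raw):
    """Return (version, status code, reason phrase, headers) for a response head."""
    status_line, _, header_block = raw.partition("\r\n")
    version, code, reason = status_line.split(" ", 2)
    # The email parser gives case-insensitive access to the header fields.
    headers = message_from_string(header_block)
    return version, int(code), reason, headers


RAW_RESPONSE_HEAD = (
    "HTTP/1.1 200 OK\r\n"
    "Server: ASP.NET Development Server/10.0.0.0\r\n"
    "Date: Sat, 14 May 2011 06:43:17 GMT\r\n"
    "Cache-Control: private\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "Content-Length: 812\r\n"
    "Connection: Close\r\n"
    "\r\n"
)

version, code, reason, headers = parse_response_head(RAW_RESPONSE_HEAD)
print(code, reason)                   # 200 OK
print(headers.get_content_type())     # text/html
print(headers.get_content_charset())  # utf-8
print(int(headers["content-length"])) # 812
```

The Content-Length value tells the client how many body bytes to read from the socket after the empty line that ends the headers.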

Because the test page contains an image, the browser needs a second request, and since the server answered with Connection: Close, a new TCP connection as well. The HTTP request headers for the image are as follows:

GET /Tulips.jpg HTTP/1.1
Accept: */*
Referer: http://localhost:8080/WebForm1.aspx
Accept-Language: zh-CN
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.3)
Accept-Encoding: gzip, deflate
Host: localhost:8080
Connection: Keep-Alive

Response headers:

HTTP/1.1 200 OK
Server: ASP.NET Development Server/10.0.0.0
Date: Sat, 14 May 2011 06:43:17 GMT
X-AspNet-Version: 4.0.30319
Cache-Control: private
Content-Type: image/jpeg
Content-Length: 620888
Connection: Close

3. User-server interaction: cookies

HTTP servers are stateless. This simplifies server design and lets engineers build high-performance Web servers that handle thousands of simultaneous TCP connections. However, a Web site often wants to identify users, either to restrict access or to track user behavior for business purposes. HTTP uses cookies for this. When a user first visits a Web server, the server sets a cookie value and sends it to the browser in a Set-Cookie header of the response. The browser then appends a line to the special cookie file it manages, containing the host name of the server and the value from the Set-Cookie header. When the user visits the server again, the browser looks up the cookie value for that server and sends it in a Cookie header of the request. The test site and tcpTrace can be used to view HTTP headers that contain cookies.

Add the following code-behind handler for the button:

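protected void test_Click(object sender, EventArgs e)
{
    // Store the user name in a cookie on the response.
    HttpCookie cookie = new HttpCookie("UserName");
    cookie.Value = "Test";
    Response.Cookies.Add(cookie);
    // Redirect the browser to page 2; the Set-Cookie header travels with the redirect.
    Response.Redirect("WebForm2.aspx");
}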

Clicking the button writes the cookie and redirects to page 2, which reads the cookie value:

protected void Page_Load(object sender, EventArgs e)
{
    // Echo the cookie value that the browser sent back in the Cookie header.
    Response.Write(string.Format("UserName: {0}", Request.Cookies["UserName"].Value));
}

After you click the Test button on page 1, the server's redirect response carries these headers:

Server: ASP.NET Development Server/10.0.0.0
Date: Sat, 14 May 2011 07:31:35 GMT
X-AspNet-Version: 4.0.30319
Location: /WebForm2.aspx
Set-Cookie: UserName=Test; path=/
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Content-Length: 131
Connection: Close

The Set-Cookie header appears in the response.

HTTP request headers for page 2:

GET /WebForm2.aspx HTTP/1.1
Accept: application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, */*
Referer: http://localhost:8080/WebForm1.aspx
Accept-Language: zh-CN
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.3)
Cookie: UserName=Test
Accept-Encoding: gzip, deflate
Host: localhost:8080
Connection: Keep-Alive
Cache-Control: no-cache

The request headers now contain "Cookie: UserName=Test".
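The round trip from Set-Cookie to Cookie can be mimicked with Python's standard `http.cookies` module. This is only a sketch of what a browser does: the `jar` variable is a hypothetical in-memory stand-in for the browser's cookie file, and real browsers also check the host, path, and expiry before sending a cookie back.

```python
from http.cookies import SimpleCookie

# Value of the Set-Cookie header received from the server.
set_cookie_value = "UserName=Test; path=/"

# Hypothetical in-memory stand-in for the browser's cookie file.
jar = SimpleCookie()
jar.load(set_cookie_value)

morsel = jar["UserName"]
print(morsel.value)    # Test
print(morsel["path"])  # /

# On the next request, only the name=value pairs are sent back;
# attributes such as path stay on the client.
cookie_header = "Cookie: " + "; ".join(
    "%s=%s" % (name, m.value) for name, m in jar.items()
)
print(cookie_header)   # Cookie: UserName=Test
```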

This article is my reading notes on "Computer Networking: A Top-Down Approach". I find the authors' explanations vivid and easy to understand, so I wrote these notes for later reference.
