In-depth understanding of the HTTP protocol (I): Basic concepts

Source: Internet
Author: User

1. Introduction

HTTP is short for HyperText Transfer Protocol. It was developed through cooperation between the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF), which published a series of RFCs. RFC 1945 defines HTTP/1.0. The best known is RFC 2616, which defines HTTP/1.1, the version in common use. (RFC 2616 has since been superseded by later RFCs, most recently RFC 9110 and RFC 9112.)

HTTP is the protocol used to transfer hypertext from a WWW server to a local browser. It is designed to make browsing efficient and to reduce network traffic. Besides ensuring that hypertext documents are transferred correctly and quickly, it determines which part of a document is transferred and which content is displayed first (for example, text before graphics).

HTTP is an application-layer protocol built on requests and responses. It follows the standard client-server model. HTTP is a stateless protocol.

2. Location in the TCP/IP protocol stack

HTTP is usually carried directly over TCP. Sometimes it is carried over a TLS or SSL layer that itself runs on top of TCP, in which case we call it HTTPS. From top to bottom the stack is HTTP, then TLS or SSL (for HTTPS only), then TCP, then IP.

The default HTTP port number is 80, and the HTTPS port number is 443.
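
As a small illustration of the default ports, the following Python sketch picks the port a client would connect to for a given URL. The function name effective_port and the example.com URLs are assumptions made for this example; the constants come from the standard http.client module.

```python
import http.client
from urllib.parse import urlsplit


def effective_port(url):
    """Return the explicit port in the URL, or the default for its scheme."""
    parts = urlsplit(url)
    if parts.port is not None:
        # The URL names a port explicitly, e.g. http://example.com:8080/
        return parts.port
    defaults = {
        "http": http.client.HTTP_PORT,    # 80
        "https": http.client.HTTPS_PORT,  # 443
    }
    return defaults[parts.scheme]


print(effective_port("http://example.com/"))
print(effective_port("https://example.com/"))
print(effective_port("http://example.com:8080/index.html"))
```

Only the http and https schemes are handled here; any other scheme raises KeyError.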

3. HTTP request-response model

In HTTP, the client always initiates the request and the server returns the response.

This restricts how HTTP can be used: the server cannot push a message to the client unless the client has first sent a request.

HTTP is a stateless protocol: a request from a client has no relationship to the previous request from the same client.

4. Workflow

An HTTP operation is called a transaction. The procedure can be divided into four steps:

1) First, the client and the server establish a connection. Clicking a hyperlink is enough to set HTTP to work.

2) After the connection is established, the client sends a request to the server. The request consists of the request method, the uniform resource identifier (URL), and the protocol version number, followed by MIME-like information that includes request modifiers, client information, and possibly content.

3) After receiving the request, the server returns a response. The response consists of a status line, which includes the protocol version number and a success or error code, followed by MIME-like information that includes server information, entity information, and possibly content.

4) The client receives the information returned by the server and displays it on the user's screen through the browser. The client then disconnects from the server.

If an error occurs in any of these steps, an error message is returned to the client and displayed. For the user, all of these steps are handled by HTTP: the user only needs to click and wait for the information to appear.
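
The four steps above can be sketched in Python. This is a minimal illustration, not how a browser is implemented: it starts a throwaway server on the loopback address (HelloHandler and run_transaction are names invented for this example), then performs one transaction over a raw socket so that each step is visible.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a tiny text body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def run_transaction():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Step 1: establish the connection.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
            # Step 2: send the request line followed by header fields.
            request = (
                "GET / HTTP/1.1\r\n"
                f"Host: 127.0.0.1:{port}\r\n"
                "Connection: close\r\n"
                "\r\n"
            )
            sock.sendall(request.encode("ascii"))
            # Step 3: read the response until the server closes the connection.
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        # Step 4: leaving the with-block disconnects the client.
    finally:
        server.shutdown()
        server.server_close()
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n")[0].decode("ascii")
    return status_line, body


status_line, body = run_transaction()
print(status_line)
print(body)
```

The status line printed by this sketch reports the protocol version that the standard library server uses in its replies, followed by the status code and reason phrase.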

5. Use Wireshark to capture TCP and HTTP packets

Open Wireshark and select "Capture" -> "Options" on the toolbar to open the capture options dialog.

Generally, you only need to choose the appropriate device in the drop-down box at the top and then click "Capture Filter". Here, "HTTP TCP port (80)" is selected. Click Start to begin capturing packets.

For example, open http://image.baidu.com/ in the browser. The captured packets are shown in Figure 3:

http://www.blogjava.net/images/blogjava_net/amigoxie/40799/o_http%e5%8d%8f%e8%ae%ae%e5%ad%a6%e4%b9%a0-%e6%a6%82%e5%bf%b5-3.jpg

In the capture, you can clearly see the interaction between the client browser (IP address 192.168.2.33) and the server:

1) No. 1: the browser (192.168.2.33) sends a connection request to the server (220.181.50.118). This is the first step of the TCP three-way handshake. The packet is a SYN with seq: x (x = 0).

2) No. 2: the server (220.181.50.118) responds to the request from the browser (192.168.2.33) and asks for confirmation. The packet is a SYN, ACK with seq: y (y = 0) and ack: x + 1 (1). This is the second step of the three-way handshake.

3) No. 3: the browser (192.168.2.33) acknowledges the response from the server (220.181.50.118), and the connection is established. The packet is an ACK with seq: x + 1 (1) and ack: y + 1 (1). This is the third step of the three-way handshake.

4) No. 4: the browser (192.168.2.33) sends the HTTP request for the page.

5) No. 5: the server (220.181.50.118) acknowledges.

6) No. 6: the server (220.181.50.118) sends data.

7) No. 7: the client browser (192.168.2.33) acknowledges.

8) No. 14: the client (192.168.2.33) sends the HTTP request for an image.

9) No. 15: the server (220.181.50.118) sends the status response 200 OK.

......
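
The sequence and acknowledgment numbers in the first three packets follow a fixed pattern, which the Python sketch below reproduces. The function name three_way_handshake is an assumption made for this example. Wireshark shows relative sequence numbers by default, which is why x and y both appear as 0 in the capture; real initial sequence numbers are 32-bit values, so the arithmetic wraps modulo 2**32.

```python
def three_way_handshake(x, y):
    """Return (flags, seq, ack) for the three handshake segments.

    x is the client's initial sequence number and y is the server's.
    ack is None where the ACK flag is not set.
    """
    wrap = 2 ** 32  # TCP sequence numbers are 32 bits wide
    return [
        ("SYN", x, None),                          # client -> server
        ("SYN, ACK", y, (x + 1) % wrap),           # server -> client
        ("ACK", (x + 1) % wrap, (y + 1) % wrap),   # client -> server
    ]


for flags, seq, ack in three_way_handshake(0, 0):
    print(flags, "seq:", seq, "ack:", ack)
```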

6. Header fields

Each header field consists of a field name, a colon (:), and a field value. Field names are case-insensitive. Any number of space characters may precede the field value. A header field can be extended over multiple lines by beginning each additional line with at least one space or tab character.
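
The rules above can be sketched as a small Python parser. The name parse_header_block is an assumption made for this example, and the sketch keeps only the last value when a field name repeats. Note that the multi-line (folded) form is deprecated in later revisions of the HTTP specification, so it is handled here only because the text describes it.

```python
def parse_header_block(raw):
    """Parse CRLF-separated header lines into a dict keyed by lowercase name."""
    headers = {}
    last_name = None
    for line in raw.split("\r\n"):
        if not line:
            break  # an empty line ends the header block
        if line[0] in " \t" and last_name is not None:
            # Continuation line: it belongs to the previous field's value.
            headers[last_name] += " " + line.strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("header line has no colon: %r" % line)
        last_name = name.strip().lower()   # field names are case-insensitive
        headers[last_name] = value.strip() # leading spaces are not significant
    return headers


raw = "HOST:   example.com\r\nUser-Agent: demo\r\n  client\r\n\r\n"
print(parse_header_block(raw))
```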

In the packet capture, the request message at No. 14 is shown in Figure 4:

http://www.blogjava.net/images/blogjava_net/amigoxie/40799/o_http%e5%8d%8f%e8%ae%ae%e5%ad%a6%e4%b9%a0-%e6%a6%82%e5%bf%b5-4.jpg

(Figure 5: the response message.)

6.1 Host header field

The Host header field specifies the Internet host and port number of the requested resource. It must represent the location of the origin server or gateway given by the requested URL. An HTTP/1.1 request must contain a Host header field; otherwise, the server returns status code 400.
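
A server-side check for this rule might look like the following Python sketch. The function name host_check_status and the convention of returning None when there is no objection are assumptions made for this example.

```python
def host_check_status(http_version, headers):
    """Return 400 if an HTTP/1.1 request lacks a Host field, else None."""
    names = {name.lower() for name in headers}  # field names are case-insensitive
    if http_version == "HTTP/1.1" and "host" not in names:
        return 400
    return None


print(host_check_status("HTTP/1.1", {"User-Agent": "demo"}))
print(host_check_status("HTTP/1.1", {"Host": "example.com"}))
```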

(Figure: the Host line of the captured request.)

6.2 Referer header field

The Referer header field allows the client to specify the address of the resource from which the request URI was obtained. This lets the server generate lists of back-links for logging and cache optimization. It also lets obsolete or mistyped links be traced for maintenance. If the request URI was obtained from a source that has no URI of its own, the Referer field must not be sent. If only a partial URI is given, it is interpreted relative to the request URI.
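
Resolving a partial Referer value against the request URI can be sketched with the standard urljoin function. The function name resolve_referer and the example.com addresses are assumptions made for this example.

```python
from urllib.parse import urljoin


def resolve_referer(request_url, referer):
    """Interpret a possibly relative Referer value relative to the request URL."""
    return urljoin(request_url, referer)


print(resolve_referer("http://example.com/images/cat.jpg", "../index.html"))
print(resolve_referer("http://example.com/images/cat.jpg", "http://other.example/page"))
```

An absolute Referer value is returned unchanged; only a relative one is combined with the request URL.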

(Figure: the Referer line of the captured request.)

6.3 User-Agent header field

The User-Agent header field contains information about the user agent that sent the request.

In the capture, the content of the User-Agent line is:

http://www.blogjava.net/images/blogjava_net/amigoxie/40799/o_http%e5%8d%8f%e8%ae%ae%e5%ad%a6%e4%b9%a0-%e6%a6%82%e5%bf%b5-8.jpg

6.4 Cache-Control header field

Cache-Control specifies the caching mechanism that requests and responses must follow. Setting Cache-Control in a request message or a response message does not change how caching is handled for the other message. The cache directives allowed in a request are no-cache, no-store, max-age, max-stale, min-fresh, and only-if-cached. The directives allowed in a response are public, private, no-cache, no-store, no-transform, must-revalidate, proxy-revalidate, and max-age.
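
A Cache-Control value is a comma-separated list of directives, some of which carry an argument after an equals sign. The Python sketch below splits such a value into a dictionary. The name parse_cache_control is an assumption made for this example, and the split on commas is deliberately naive: it would mishandle a quoted argument that itself contains a comma.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directives without "=" (such as no-cache) get None as their argument.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("max-age=3600, no-transform"))
print(parse_cache_control("No-Cache"))
```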

(Figure: the Cache-Control line in the capture.)

6.5 Date header field

The Date header field indicates when the message was sent. The time format is defined by RFC 822. For example, Date: Mon, 31 Dec 2001 04:25:57 GMT. The time given in Date is expressed in Greenwich Mean Time (GMT). To convert it to local time, you need to know the user's time zone.
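
The conversion from the GMT value to local time can be sketched with the Python standard library. The function name date_header_to_local and the choice of a UTC+8 offset are assumptions made for this example; a real client would use the user's actual time zone.

```python
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime


def date_header_to_local(value, utc_offset_hours):
    """Parse a Date header value and convert it to a fixed-offset local time."""
    sent_at = parsedate_to_datetime(value)  # timezone-aware, offset zero for GMT
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return sent_at.astimezone(local_zone)


local = date_header_to_local("Mon, 31 Dec 2001 04:25:57 GMT", 8)
print(local.isoformat())
```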

(Figure: the Date line in the capture.)

7. Important concepts of HTTP

7.1 Connection

A transport-layer virtual circuit established between two applications for the purpose of communication.

In HTTP/1.1, a Connection header may appear in both request and response headers. It indicates how persistent connections are handled when the client and the server communicate.

In HTTP/1.1, both the client and the server support persistent connections by default. If a client uses HTTP/1.1 but does not want a persistent connection, it must set the Connection header to close in the request. If the server does not want to support a persistent connection, it must set Connection to close in the response. Whenever either the request or the response carries a Connection header with the value close, the TCP connection in use is closed after the current request has been processed. The client must then open a new TCP connection for any later request.
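
The rule for closing a connection can be sketched as follows in Python. The function name connection_will_close is an assumption made for this example, and the sketch covers only the HTTP/1.1 behavior described above. The Connection value is treated as a comma-separated list because it can carry more than one token.

```python
def connection_will_close(request_headers, response_headers):
    """Return True if either side sent Connection: close (HTTP/1.1 only)."""

    def has_close(headers):
        for name, value in headers.items():
            if name.lower() == "connection":
                tokens = [token.strip().lower() for token in value.split(",")]
                if "close" in tokens:
                    return True
        return False

    return has_close(request_headers) or has_close(response_headers)


print(connection_will_close({"Host": "example.com"}, {"Content-Length": "5"}))
print(connection_will_close({"Connection": "close"}, {}))
```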

7.2 Message

The basic unit of HTTP communication, consisting of a structured sequence of octets transmitted over a connection.

7.3 Request

A message sent from the client to the server. It includes the method to be applied to the resource, the identifier of the resource, and the protocol version number.

7.4 Response

A message returned by the server. It includes the HTTP protocol version number, the status of the request (for example, "successful" or "not found"), and the MIME type of the document.
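
The first line of a request and of a response carries most of the parts listed in 7.3 and 7.4, and the Python sketch below splits each into its components. The function names are assumptions made for this example. The MIME type of the document is not part of the status line; it travels in the Content-Type header field.

```python
def parse_request_line(line):
    """Split a request line into (method, resource identifier, version)."""
    method, target, version = line.split(" ", 2)
    return method, target, version


def parse_status_line(line):
    """Split a status line into (version, numeric status code, reason phrase)."""
    parts = line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""  # the reason phrase may be absent
    return version, code, reason


print(parse_request_line("GET /index.html HTTP/1.1"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```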

7.5 Resource

A network data object or service identified by a URI.

7.6 Entity

A particular representation of a data resource, or a reply from a service resource, that may be enclosed in a request or response message. An entity consists of entity header information and entity content.

7.7 Client

An application that establishes a connection for the purpose of sending a request.

7.8 User agent

The client that initiates a request. User agents are browsers, editors, or other end-user tools.

7.9 Server

An application that accepts connections and returns responses to the requests it receives.

7.10 Origin server

The server on which a given resource resides or is to be created.

7.11 Proxy

An intermediary program that acts as both a server and a client in order to make requests on behalf of other clients. Requests are either serviced internally or passed on, with possible translation, to other servers. A proxy must interpret a request message and, if necessary, rewrite it before forwarding it.

Proxies are often used as client-side portals through a firewall. A proxy can also serve as a helper application that handles requests over protocols the user agent does not implement.

7.12 Gateway

A server that acts as an intermediary for some other server. Unlike a proxy, a gateway receives requests as if it were the origin server for the requested resource. The requesting client may not be aware that it is communicating with a gateway.

Gateways are often used as server-side portals through a firewall. A gateway can also act as a protocol translator that gives access to resources stored on non-HTTP systems.

7.13 Tunnel

An intermediary program that acts as a blind relay between two connections. Once active, a tunnel is not considered a party to the HTTP communication, even though it may have been initiated by an HTTP request. The tunnel ceases to exist when both ends of the relayed connection are closed. Tunnels are typically used when a portal is required and the intermediary cannot interpret the relayed communication.

7.14 Cache

Local storage of response messages.
