Author: Ruheng
Address: http://www.jianshu.com/p/c1d6a294d3c0
This article walks through the HTTP request and response process and explains the relevant concepts involved along the way.
First
HTTP request and response steps
Picture from: Understanding HTTP requests and Responses http://android.jobbole.com/85218/
The figure above shows the 7 steps of an HTTP request and response in full. What follows looks at how HTTP requests and responses are delivered from the perspective of the TCP/IP protocol model.
Second
TCP/IP protocol
The TCP/IP protocol model (Transmission Control Protocol/Internet Protocol) comprises the series of network protocols that form the foundation of the Internet and is the Internet's core protocol suite. It has matured through more than 20 years of development, is widely used in LANs and WANs, and has become a de facto international standard. The TCP/IP protocol suite is a combination of protocols at different levels and is usually regarded as a four-layer protocol system, whose layers can be mapped onto the seven-layer OSI model.
HTTP transmits its information on top of the TCP/IP protocol model.
(1). Link Layer
Also known as the data link layer or network interface layer (shown in the first diagram as the network interface layer and the hardware layer). It typically includes the device driver in the operating system and the corresponding network interface card in the computer. Together they handle the details of the physical interface with the cable (or whatever other transmission medium is used). ARP (Address Resolution Protocol) and RARP (Reverse Address Resolution Protocol) are special protocols used by certain network interfaces, such as Ethernet and Token Ring, to convert between the addresses used by the IP layer and those used by the network interface layer.
(2). Network layer
Also known as the internet layer (shown as the Internet layer in the first diagram). It handles the movement of packets through the network, such as packet routing. In the TCP/IP protocol family, the network layer protocols include IP (Internet Protocol), ICMP (Internet Control Message Protocol), and IGMP (Internet Group Management Protocol).
IP is a network layer protocol that provides an unreliable service: it makes a best effort to deliver packets from the source node to the destination node as quickly as possible, but gives no guarantee of reliability. It is used by both TCP and UDP. Every piece of TCP and UDP data passes through the IP layer on the end systems and on each intermediate router as it crosses the Internet.
ICMP is an adjunct to the IP protocol. The IP layer uses it to exchange error messages and other important information with other hosts or routers.
IGMP is the Internet Group Management Protocol. It is used to multicast a UDP datagram to multiple hosts.
(3). Transport Layer
Mainly provides end-to-end communication for applications on two hosts. In the TCP/IP protocol family, there are two distinct transport protocols: TCP (Transmission Control Protocol) and UDP (User Datagram Protocol).
TCP provides highly reliable data communication between two hosts. Its work includes dividing the data handed to it by the application into appropriately sized chunks for the network layer below, acknowledging the packets it receives, setting timeouts so that unacknowledged packets can be retransmitted, and so on. Because the transport layer provides highly reliable end-to-end communication, the application layer can ignore all of these details. To provide this reliable service, TCP uses mechanisms such as timeout retransmission and end-to-end acknowledgment packets exchanged by the sender and receiver.
UDP provides a very simple service to the application layer. It just sends packets called datagrams from one host to another, with no guarantee that a datagram will reach the other end. A datagram is a unit of information transmitted from the sender to the receiver (for example, a certain number of bytes specified by the sender). Any reliability that is needed on top of UDP must be provided by the application layer.
(4). Application Layer
The application layer determines the communication activity that takes place when application services are provided to the user. The TCP/IP protocol family includes all kinds of common application services, such as HTTP, FTP (File Transfer Protocol), and DNS (Domain Name System).
When an application sends data over TCP, the data is passed down the protocol stack, through each layer in turn, until it is sent onto the network as a stream of bits. Each layer adds header information (and sometimes trailer information) to the data it receives, as shown in the figure.
When the destination host receives an Ethernet frame, the data works its way up the protocol stack from the bottom, and each layer's header is removed along the way. Each protocol layer examines the protocol identifier in the header to determine which upper-layer protocol should receive the data. This process is called demultiplexing. At the transport layer, the data is demultiplexed by destination port number, source IP address, and source port number.
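The encapsulation and demultiplexing just described can be sketched as a toy model in Python. The header layouts here are drastically simplified assumptions made only for illustration (a 4-byte stand-in for the TCP header holding just the two ports, a 1-byte protocol identifier for IP, a 2-byte type field for the Ethernet frame); real headers carry many more fields. The function names encapsulate and demultiplex are hypothetical.

```python
import struct

# Toy header layouts (assumptions for illustration only; real headers are larger).
IP_PROTO_TCP = 6          # IANA protocol number for TCP
ETHERTYPE_IPV4 = 0x0800   # EtherType value for IPv4


def encapsulate(app_data: bytes, src_port: int, dst_port: int) -> bytes:
    """Wrap application data in simplified TCP, IP, and Ethernet headers."""
    tcp_segment = struct.pack("!HH", src_port, dst_port) + app_data
    ip_packet = struct.pack("!B", IP_PROTO_TCP) + tcp_segment
    frame = struct.pack("!H", ETHERTYPE_IPV4) + ip_packet
    return frame


def demultiplex(frame: bytes) -> tuple:
    """Strip each header bottom-up, checking the protocol identifier at each layer."""
    (ethertype,) = struct.unpack("!H", frame[:2])
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("not an IPv4 frame")
    ip_packet = frame[2:]
    (proto,) = struct.unpack("!B", ip_packet[:1])
    if proto != IP_PROTO_TCP:
        raise ValueError("not a TCP segment")
    tcp_segment = ip_packet[1:]
    src_port, dst_port = struct.unpack("!HH", tcp_segment[:4])
    return src_port, dst_port, tcp_segment[4:]


frame = encapsulate(b"GET / HTTP/1.1\r\n\r\n", src_port=50000, dst_port=80)
print(demultiplex(frame))
```

Each layer only looks at its own header and the identifier naming the layer above it, which is why the application data comes back out unchanged at the top of the stack.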
With the TCP/IP model in mind, we can now understand the process of an HTTP request and response.
The following diagram gives a clearer picture:
Next is a detailed look at how each step works.
Third
TCP three-way handshake
TCP is connection-oriented: whichever side wants to send data to the other, a connection must first be established between the two parties. In the TCP/IP protocol suite, TCP provides a reliable connection service, and the connection is initialized by a three-way handshake. The purpose of the three-way handshake is to synchronize the sequence numbers and acknowledgment numbers of both sides and to exchange TCP window size information.
First handshake: establish the connection. The client sends a connection request segment with the SYN flag set to 1 and a sequence number of x. The client then enters the SYN_SENT state and waits for the server's acknowledgment;
Second handshake: the server receives the client's SYN segment and must acknowledge it, so it sets the acknowledgment number to x+1 (the client's sequence number + 1). At the same time it sends its own SYN request, with the SYN flag set to 1 and a sequence number of y. The server puts all of this into a single segment (the SYN+ACK segment) and sends it to the client, at which point the server enters the SYN_RECV state;
Third handshake: the client receives the server's SYN+ACK segment, sets the acknowledgment number to y+1, and sends an ACK segment to the server. Once this segment has been sent, both the client and the server enter the ESTABLISHED state and the three-way handshake is complete.
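The sequence and acknowledgment numbers in the three steps can be traced with a small simulation. This is only a sketch: the function name three_way_handshake and the dictionary layout are hypothetical, the initial sequence numbers are arbitrary, and no real network traffic is involved.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments exchanged for initial sequence numbers x and y."""
    # 1. Client -> server: SYN with sequence number x.
    syn = {"sender": "client", "flags": "SYN", "seq": x}
    # 2. Server -> client: acknowledges x (ack = x + 1) and sends its own SYN with seq y.
    syn_ack = {
        "sender": "server",
        "flags": "SYN+ACK",
        "seq": y,
        "ack": (syn["seq"] + 1) % SEQ_SPACE,
    }
    # 3. Client -> server: acknowledges y (ack = y + 1).
    ack = {
        "sender": "client",
        "flags": "ACK",
        "seq": (x + 1) % SEQ_SPACE,
        "ack": (syn_ack["seq"] + 1) % SEQ_SPACE,
    }
    return [syn, syn_ack, ack]


for segment in three_way_handshake(x=1000, y=5000):
    print(segment)
```

In practice the operating system performs this exchange when a program calls connect on a TCP socket; application code never builds these segments by hand.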
Why a three-way handshake?
To prevent a stale connection request segment that suddenly arrives at the server from causing an error.
Specific example: a "stale connection request segment" arises as follows. The first connection request segment sent by the client is not lost, but is held up at some network node for a long time and only reaches the server some time after the connection has been released. By then it is a long-expired segment. But when the server receives this stale connection request, it mistakes it for a new connection request from the client and sends an acknowledgment to the client, agreeing to establish the connection. If the three-way handshake were not used, the new connection would be established as soon as the server sent its acknowledgment. Since the client has not actually requested a connection, it ignores the server's acknowledgment and sends no data. The server, however, believes a new transport connection has been established and keeps waiting for the client to send data, so many of the server's resources are wasted. The three-way handshake prevents this. In the case just described, the client does not acknowledge the server's acknowledgment, and because the server receives no acknowledgment, it knows that the client did not ask to establish a connection.
Fourth
HTTP protocol
What is HTTP?
Put simply, it is the set of rules by which computers communicate over a network. It is a request/response-based, stateless, application layer protocol that usually transmits data over TCP/IP. Today, any terminal (mobile phone, laptop, and so on) that wants to communicate with a web server must do so according to the HTTP protocol, otherwise it cannot connect.
Its four characteristics:
Request and response: the client sends a request, and the server responds with data.
Stateless: the protocol has no memory of previous transactions. The first time a client connects to a server, a series of security authentication and matching steps takes place, which increases page wait time. Once the client has sent its request and the server has finished responding, the two disconnect and no connection state is saved. It is a clean break; from then on they are strangers. The next time the client sends a request to the same server, the connection has to be re-established, because the two have already forgotten each other.
Application layer: HTTP belongs to the application layer and works together with TCP/IP.
TCP/IP: HTTP uses TCP as its underlying transport protocol. The HTTP client initiates a TCP connection to the server, and once the connection is established, the browser (client) and server processes can access TCP through the socket interface.
Some strategies for dealing with statelessness:
Sometimes the state of a user's earlier HTTP communication needs to be kept. For example, after a login, no request within the next 30 minutes should require logging in again. This is why cookie technology was introduced.
HTTP/1.1 introduced persistent connections (HTTP keep-alive). As long as neither end explicitly asks to disconnect, the TCP connection is kept open. Connection: keep-alive in the request header fields indicates that a persistent connection is in use.
And so on; there are many more...
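As a rough illustration of the cookie approach, the sketch below uses Python's standard http.cookies module to build the Set-Cookie header a server might send after a login, and then the Cookie header a client would send back on later requests. The cookie name session_id and its value are hypothetical; the 30-minute lifetime mirrors the login example above.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header after a successful login.
server_cookie = SimpleCookie()
server_cookie["session_id"] = "abc123"             # hypothetical session identifier
server_cookie["session_id"]["max-age"] = 30 * 60   # valid for 30 minutes
server_cookie["session_id"]["path"] = "/"
set_cookie_header = server_cookie["session_id"].OutputString()
print("Set-Cookie:", set_cookie_header)

# Client side: parse the header, then send the cookie back on the next request.
client_cookie = SimpleCookie()
client_cookie.load(set_cookie_header)
cookie_header = "; ".join(
    f"{name}={morsel.value}" for name, morsel in client_cookie.items()
)
print("Cookie:", cookie_header)
```

The protocol itself stays stateless; the state lives in the cookie that the client repeats with every request, which the server can use to look up the session.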
Now for the main event: HTTP request messages and response messages, which correspond to steps 2, 3, 4, 5, and 6 above.
HTTP messages are text-oriented: every field in a message is an ASCII string, and the length of each field is not fixed. HTTP has two kinds of messages: request messages and response messages.
Fifth
HTTP request message
An HTTP request message consists of four parts: the request line, the request headers, a blank line, and the request data. The following figure shows the general format of a request message.
1. Request Line
The request line is divided into three parts: the request method, the request address, and the protocol version.
Request method
HTTP/1.1 defines 8 request methods: GET, POST, PUT, DELETE, HEAD, OPTIONS, TRACE, and CONNECT. PATCH was added later as a separate extension (RFC 5789).
The two most common are GET and POST. RESTful interfaces typically use GET, POST, DELETE, and PUT.
Request Address
URL: Uniform Resource Locator, an abstract, unique way of identifying the location of a resource.
Format: <protocol>://<host>:<port>/<path>
The port and path can sometimes be omitted (the default HTTP port number is 80).
See the following example:
A GET request sometimes carries parameters in the URL.
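The URL components can be inspected with Python's standard urllib.parse module, as in this short sketch. The URL itself is a made-up example, not one taken from the article.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.example.com:8080/search/index.php?q=http&page=2"  # hypothetical URL
parts = urlsplit(url)

print(parts.scheme)           # protocol
print(parts.hostname)         # host
print(parts.port)             # port (None when the URL omits it)
print(parts.path)             # path
print(parse_qs(parts.query))  # GET parameters, as a dict of lists

# When the port is omitted, fall back to the HTTP default of 80.
default_port = urlsplit("http://www.example.com/").port or 80
print(default_port)
```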
Protocol version
The format of the protocol version is HTTP/<major version number>.<minor version number>; HTTP/1.0 and HTTP/1.1 are the commonly used versions.
2. Request headers
The request headers add extra information to the request message. They consist of name/value pairs, one pair per line, with the name and the value separated by a colon.
Common request headers are as follows:
The request headers end with a blank line, which marks the end of the headers and is followed by the request data. This line is important and cannot be left out.
3. Request data
This part is optional; a GET request, for example, has no request data.
The following is a request message using the POST method:
POST /index.php HTTP/1.1      (request line)
Host: localhost      (request headers)
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:10.0.2) Gecko/20100101 Firefox/10.0.2
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-cn,zh;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Referer: http://localhost/
Content-Length: 25
Content-Type: application/x-www-form-urlencoded
(blank line)
username=aa&password=1234      (request data)
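To make the four parts concrete, here is a minimal sketch that splits a raw request into its request line, headers, and data. The message is a shortened version of the POST example above, and parse_request is a hypothetical helper, not part of any library; a real server would use a full HTTP parser.

```python
RAW_REQUEST = (
    "POST /index.php HTTP/1.1\r\n"
    "Host: localhost\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 25\r\n"
    "\r\n"
    "username=aa&password=1234"
)


def parse_request(raw: str):
    """Split a raw HTTP request into (method, target, version, headers, body)."""
    # The blank line (an empty line after the last header) ends the header section.
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


method, target, version, headers, body = parse_request(RAW_REQUEST)
print(method, target, version)
print(headers)
print(body)
```

Note that Content-Length matches the number of bytes in the request data, which is how the receiver knows where the body ends.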
Sixth
HTTP Response message
The HTTP response message consists mainly of the status line, the response headers, a blank line, and the response data.
1. Status line
Consists of 3 parts: protocol version, status code, and status code description.
The protocol version matches that of the request message, and the status code description is just a short description of the status code, so only the status code is covered here.
Status code
The status code is a 3-digit number.
1XX: Informational. The request has been received and processing continues.
2XX: Success. The request has been successfully received, understood, and accepted.
3XX: Redirection. Further action is required to complete the request.
4XX: Client error. The request contains a syntax error or cannot be fulfilled.
5XX: Server error. The server failed to fulfill a valid request.
Here are a few common examples:
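As a small sketch, Python's standard http.HTTPStatus enumeration can be used to look up the description of a status code and to show which class it falls into. The four codes chosen here (200, 301, 404, 500) are an assumption about what counts as common, and describe is a hypothetical helper.

```python
from http import HTTPStatus

# The first digit of the status code determines its class.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe(code: int) -> str:
    """Return the code, its standard reason phrase, and its class."""
    status = HTTPStatus(code)
    return f"{code} {status.phrase} ({CLASSES[code // 100]})"


for code in (200, 301, 404, 500):
    print(describe(code))
```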
2. Response headers
Like the request headers, these add extra information to the response message.
Common response headers are as follows:
3. Response data
Holds the data that needs to be returned to the client.
Here is an example of a response message:
HTTP/1.1 200 OK      (status line)
Date: Sun, Mar 2013 08:12:54 GMT      (response headers)
Server: Apache/2.2.8 (Win32) PHP/5.2.5
X-Powered-By: PHP/5.2.5
Set-Cookie: PHPSESSID=C0HUQ7PDKMM5GG6OSOE3MGJMM3; path=/
Expires: Thu, Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Content-Length: 4393
Keep-Alive: timeout=5, max=100
Connection: keep-alive
Content-Type: text/html; charset=utf-8
(blank line)
<title>HTTP Response Sample</title>
<body>
Hello HTTP!
</body>
There is a great deal more to know about request headers and response headers; this is only a brief introduction.
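The structure of a response can be sketched the same way as that of a request. The message below is a heavily shortened stand-in for the example response (status 200 and a body of just "Hello HTTP!" are assumed), and parse_response is a hypothetical helper rather than a library function.

```python
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Content-Length: 11\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
    "Hello HTTP!"
)


def parse_response(raw: str):
    """Split a raw HTTP response into (version, status code, reason, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")   # blank line separates headers and data
    status_line, *header_lines = head.split("\r\n")
    # The reason phrase may itself contain spaces, so split at most twice.
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


version, code, reason, headers, body = parse_response(RAW_RESPONSE)
print(version, code, reason)
print(headers["content-type"])
print(body)
```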
After the steps above, the data has been delivered. HTTP/1.1 keeps the connection persistent, but after a period of time the connection is eventually closed, and at that point the TCP connection has to be torn down.
Seventh
TCP four-way wave
Once the client and server have established a TCP connection through the three-way handshake and the data transfer is finished, the TCP connection naturally has to be torn down. For closing a TCP connection there is the mysterious "four-way breakup."
First breakup: host 1 (which can be either the client or the server) sets its sequence number and sends a FIN segment to host 2. Host 1 then enters the FIN_WAIT_1 state. This means host 1 has no more data to send to host 2;
Second breakup: host 2 receives the FIN segment from host 1 and replies with an ACK segment whose acknowledgment number is the received sequence number plus 1. Host 1 enters the FIN_WAIT_2 state. Host 2 is telling host 1: I "agree" to your request to close;
Third breakup: host 2 sends a FIN segment to host 1 requesting that the connection be closed, and host 2 enters the LAST_ACK state;
Fourth breakup: host 1 receives the FIN segment from host 2 and sends an ACK segment to host 2, then host 1 enters the TIME_WAIT state. When host 2 receives host 1's ACK segment, it closes the connection. If host 1 then waits 2MSL without receiving anything further, that shows host 2 has closed normally, and host 1 can close the connection too.
Why four breakups?
TCP is a connection-oriented, reliable, byte-stream-based transport layer protocol. TCP is full-duplex. This means that when host 1 sends a FIN segment, it only indicates that host 1 has no more data to send: host 1 is telling host 2 that all its data has been sent, but host 1 can still receive data from host 2. When host 2 returns the ACK segment, it indicates that it knows host 1 has no more data to send, but host 2 can still send data to host 1. When host 2 also sends a FIN segment, host 2 has no more data to send either and is telling host 1 so. After that, both sides happily tear down the TCP connection.
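The half-closed state in this explanation can be observed with ordinary sockets. The sketch below is an illustration under assumptions: it runs both hosts in one process over the loopback interface, the variable names host1 and host2 are hypothetical, and the FIN and ACK segments themselves are sent by the operating system, not by this code. Calling shutdown with SHUT_WR makes host 1 send its FIN while leaving it able to receive.

```python
import socket


def read_until_eof(sock: socket.socket) -> bytes:
    """Read until the peer's FIN is seen (recv returns an empty bytes object)."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


# Host 2 listens on a loopback port chosen by the OS; host 1 connects to it.
listener = socket.create_server(("127.0.0.1", 0))
host1 = socket.create_connection(listener.getsockname())
host2, _ = listener.accept()

host1.sendall(b"request")
host1.shutdown(socket.SHUT_WR)             # host 1 sends FIN: no more data to send

received_by_host2 = read_until_eof(host2)  # host 2 sees the data, then the FIN
host2.sendall(b"late reply")               # host 2 can still send after host 1's FIN
host2.close()                              # host 2 sends its own FIN

received_by_host1 = read_until_eof(host1)  # host 1 can still receive
host1.close()
listener.close()
print(received_by_host2, received_by_host1)
```

Because each direction is closed separately, the teardown needs a FIN and an ACK for each side, which gives the four segments.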
With the steps above, the HTTP request and response and the data transfer are complete, and the concepts involved have each been covered in turn.
All the HTTP knowledge you need to know is here. http://www.jianshu.com/p/a6d086a3997d
HTTP Knowledge Point Summary http://www.jianshu.com/p/2ecd288d27ad
Understanding HTTP requests and Responses http://android.jobbole.com/85218/
HTTP request, response, cache https://cnbin.github.io/blog/2016/02/20/http-qing-qiu-,-xiang-ying-,-huan-cun/
You should know the HTTP basics http://www.jianshu.com/p/e544b7a76dac
Organizing HTTP Knowledge points http://www.jellythink.com/archives/705
A brief analysis of TCP's three handshake and four breakup http://www.jellythink.com/archives/705
HTTP request message and HTTP response message http://www.cnblogs.com/biyeymyhjob/archive/2012/07/28/2612910.html
TCP/IP protocol cluster layered detailed http://blog.csdn.net/hankscpp/article/details/8611229