What is the process of a complete HTTP transaction?

Source: Internet
Author: User

When we type www.linux178.com into the browser's address bar and press Enter, what actually happens between that moment and the page appearing?


The following is only my personal understanding of the process:


Domain name resolution --> TCP three-way handshake --> HTTP request sent once the TCP connection is established --> the server responds to the HTTP request and the browser receives the HTML --> the browser parses the HTML and requests the resources it references (JS, CSS, images, and so on) --> the browser renders the page for the user


For the HTTP protocol, refer to the following:

A ramble on HTTP protocol http://kb.cnblogs.com/page/140611/

HTTP protocol Overview http://www.cnblogs.com/vamei/archive/2013/05/11/3069788.html

Understand all aspects of HTTP headers http://kb.cnblogs.com/page/55442/


The following analyzes each step of this process, taking the Chrome browser as an example:


1. Domain Name resolution


First, Chrome resolves the domain name www.linux178.com (strictly speaking, a host name) to an IP address. How is the name resolved to its IP address?


① Chrome first searches its own DNS cache (entries are kept only briefly, about 1 minute, and the cache holds at most 1000 entries) for an entry matching www.linux178.com. If an entry exists and has not expired, resolution ends here.

Note: to look at Chrome's own DNS cache, open chrome://net-internals/#dns.


② If the browser finds no matching entry in its own cache, Chrome searches the operating system's DNS cache. If an entry is found there and has not expired, the search stops and resolution ends here.

Note: to view the operating system's own DNS cache, taking Windows as an example, run ipconfig /displaydns at the command line.


③ If nothing is found in the Windows DNS cache either, the Hosts file (located in C:\Windows\System32\drivers\etc) is read to see whether it maps the domain name to an IP address; if it does, resolution succeeds.


④ If the Hosts file has no matching entry, the browser makes a DNS system call, which sends a resolution request to the locally configured preferred DNS server (usually provided by the telecom operator, although a public DNS server such as Google's can be used instead). The request is sent over UDP to port 53 of the DNS server, and it is a recursive request: the operator's DNS server must come back to us with the IP address of the domain name.

The operator's DNS server first checks its own cache. If it finds a matching entry that has not expired, resolution succeeds. If it finds nothing, the operator's DNS server acts on behalf of our browser and starts an iterative resolution.

It first contacts a root DNS server (the IP addresses of the 13 root DNS servers are built into the DNS server) and asks: what is the IP address of www.linux178.com? The root server sees that the name belongs to the top-level domain com and tells the operator's DNS server: I do not know the IP address of this name, but I know the IP address of the com domain's servers, so go and ask them.

The operator's DNS server now has the IP address of the com domain's servers and sends them the same question. The com server replies: I do not know the IP address of www.linux178.com, but I know the DNS address of the linux178.com domain, so go and ask there.

The operator's DNS server then sends the question to the DNS server of linux178.com (this server is generally provided by the domain registrar, such as Wanwang (HiChina) or Xinnet). This time the linux178.com DNS server looks the name up, finds it, and sends the result to the operator's DNS server. The operator's DNS server now has the IP address for www.linux178.com and returns it to the Windows kernel, which hands the result to the browser. The browser finally has the IP address for www.linux178.com, and this step is complete.
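The referral chain in step ④ can be sketched as a toy simulation. Everything here is an assumption made for illustration: the server labels, the in-memory SERVERS table, and the placeholder address 192.0.2.10 are not real DNS data, and no network traffic is sent.

```python
# Toy name servers: each maps a zone suffix to an answer or a referral.
# Labels and addresses are placeholders, not real DNS data.
SERVERS = {
    "root": {"com.": ("referral", "com-server")},
    "com-server": {"linux178.com.": ("referral", "linux178-server")},
    "linux178-server": {"www.linux178.com.": ("answer", "192.0.2.10")},
}


def ask(server, name):
    """Return the reply for the longest zone suffix of `name` the server knows."""
    matches = [
        (zone, reply)
        for zone, reply in SERVERS[server].items()
        if name == zone or name.endswith("." + zone)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: len(m[0]))[1]


def resolve_iteratively(name, start="root", max_hops=10):
    """Follow referrals from the root until some server returns an answer."""
    server, asked = start, []
    for _ in range(max_hops):
        asked.append(server)
        reply = ask(server, name)
        if reply is None:
            break  # this server knows nothing about the name
        kind, value = reply
        if kind == "answer":
            return value, asked
        server = value  # referral: ask the next server down the hierarchy
    return None, asked


ip, asked = resolve_iteratively("www.linux178.com.")
print(ip, asked)
```

The client only ever sees the final result (the recursive part); the hop-by-hop questioning in resolve_iteratively is what the operator's DNS server does on its behalf (the iterative part).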


Note: normally the following steps are never reached.


If none of the 4 steps above resolves the name, the following steps are tried (on Windows):


⑤ The operating system looks in the NetBIOS name cache, which lives on the client computer. What is in this cache? The computer names and IP addresses of machines this computer has successfully communicated with recently. When can this step succeed? When the name belongs to a machine that this computer communicated with successfully just a few minutes ago.


⑥ If step ⑤ fails, the WINS server is queried (the server that maps NetBIOS names to IP addresses).


⑦ If step ⑥ fails, the client looks for the name by broadcast.


⑧ If step ⑦ fails, the client reads the Lmhosts file (in the same directory as the Hosts file, and written the same way).


If step ⑧ also fails, resolution is declared to have failed and the target computer cannot be reached. If any one of these eight steps succeeds, communication with the target computer can proceed.
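The "try each source in order and stop at the first hit" logic of these steps can be sketched as follows. This is a simplified illustration, not how Chrome or Windows is implemented: TtlCache, parse_hosts, and resolve are hypothetical names, only the browser cache and the Hosts file are modeled (the remaining steps would simply be further entries in the list), and 192.0.2.10 is a placeholder address.

```python
import time


class TtlCache:
    """Name cache whose entries expire `ttl` seconds after being stored."""

    def __init__(self, ttl=60.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock      # injectable, so expiry can be shown without sleeping
        self._entries = {}      # host -> (ip, expiry time)

    def put(self, host, ip):
        self._entries[host] = (ip, self.clock() + self.ttl)

    def get(self, host):
        ip, expires_at = self._entries.get(host, (None, 0.0))
        if ip is not None and self.clock() >= expires_at:
            del self._entries[host]  # an expired entry counts as a miss
            return None
        return ip


def parse_hosts(text):
    """Map each host name in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)  # the first mapping wins
    return table


def resolve(name, steps):
    """Try each (label, lookup) pair in order and stop at the first hit."""
    for label, lookup in steps:
        ip = lookup(name)
        if ip is not None:
            return label, ip
    return None, None  # every step failed: the name cannot be resolved


browser_cache = TtlCache(ttl=60.0)
hosts = parse_hosts("127.0.0.1 localhost\n192.0.2.10 www.linux178.com  # placeholder\n")
steps = [("browser cache", browser_cache.get), ("hosts file", hosts.get)]

first = resolve("www.linux178.com", steps)   # cache is empty, falls through
browser_cache.put("www.linux178.com", first[1])
second = resolve("www.linux178.com", steps)  # now answered by the cache
print(first, second)
```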


Take a look at the packet capture screenshot:

The test was run on a Linux virtual machine using the command wget www.linux178.com. A request made directly from Chrome produces too many unrelated requests that clutter the capture, so wget is used instead. Note, however, that wget fetches only index.html; it does not request the static resources (JS, CSS, and other files) that index.html references.
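To make the difference concrete, here is a rough sketch of how a client could find the extra resources a browser would go on to request after receiving index.html. The PAGE content and its file names are made up for the example, and real browsers discover resources in many more ways than the three tags handled here.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collect URLs of scripts, stylesheets, and images referenced by a page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img") and attrs.get("src"):
            self.resources.append(attrs["src"])
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(attrs["href"])


def find_resources(html):
    """Return the referenced resource URLs in document order."""
    collector = ResourceCollector()
    collector.feed(html)
    return collector.resources


# Hypothetical page; the file names are placeholders.
PAGE = """<html><head>
<link rel="stylesheet" href="/style.css">
<script src="/app.js"></script>
</head><body><img src="/logo.png"></body></html>"""

print(find_resources(PAGE))
```

Each URL found this way would trigger a further HTTP request in a browser, which is exactly the traffic that wget leaves out of the capture.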


Packet capture analysis:


Packet ①: the virtual machine sends a broadcast to obtain the MAC address of 192.168.100.254 (the gateway), because communication within a LAN relies on MAC addresses. Why does it need to talk to the gateway? Because our DNS server's IP address is outside the LAN, and traffic leaving the LAN has to go through the gateway.

Packet ②: after receiving the broadcast, the gateway replies to the virtual machine with its own MAC address, so the client now knows its route out.
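The decision behind packets ① and ② (ask for the gateway's MAC address because the destination is outside the LAN) can be sketched with Python's ipaddress module. The gateway 192.168.100.254 comes from the capture; the /24 network mask, the DNS server address 8.8.8.8, and the function name next_hop are assumptions for illustration.

```python
import ipaddress


def next_hop(destination, local_network, gateway):
    """Return the IP whose MAC address the host has to learn before sending."""
    if ipaddress.ip_address(destination) in ipaddress.ip_network(local_network):
        return destination  # same LAN: resolve the destination's own MAC
    return gateway          # different network: resolve the gateway's MAC


# Assumed values: a /24 LAN and an external DNS server.
LAN = "192.168.100.0/24"
GATEWAY = "192.168.100.254"

print(next_hop("8.8.8.8", LAN, GATEWAY))
print(next_hop("192.168.100.20", LAN, GATEWAY))
```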


Packet ③: the wget command sends a domain name resolution request to the DNS server configured on the system (more precisely, wget makes a DNS resolution system call). The requested name is www.linux178.com, and an IPv6 address is expected (AAAA denotes an IPv6 address record).

Packet ④: the DNS server's response to the system. IPv6 is clearly still rarely used, so no AAAA record is returned.

Packet ⑤: another request for an IPv6 address, but the host name www.linux178.com.leo.com does not exist, so the result is No such name.


Packet ⑥: a request for the IPv4 address of the domain name (A record).

Packet ⑦: whether from its cache or through iterative queries, the DNS server has obtained the IP address of the domain name and returns it to the system, which passes it to wget. wget now has the IP address of www.linux178.com. Here you can see that the query between the client and the local DNS server is recursive (the server must give the client a result). The next step can now begin: the TCP three-way handshake.
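The queries in packets ③ and ⑥ differ only in the record type they ask for. As a minimal sketch, the following builds the bytes of such a query by hand; build_query and the query ID 0x1234 are hypothetical choices for the example, and nothing is sent on the network.

```python
import struct

QTYPE_A, QTYPE_AAAA = 1, 28  # DNS record type codes: IPv4 and IPv6 address


def build_query(name, qtype, query_id=0x1234):
    """Build a DNS query message asking recursively for one record type."""
    flags = 0x0100  # only the RD (recursion desired) bit is set
    # Header: ID, flags, 1 question, 0 answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class 1 = IN (Internet)
    return header + question


aaaa_query = build_query("www.linux178.com", QTYPE_AAAA)  # like packet ③
a_query = build_query("www.linux178.com", QTYPE_A)        # like packet ⑥
print(aaaa_query.hex())
print(a_query.hex())
```

A resolver would send these bytes in a UDP datagram to port 53 of the configured DNS server. The RD bit is what makes the request recursive from the client's point of view.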


2. Initiate the TCP three-way handshake


Once it has the IP address for the domain name, the user agent (usually the browser) uses a random port (1024 < port < 65535) to send a TCP connection request to the server's web program (commonly httpd, nginx, and so on). This connection request (the original HTTP request is encapsulated layer by layer on its way down the four-layer TCP/IP model) travels to the server, passing through various routing devices on the way (except within a LAN), enters the network card, and then enters the kernel's TCP/IP protocol stack, which recognizes the connection request and unpacks it layer by layer. It may also be filtered by the Netfilter firewall (a kernel module). It finally reaches the web program (this article uses Nginx as an example), and a TCP/IP connection is established.
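The client's side of this can be observed with a small sketch. It assumes loopback sockets are available and uses a throwaway listener on 127.0.0.1 in place of a real web server, so no routers or firewalls are involved; the point is only that the operating system chooses the client's source port.

```python
import socket

# A throwaway listener on the loopback interface stands in for the web server.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
server_port = server.getsockname()[1]

# create_connection() returns once the kernel has completed the handshake.
client = socket.create_connection(("127.0.0.1", server_port))
conn, peer = server.accept()

client_port = client.getsockname()[1]  # source port chosen by the OS
print("client port:", client_port, "-> server port:", server_port)

for s in (conn, client, server):
    s.close()
```

The application never builds the handshake segments itself; the kernel's TCP/IP stack on each side does that during the connect and accept calls.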

As shown in the following figure:

1) The client first sends a connection request segment. ACK = 0 indicates that the acknowledgment number is invalid; SYN = 1 indicates that this is a connection request or connection accept segment, and such a segment cannot carry data; seq = x is the client's own initial sequence number (in the capture, seq = 0 means this is packet No. 0). The client then enters the SYN_SENT state, meaning it is waiting for the server's reply.

2) When the server receives the connection request and agrees to establish the connection, it sends an acknowledgment to the client. Both SYN and ACK in the TCP header are set to 1. ack = x + 1 means that the first byte of the next segment it expects to receive has sequence number x + 1, and that everything up to x has been received correctly (in the capture, ack = 1 is really ack = 0 + 1, that is, the server expects the client's packet No. 1). seq = y is the server's own initial sequence number (in the capture, seq = 0 means this is packet No. 0 sent by the server). The server then enters the SYN_RCVD state: it has received the client's connection request and is waiting for the client's confirmation.

3) After receiving the acknowledgment, the client must acknowledge once more, and this segment may carry data to be sent to the server. ACK = 1 indicates that the acknowledgment number ack = y + 1 is valid (the client expects to receive the server's packet No. 1), and the client's own sequence number is seq = x + 1 (this is the client's packet No. 1, counting from packet No. 0). Once the client has sent this acknowledgment, the TCP connection enters the ESTABLISHED state and the HTTP request can be sent.
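The sequence and acknowledgment numbers in the three steps can be checked with a small model. This only reproduces the arithmetic described above; three_way_handshake is a hypothetical helper, the dictionaries are not real TCP headers, and the values x = 0 and y = 0 mirror the relative numbers shown in the capture.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


def three_way_handshake(x, y):
    """Return the three segments exchanged for initial sequence numbers x and y."""
    syn = {"from": "client", "SYN": 1, "ACK": 0, "seq": x, "ack": None}
    syn_ack = {
        "from": "server", "SYN": 1, "ACK": 1,
        "seq": y,
        "ack": (x + 1) % SEQ_SPACE,  # next byte expected from the client
    }
    ack = {
        "from": "client", "SYN": 0, "ACK": 1,
        "seq": (x + 1) % SEQ_SPACE,
        "ack": (y + 1) % SEQ_SPACE,  # next byte expected from the server
    }
    return [syn, syn_ack, ack]


for segment in three_way_handshake(x=0, y=0):
    print(segment)
```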

Look at the packet capture screenshot:


Packet ⑨ corresponds to step 1 above.

Packet ⑩ corresponds to step 2 above.

Packet ⑪ corresponds to step 3 above.


Why does TCP need a three-way handshake?


As an example:


Suppose a foreigner gets lost in the Palace Museum and sees Xiaoming, and the following conversation takes place:


Foreigner: Excuse me, can you speak English?

Xiaoming: Yes.

Foreigner: OK, I want ...


Before asking for directions, the foreigner first asks whether Xiaoming can speak English. Xiaoming answers yes, and only then does the foreigner start asking the way.


Two computers communicate by means of a protocol (today the prevailing one is TCP/IP). If the two computers do not use the same protocol, they cannot communicate, so the three-way handshake is like probing whether the other side also follows TCP/IP; once this negotiation is complete, communication can begin. Of course, this way of understanding it is not entirely accurate.


Why is HTTP implemented on top of TCP?


At present, all transmission on the Internet goes through TCP/IP, and HTTP, as an application-layer protocol in the TCP/IP model, is no exception. TCP is an end-to-end, reliable, connection-oriented protocol, so by building on TCP at the transport layer, HTTP does not have to deal with the problems of data transmission itself.


3. Initiate an HTTP request after establishing a TCP connection


After the TCP three-way handshake, the browser sends the HTTP request (see packet ⑫). It uses the HTTP GET method, the requested URL is /, and the protocol is HTTP/1.0.


The details of packet ⑫ are as follows:


The above message is the HTTP request message.


So what is the format of HTTP request messages and response messages?


Start line: for example GET / HTTP/1.0 (the request method, the requested URL, and the protocol used)

Header information: fields such as User-Agent and Host, each with its value
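As a small illustration of the two parts listed so far, the following assembles a request message and splits its start line back into its three fields. The helper names build_request and parse_start_line and the header values are made up for the example; only the GET / HTTP/1.0 start line comes from the capture.

```python
def build_request(method, path, version, headers):
    """Assemble a request: start line, header lines, then a blank line."""
    lines = [f"{method} {path} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_start_line(raw):
    """Split the first line of a request into method, URL, and protocol."""
    first_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    method, path, version = first_line.split(" ")
    return method, path, version


request = build_request(
    "GET", "/", "HTTP/1.0",
    {"User-Agent": "Wget", "Host": "www.linux178.com"},  # illustrative values
)
print(request.decode("ascii"))
print(parse_start_line(request))
```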
