Network Programming (3): On the HTTP Protocol

Source: Internet
Author: User

Background
In Network Programming (1) we saw that computers locate resources on the network by means of URLs, and that this mechanism is built on the TCP/IP protocol family. What "built on the TCP/IP protocol family" means is easiest to understand through the two things that (1) used to locate a network resource, the IP address and the port:
    • IP: The IP protocol is the mechanism computers use to identify and communicate with each other. Every computer has an IP address that identifies it on the Internet. IP only lets computers send messages to each other; it does not check whether messages arrive in the order they were sent, or whether they arrive uncorrupted (only the critical header data is checked). To provide this checking, the Transmission Control Protocol (TCP) was designed to run directly on top of IP.
    • TCP: TCP makes sure that packets are delivered in the correct order and tries to confirm that their contents have not changed. TCP adds ports on top of IP addresses, which lets one computer offer several services over the network. Some port numbers are reserved for particular services; these are the well-known ports (for example, 80 for HTTP).
In general, the two protocols work together. TCP is responsible for communication between the application software (such as a browser) and the network software; IP is responsible for communication between computers. TCP splits the data into segments and places them in IP packets; IP delivers the packets to the recipient, with the IP routers along the way directing each packet according to traffic, network errors, and other parameters; on arrival, TCP reassembles the segments.
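The splitting and reassembly that TCP performs can be sketched in a few lines. This is only a toy model: the function names are made up for illustration, the "sequence number" is simply the byte offset of each chunk, and real TCP additionally handles acknowledgments, retransmission, and flow control.

```python
import random


def split_into_segments(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, chunk) pairs; the number is the byte offset."""
    return [(offset, data[offset:offset + size]) for offset in range(0, len(data), size)]


def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Sort the segments by sequence number and join the chunks again."""
    ordered = sorted(segments, key=lambda segment: segment[0])
    return b"".join(chunk for _, chunk in ordered)


message = b"GET /index.html HTTP/1.1"
segments = split_into_segments(message, size=8)
random.shuffle(segments)  # IP gives no ordering guarantee, so simulate arbitrary arrival order
assert reassemble(segments) == message
```

Because every chunk carries its own offset, the receiver can restore the original byte stream regardless of the order in which the chunks arrive.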
The HTTP protocol (Hypertext Transfer Protocol) is the protocol used to transfer hypertext from a WWW server to the local browser. It is designed to make this transfer efficient and to reduce network traffic. It not only lets computers transmit hypertext documents correctly and quickly, but also determines which part of a document is transmitted and which content is displayed first (for example, text before graphics). HTTP runs on top of the TCP protocol (the figures in the original were taken from the Internet).
Network Model
Response Model

HTTP is a stateless protocol: the server keeps no record of earlier requests from the same client (Web browser). In its basic form it also holds no persistent connection between client and server: once the client has sent a request and the server has returned a response, the connection is closed and no connection information is kept on the server side. HTTP follows the request/response model. The client (browser) sends a request to the server, and the server processes the request and returns the appropriate response. Every HTTP exchange is made up of such request/response pairs.
Workflow
The test data below was all captured with the packet capture tool Wireshark.
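Before going through the individual steps, the basic request/response exchange described above (one request, one response, then the connection is closed) can be sketched in miniature. This sketch assumes that `socket.socketpair()` is an acceptable stand-in for a real TCP connection, and the `exchange` function is a name invented for this example.

```python
import socket


def exchange(request: bytes, response: bytes) -> tuple[bytes, bytes]:
    """Send one request, return one response, then close the connection."""
    client, server = socket.socketpair()  # two connected stream sockets, no network needed
    try:
        client.sendall(request)       # 1. the client sends its request
        received = server.recv(4096)  # 2. the server reads the request
        server.sendall(response)      # 3. the server returns the response
        server.close()                # 4. the server closes the connection
        reply = b""
        while True:                   # 5. the client reads until the connection is closed
            chunk = client.recv(4096)
            if not chunk:
                break
            reply += chunk
        return received, reply
    finally:
        client.close()
        server.close()


received, reply = exchange(
    b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n",
    b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok",
)
print(received.decode("ascii").splitlines()[0])
print(reply.decode("ascii").splitlines()[0])
```

The client learns that the response is complete simply because the server closed the connection, which is how the exchange ends when no keep-alive is in use.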

(1) Resolving the domain name to an IP address

As we saw in Network Programming (1), www.baidu.com is only a domain name; a Domain Name System (DNS) is needed to find the real IP address behind it. Resolution usually starts in the browser's own DNS cache, which shows whether we have visited the site before: if there is a matching IP entry, it is used directly. If not, the local hosts file is checked next (located in C:\Windows\System32\drivers\etc), and a matching entry there is used directly. If there is still no match, a domain name resolution request is sent to the locally configured DNS server, which is usually provided by our network provider. The request goes over UDP to DNS port 53 and is a recursive request. The provider's DNS server first looks in its own cache; if it finds a matching entry that has not expired, the resolution succeeds.

If no matching entry is found, the provider's DNS server carries out an iterative DNS resolution on behalf of our browser. It first asks a root server (A): "What is the IP address of www.baidu.com?" A does not know the address, but it knows that B is in charge of the "com" top-level domain, so it refers the resolver to B. The resolver asks B the same question; B only knows that the domain "baidu.com" is delegated to C, so it refers the resolver to C. The resolver then asks C, and C finally finds a matching IP entry on its own server and returns it. The result is passed back to the Windows system kernel, the kernel returns it to the browser, and the browser finally has the address and caches it.
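The lookup order described above can be modeled with a few dictionaries. Everything in this sketch is hypothetical: the server names, the `resolve` and `iterative_lookup` functions, and the address 192.0.2.10 (from a range reserved for documentation) are made up, and www.example.com stands in for the site being visited. No real DNS traffic is sent.

```python
# Toy model of the DNS hierarchy; all names and addresses are invented.
ROOT = {"com": "server B"}                         # A (root) knows who handles "com"
TLD = {"example.com": "server C"}                  # B knows who handles "example.com"
AUTHORITATIVE = {"www.example.com": "192.0.2.10"}  # C holds the actual record

browser_cache: dict[str, str] = {}
hosts_file: dict[str, str] = {"localhost": "127.0.0.1"}
resolver_cache: dict[str, str] = {}  # cache of the provider's DNS server


def iterative_lookup(name: str) -> str:
    """What the provider's DNS server does on a cache miss: ask A, then B, then C."""
    labels = name.split(".")
    if labels[-1] not in ROOT:
        raise LookupError(f"root knows no server for .{labels[-1]}")
    zone = ".".join(labels[-2:])
    if zone not in TLD:
        raise LookupError(f"top-level server knows no server for {zone}")
    if name not in AUTHORITATIVE:
        raise LookupError(f"no record for {name}")
    return AUTHORITATIVE[name]


def resolve(name: str) -> tuple[str, str]:
    """Return (ip, source) following the order: browser cache, hosts file, DNS server."""
    if name in browser_cache:
        return browser_cache[name], "browser cache"
    if name in hosts_file:
        return hosts_file[name], "hosts file"
    if name in resolver_cache:
        ip, source = resolver_cache[name], "resolver cache"
    else:
        ip, source = iterative_lookup(name), "iterative lookup"
        resolver_cache[name] = ip  # the DNS server caches the answer
    browser_cache[name] = ip       # the browser caches it as well
    return ip, source


print(resolve("www.example.com"))  # first visit: full iterative lookup
print(resolve("www.example.com"))  # second visit: answered from the browser cache
```

A real resolver also honors the expiry time (TTL) of each cached entry, which this sketch leaves out.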

(2) The TCP three-way handshake

After the IP address has been resolved in (1), a connection must be established between the two hosts. This takes three steps, known as the three-way handshake.


    • The client first sends a connection request (SYN) segment. ACK = 0 indicates that the acknowledgment number is not valid, and SYN = 1 indicates that this is a connection request or connection accept segment. Such a segment carries no data, but it still consumes one sequence number. seq = x is the client's own initial sequence number (the capture shows seq = 0, meaning this is the client's packet No. 0, because Wireshark displays relative sequence numbers by default). The client then enters the SYN_SENT state and waits for the server's reply.

    • When the server receives the connection request and agrees to establish the connection, it sends an acknowledgment to the client. Both SYN and ACK in the TCP header are set to 1. ack = x + 1 means that the server expects the next segment to start at byte x + 1, that is, everything up to x has been received correctly (in the capture, ack = 1 is simply 0 + 1: the server expects the client's packet No. 1). seq = y is the server's own initial sequence number (in the capture, seq = 0 is packet No. 0 sent by the server). The server then enters the SYN_RCVD state: it has received the client's connection request and is waiting for the client's confirmation.

    • After receiving the server's acknowledgment, the client must send an acknowledgment of its own; this segment may already carry data for the server. ACK = 1 indicates that the acknowledgment number ack = y + 1 is valid (the client expects the server's packet No. 1), and the client's own sequence number is seq = x + 1 (this is its packet No. 1, following packet No. 0). Once the server receives this acknowledgment, the TCP connection is in the ESTABLISHED state and the HTTP request can be sent.
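The sequence and acknowledgment numbers in the three steps above can be traced with a small simulation. The `Segment` class and the `three_way_handshake` function are hypothetical names for this sketch; no real packets are sent, and only the fields discussed in the text are modeled.

```python
from dataclasses import dataclass

SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit values and wrap around


@dataclass
class Segment:
    syn: bool       # SYN flag
    ack_flag: bool  # ACK flag; the ack number is only valid when this is set
    seq: int        # sequence number
    ack: int = 0    # acknowledgment number


def three_way_handshake(x: int, y: int) -> tuple[list[Segment], str, str]:
    """Return the three segments plus the final client and server states.

    x is the client's initial sequence number, y is the server's.
    """
    # Step 1, client -> server: SYN = 1, ACK = 0, seq = x
    syn = Segment(syn=True, ack_flag=False, seq=x)
    client_state = "SYN_SENT"

    # Step 2, server -> client: SYN = 1, ACK = 1, seq = y, ack = x + 1
    syn_ack = Segment(syn=True, ack_flag=True, seq=y, ack=(syn.seq + 1) % SEQ_MODULUS)
    server_state = "SYN_RCVD"

    # Step 3, client -> server: ACK = 1, seq = x + 1, ack = y + 1
    final_ack = Segment(
        syn=False,
        ack_flag=True,
        seq=syn_ack.ack,
        ack=(syn_ack.seq + 1) % SEQ_MODULUS,
    )
    client_state = server_state = "ESTABLISHED"
    return [syn, syn_ack, final_ack], client_state, server_state


# x = y = 0 reproduces the relative numbers shown in the capture.
segments, client_state, server_state = three_way_handshake(x=0, y=0)
for segment in segments:
    print(segment)
```

With x = y = 0 the three segments carry seq = 0, then seq = 0 and ack = 1, then seq = 1 and ack = 1, matching the values quoted in the list above.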


(3) Sending the HTTP request once the TCP connection is established

The contents of the HTTP request (in the original screenshot, the red boxes mark the request line and the request headers, respectively)
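To make the two parts concrete, the request line and the request headers, here is a minimal sketch that builds and parses an HTTP/1.1 request as raw bytes. The function names are invented for this example, and the header values are illustrative rather than copied from the capture.

```python
def build_request(method: str, path: str, headers: dict[str, str]) -> bytes:
    """Build a request: request line, one line per header, then an empty line."""
    lines = [f"{method} {path} HTTP/1.1"]                          # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]  # request headers
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_request(raw: bytes) -> tuple[str, str, str, dict[str, str]]:
    """Split a raw request back into method, path, version, and headers."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, path, version, headers


raw = build_request("GET", "/", {"Host": "www.baidu.com", "Connection": "keep-alive"})
print(raw.decode("ascii"))
print(parse_request(raw))
```

The empty line (`\r\n\r\n`) marks the end of the headers; a request with a body, such as a POST, would continue after it.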
(4) The server responds to the HTTP request and the browser receives the HTML code
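A response has the same overall shape as a request: a status line, headers, an empty line, and then the body (here the HTML code). The following sketch takes a sample response apart; the sample bytes and the `parse_response` function are assumptions made for illustration, not data from the capture.

```python
def parse_response(raw: bytes) -> tuple[int, str, dict[str, str], bytes]:
    """Split a raw response into status code, reason phrase, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(code), reason, headers, body


sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)
code, reason, headers, body = parse_response(sample)
print(code, reason, headers["content-type"], body)
```

This sketch expects a status line with a reason phrase and ignores features such as chunked transfer encoding, which a real client has to handle.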

After steps (1)-(4), the browser parses the HTML code, requests the resources it references (pictures and so on), and renders the page for the user. In the request headers we can see Connection: keep-alive. Normally a TCP connection is released once a request has been completed. Because the browser interacts with the server frequently, keep-alive keeps the connection to the server open, which removes the need to disconnect and reconnect each time and reduces overhead.
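The effect of keep-alive can be observed with Python's standard library alone. In this sketch a small local server reports the source port of each request; if several requests arrive from the same port, they travelled over the same TCP connection. The handler class and the helper function are names made up for this example, and it assumes that a loopback server on 127.0.0.1 can be started.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open

    def do_GET(self):
        # Reply with the client's source port; it stays the same while
        # the same TCP connection is being reused.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def ports_seen_by_server(request_count: int) -> list[int]:
    """Send several requests over one HTTPConnection and collect the reported ports."""
    server = HTTPServer(("127.0.0.1", 0), EchoPortHandler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        ports = []
        for _ in range(request_count):
            conn.request("GET", "/", headers={"Connection": "keep-alive"})
            ports.append(int(conn.getresponse().read()))
        return ports
    finally:
        conn.close()  # close the client side first so the handler can finish
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(ports_seen_by_server(3))  # three identical port numbers mean one connection
```

Without keep-alive, each request would need its own connection and, with it, its own three-way handshake.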

Transport Model
As we can see, the data is encapsulated layer by layer from the top layer down to the link layer at the bottom. Its transmission path is:

The client encapsulates the request into an HTTP packet --> which is encapsulated into a TCP packet --> which is encapsulated into an IP packet --> which is encapsulated into a data frame --> the hardware converts the frame into a bit stream (binary data) --> finally, the physical hardware (the network card chip) sends it to the specified destination. The server hardware receives the bit stream and converts it back into a frame, from which the IP packet is extracted. The IP packet is parsed according to the IP protocol to find the TCP packet inside, the TCP packet is parsed according to the TCP protocol to find the HTTP packet inside, and the HTTP packet is parsed according to the HTTP protocol to obtain the data.
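The wrapping and unwrapping along this path can be imitated with toy headers. This is a deliberately simplified sketch: the bracketed markers such as `[TCP]` stand in for real protocol headers, and all function names are invented for this example.

```python
LAYERS = ["TCP", "IP", "FRAME"]  # order in which the HTTP message is wrapped on the way down


def encapsulate(http_message: bytes) -> bytes:
    """Wrap the HTTP message in a toy TCP header, then IP, then a frame header."""
    data = http_message
    for layer in LAYERS:
        data = f"[{layer}]".encode("ascii") + data
    return data


def to_bits(frame: bytes) -> str:
    """What the hardware puts on the wire: a stream of bits."""
    return "".join(f"{byte:08b}" for byte in frame)


def from_bits(bits: str) -> bytes:
    """Turn the received bit stream back into frame bytes."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def decapsulate(frame: bytes) -> bytes:
    """Strip the headers in reverse order: frame, then IP, then TCP."""
    data = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode("ascii")
        if not data.startswith(header):
            raise ValueError(f"expected a {layer} header")
        data = data[len(header):]
    return data


request = b"GET / HTTP/1.1\r\n\r\n"
frame = encapsulate(request)           # client side: HTTP -> TCP -> IP -> frame
bits = to_bits(frame)                  # sent as a bit stream
received = decapsulate(from_bits(bits))  # server side: frame -> IP -> TCP -> HTTP
assert received == request
print(frame)
```

Each layer on the receiving side only looks at its own header and hands the rest upward, which is why the headers come off in the opposite order to the one in which they were added.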

Summary of Features
    • Mode: HTTP supports the client/server model (client ↔ server).
    • Simple and flexible: When a client requests a service from the server, it only needs to send the request method and the path. Commonly used request methods include GET and POST; each method specifies a different kind of interaction between client and server. HTTP also allows any type of data object to be transmitted; the type being transmitted is indicated by Content-Type.
    • Connectionless: Connectionless means that each connection handles only one request. Once the server has processed the client's request and received the client's acknowledgment, the connection is closed.
    • Stateless: Stateless means that the protocol has no memory of earlier transactions. As a result, if later processing needs earlier information, that information must be transmitted again.
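The stateless property in the last point can be illustrated with a handler that keeps nothing between calls. This is a sketch under stated assumptions: requests and responses are modeled as plain dictionaries, and the `X-Visit-Count` header is a hypothetical name invented for this example, not a standard HTTP header.

```python
def handle(request: dict) -> dict:
    """A stateless handler: the response depends only on the request it is given."""
    previous = int(request.get("headers", {}).get("X-Visit-Count", "0"))
    count = previous + 1
    return {
        "status": 200,
        "headers": {"X-Visit-Count": str(count)},
        "body": f"visit {count}",
    }


# First request: the client sends no earlier information.
first = handle({"headers": {}})
print(first["body"])

# The server has remembered nothing, so the client must send the count back itself.
second = handle({"headers": {"X-Visit-Count": first["headers"]["X-Visit-Count"]}})
print(second["body"])

# A request without the header starts from scratch again.
print(handle({"headers": {}})["body"])
```

Mechanisms such as cookies work along these lines: the information needed for later processing travels with every request instead of being remembered by the protocol.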



