A Simple Analysis of the HTTP Protocol
1. Official definition of the HTTP protocol
The WWW is an application system that uses the Internet as its transmission medium. The basic unit of transmission on the WWW is the Web page. The WWW works on the client/server computing model, consisting of a Web browser and a Web server, and the two communicate using the Hypertext Transfer Protocol (HTTP). HTTP runs on top of TCP/IP. It is an application-layer protocol between a Web browser and a Web server, and it is a generic, stateless, object-oriented protocol.
Data can be sent over the Internet in three ways:
First, over HTTP, the most common protocol, which is implemented on top of TCP/IP.
Second, over FTP.
Third, over TCP/IP directly. TCP/IP is the underlying protocol, and the other two methods both run on top of it.
To use TCP/IP directly, we do socket programming. Socket programming is divided into a client side and a server side; we will not go into the details here.
2. A worked example of the HTTP protocol
Step 1: We enter the address http://www.baidu.com in the browser.
Step 2: After receiving the address, the browser sends the domain name to a DNS server to be resolved to an IP address. We can ping Baidu's domain name to see the server address.
Step 3: The TCP/IP connection is made through sockets. The browser opens a client socket (the Socket Client on the left side of the illustration), and a socket is also open on the server side.
Step 4: The server listens on port 80 for connections from clients. Once the connection is accepted, the browser and the server have a link over which they can exchange data.
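Steps 3 and 4 can be sketched in Python with the standard socket module. This is only a minimal sketch, not how a browser or Baidu's server is actually implemented. The helper names serve_once, http_get and demo are made up for illustration, the server runs locally on 127.0.0.1, and it listens on a free port chosen by the operating system instead of port 80, because binding port 80 usually needs administrator rights.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one client, read its request, and send back a tiny HTTP response."""
    conn, _addr = server_sock.accept()
    with conn:
        conn.recv(4096)  # the request is small, so one read is enough here
        body = b"hello"
        response = b"\r\n".join([
            b"HTTP/1.1 200 OK",
            b"Content-Type: text/plain",
            b"Content-Length: %d" % len(body),
            b"Connection: close",
            b"",  # blank line separates headers from the body
            body,
        ])
        conn.sendall(response)


def http_get(host, port, path="/"):
    """Open a client socket, send a GET request, and return the raw response bytes."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Start a one-shot local server and fetch "/" from it."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server_sock,), daemon=True)
    worker.start()
    try:
        return http_get("127.0.0.1", port)
    finally:
        worker.join(timeout=5)
        server_sock.close()


if __name__ == "__main__":
    print(demo().decode("ascii"))
```

The server side binds and listens before the client connects, which mirrors the order in the steps: the server socket must already be waiting on its port when the browser's socket tries to connect.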
An illustration is given as follows:
The server address is not always the same, because there may be many Baidu servers. We can use the ping command to check which address we are actually accessing.
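The same lookup can be done from a program. The following is a minimal sketch using socket.getaddrinfo from the Python standard library. The function name resolve_ipv4 is hypothetical, and restricting the lookup to IPv4 is an assumption made to keep the output comparable with what ping usually prints.

```python
import socket


def resolve_ipv4(hostname):
    """Return the distinct IPv4 addresses that hostname resolves to, sorted."""
    infos = socket.getaddrinfo(
        hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})
```

On a machine with network access, resolve_ipv4("www.baidu.com") may return more than one address, and the list may differ between runs or between locations, which is consistent with there being many servers behind one domain name.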
3. Sent data
We want to check what data the client sends to the server. The Inspect Element tool in Google Chrome shows the following:
The fields in the Headers section are described as follows:
Remote Address: 180.149.131.35:80, the IP address and port of the server.
Request URL: the requested URL.
Request Method: GET, the HTTP method of the request.
Status Code: 200, meaning OK; the request succeeded.
Accept: the data types that the browser can receive.
Accept-Encoding: the compression encodings that the browser can accept.
Accept-Language: zh-cn, the preferred language of the client browser.
Host: the host name of the server being accessed.
Referer: the address of the page from which the request was made.
User-Agent: identifies the client browser.
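On the wire, the request is plain text: a request line followed by one "Name: value" line per header and then a blank line. The following is a minimal sketch of how that text could be split into its parts. The function name parse_request and the header values in SAMPLE (including ExampleBrowser/1.0) are made up for illustration and are not taken from a real capture.

```python
def parse_request(raw):
    """Split raw HTTP request text into (method, target, version, headers)."""
    head, _, _body = raw.partition("\r\n\r\n")  # headers end at the first blank line
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


# Hypothetical request text, similar in shape to what a browser sends.
SAMPLE = (
    "GET / HTTP/1.1\r\n"
    "Host: www.baidu.com\r\n"
    "Accept: text/html\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "Accept-Language: zh-CN\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "\r\n"
)

if __name__ == "__main__":
    method, target, version, headers = parse_request(SAMPLE)
    print(method, target, version)
    for name, value in headers.items():
        print(f"{name}: {value}")
```

Note that the Remote Address and the Status Code are not part of this text: the first describes the connection and the second comes from the server's response.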
4. Return types
The server mainly returns three types of response: HTML, XML, and JSON.
(1) HTML is mainly used for PC clients. HTML is also returned when the website is accessed from a browser on a mobile phone.
(2) XML and JSON are mainly used when a client program receives the data. On Android, JSON is used more often because it saves traffic, although it is less readable than XML.
(3) Sometimes, when we need to download something from the server, the data is transferred as an IO stream.
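Point (2) can be illustrated with a small comparison. The sketch below encodes the same made-up record in JSON and in XML and parses both with the Python standard library. The record, the field names and the helpers from_json and from_xml are hypothetical, and the size difference shown applies to this sample only.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record in both formats.
JSON_TEXT = '{"name": "Alice", "age": 30}'
XML_TEXT = "<user><name>Alice</name><age>30</age></user>"


def from_json(text):
    """Parse the JSON form of the record into a dict."""
    data = json.loads(text)
    return {"name": data["name"], "age": int(data["age"])}


def from_xml(text):
    """Parse the XML form of the record into the same dict shape."""
    root = ET.fromstring(text)
    return {"name": root.findtext("name"), "age": int(root.findtext("age"))}


if __name__ == "__main__":
    print(from_json(JSON_TEXT) == from_xml(XML_TEXT))
    print(len(JSON_TEXT.encode("utf-8")), len(XML_TEXT.encode("utf-8")))
```

XML repeats every field name in a closing tag, which is one reason the JSON form of this record takes fewer bytes.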