I want to share an article on the HTTP protocol. It should be very helpful for newcomers who are just getting into network analysis.
Original post: http://www.csna.cn/viewthread.php?tid=371
I. Purpose
I have been learning network analysis for some time and am only now finding my feet. When I first came into contact with it, I did not know where to start. To learn network analysis you must start with the protocols. But if you only read the reference material, little of it sticks, you gain no practical experience, and progress is slow. Later, someone advised me that protocols are learned better with the help of network analysis software, and from then on I studied this way. As a beginner, I want to write down my learning method and discuss it with you. Personally, I think method is what matters most.
The TCP/IP protocol family contains a great many protocols, so mastering them takes time. :) Enough rambling!
The application layer protocols, such as HTTP and FTP, are the ones we use most often. Here we will look at HTTP (you can hardly get by without web pages -.-#). You can use the same method to learn other protocols. ^O^
II. Test Environment
The network environment for this test is simple. A brief description:
1. Network environment: my computer accesses the Internet through a proxy server on an ADSL dial-up connection. Its IP address is 192.168.0.92.
2. Operating system: Windows XP with SP2.
3. Tools (the key point): I have tried Sniffer, OmniPeek, Ethereal, and the Kelai network analysis system and am somewhat familiar with each. Here I use OmniPeek as the learning tool.
III. Specific Operations
The experiment is simple: everyone knows how to open a web page, and here I just want to look more closely at what happens when one is accessed and why. The steps are as follows:
1. Open the OmniPeek capture settings and, under Filters, set an address filter for 192.168.0.92 and a protocol filter for HTTP, as shown in Figure 1.
(Figure 1 filter settings)
Note: with the filters in Figure 1, we capture only the HTTP traffic of 192.168.0.92. :)
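The combined effect of an address filter and a protocol filter can be sketched in Python. This only illustrates the matching logic and is not how OmniPeek implements its filters: the packet dictionaries, the `matches_filter` helper, and the 203.0.113.10 server address are hypothetical, and HTTP is approximated here as "TCP on port 80".

```python
# Hypothetical packet records; a real analyzer decodes these fields from raw frames.
HOST = "192.168.0.92"
HTTP_PORT = 80  # assumption: HTTP is recognized by its default TCP port


def matches_filter(packet: dict) -> bool:
    """Return True if the packet involves HOST and is TCP traffic on the HTTP port."""
    address_ok = HOST in (packet["src_ip"], packet["dst_ip"])
    protocol_ok = packet["protocol"] == "TCP" and HTTP_PORT in (
        packet["src_port"],
        packet["dst_port"],
    )
    return address_ok and protocol_ok


packets = [
    # HTTP request from our host: kept
    {"src_ip": "192.168.0.92", "dst_ip": "203.0.113.10",
     "protocol": "TCP", "src_port": 1065, "dst_port": 80},
    # HTTP traffic of another host: dropped by the address filter
    {"src_ip": "192.168.0.50", "dst_ip": "203.0.113.10",
     "protocol": "TCP", "src_port": 1100, "dst_port": 80},
    # DNS query from our host: dropped by the protocol filter
    {"src_ip": "192.168.0.92", "dst_ip": "192.168.0.1",
     "protocol": "UDP", "src_port": 1066, "dst_port": 53},
]

captured = [p for p in packets if matches_filter(p)]
print(len(captured))  # 1
```

Both conditions must hold at once, which is why only the first record survives.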
2. Start the capture, visit http://www.csna.cn from host 192.168.0.92, and collect the packets, as shown in Figure 2.
(Figure 2 captured data packets)
Note: we have now captured the packets exchanged while visiting the forum at http://www.csna.cn and are ready to analyze them.
IV. Principles
Using network analysis software to learn protocols does not mean putting the reference material and theory aside. What I am advocating is combining the two: understanding a protocol's structure, working principles, and characteristics is very important, and with that understanding what we see in the analysis software becomes much clearer. So let's first go over how the HTTP protocol works (don't blame me for the theory).
1. TCP/IP layered structure
The layered structure and working principles of TCP/IP are not described in detail here. For HTTP, simply put: HTTP is an application layer protocol; below it, the transport layer provides a reliable connection through TCP, the network layer routes with IP, the link layer uses Ethernet II, and the physical layer transmits the bits.
Application Layer ----------- HTTP
Transport Layer ----------- TCP
Network Layer ----------- IP
Link Layer ----------- Ethernet II
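The layering above can be pictured as encapsulation: each lower layer wraps what it receives from the layer above in its own header. The sketch below is a toy model only. The `encapsulate` helper and the bracketed placeholder headers are hypothetical; real headers are binary structures with many fields, and an Ethernet II frame also carries a trailing checksum.

```python
# Layers from top to bottom, as listed in the article.
LAYERS = [
    ("Application", "HTTP"),
    ("Transport", "TCP"),
    ("Network", "IP"),
    ("Link", "Ethernet II"),
]


def encapsulate(http_message: str) -> str:
    """Wrap an HTTP message in one placeholder header per lower layer."""
    unit = http_message
    # Skip the application layer itself: its "payload" is the HTTP message.
    for _layer, protocol in LAYERS[1:]:
        unit = f"[{protocol} header]{unit}"
    return unit


frame = encapsulate("GET / HTTP/1.1")
print(frame)  # [Ethernet II header][IP header][TCP header]GET / HTTP/1.1
```

Reading the result from left to right gives the order in which an analyzer decodes a captured frame: Ethernet II first, then IP, then TCP, and finally HTTP.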
2. How HTTP works
HTTP is based on a request/response model (equivalent to client/server). After a client establishes a connection with the server, it sends a request consisting of a request method, a Uniform Resource Identifier (URI), and the protocol version number, followed by a MIME-like message containing request modifiers, client information, and possibly body content. After receiving the request, the server replies with a status line containing the protocol version of the message and a success or error code, followed by a MIME-like message containing server information, entity metainformation, and possibly body content.
The process is like phoning a seller: we tell him what kind of product we need, and he tells us what is available and what is out of stock. In that case we communicate over a telephone line; HTTP communicates over TCP/IP.
Internally, an HTTP client/server exchange consists of four steps: establishing a connection, sending the request, sending the response, and closing the connection. In terms of the example above, this is the whole course of a telephone order.
Simply put, besides its HTML files, every web server runs a resident HTTP program (a daemon) that responds to user requests. Your browser is the HTTP client. When you enter a starting page in the browser or click a hyperlink, the browser sends an HTTP request to the IP address of the server named in the URL. The resident program receives the request, does whatever processing is necessary, and sends the requested file back. Along the way, the data sent and received over the network is divided into one or more packets, each containing the data to be transmitted plus control information that tells the network how to handle the packet. TCP/IP determines the format of each packet.
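The four steps can be tried out without any capture tool. The sketch below is a minimal illustration, not the setup used in this article: it starts a throwaway server on the loopback address (standing in for the resident HTTP program) and talks to it over a plain TCP socket. The `HelloHandler` class and `fetch` function are names invented for this example, and the client sends `Connection: close` so that the end of the response is easy to detect.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Stand-in for the server's resident HTTP program."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(host: str, port: int, path: str = "/") -> bytes:
    # 1. Establish a connection (the TCP three-way handshake happens here).
    with socket.create_connection((host, port), timeout=5) as sock:
        # 2. Send the request information.
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        # 3. Receive the response information until the server closes its side.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # 4. Leaving the with-block closes the connection on the client side.
    return b"".join(chunks)


server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: the OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
response = fetch("127.0.0.1", server.server_address[1])
server.shutdown()
server.server_close()

status_line = response.split(b"\r\n", 1)[0]
print(status_line.decode("ascii"))
```

Capturing on the loopback interface while this runs should show the same four phases that the packet analysis in this article walks through.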
V. Data Packet Analysis
Now let's analyze the captured data packets to see how the HTTP protocol establishes a connection, sends request information, sends response information, and closes the connection.
1. Establish a connection
Packets 1, 2, and 3 show the connection being established by the three-way handshake of TCP, the protocol underneath HTTP, as shown in Figure 3.
(Figure 3 establish a connection)
Note: the three packets selected in Figure 3 are the TCP three-way handshake. They show that HTTP communication runs over TCP, by default on TCP port 80, so HTTP gets reliable delivery from TCP.
To see the three-way handshake in detail, look at the decoded packets:
The decodes of the first three packets show their TCP flags, which reflect the three-way handshake: the client sends a SYN segment to the web server to request a connection; the web server replies with a SYN/ACK segment, which accepts the client's request and starts synchronization in the opposite direction; the client then acknowledges with an ACK, and the TCP connection is established.
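The relationship between the sequence and acknowledgment numbers in those three segments can be sketched as follows. This is a simplified model rather than a TCP implementation: `three_way_handshake` is a hypothetical helper, the initial sequence numbers are made-up values, and options such as MSS or window size are ignored.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake segments as simple dictionaries."""
    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = {"from": "client", "flags": "SYN", "seq": client_isn, "ack": None}
    # 2. Server -> client: SYN/ACK. The SYN consumed one sequence number,
    #    so the server acknowledges client_isn + 1.
    syn_ack = {
        "from": "server",
        "flags": "SYN,ACK",
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_SPACE,
    }
    # 3. Client -> server: ACK acknowledging the server's SYN.
    ack = {
        "from": "client",
        "flags": "ACK",
        "seq": (client_isn + 1) % SEQ_SPACE,
        "ack": (server_isn + 1) % SEQ_SPACE,
    }
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

In a real capture, comparing the Sequence Number and Ack Number fields of packets 1 to 3 should show the same "plus one" pattern.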
2. Send request information
Next, look at the fourth packet, the HTTP request sent by host 192.168.0.92, as shown in Figure 4.
(Figure 4 send request information)
Figure 4 shows, from the packet decode, the main features of an HTTP request. Once the connection is open, the client sends the request message to the port the server is listening on, which completes the request step.
Figure 4 HTTP Request Message
HTTP Command: // method field; here the GET method is used
URI: // URI field; the request is sent to the server that hosts the website
HTTP Version: // protocol version field; here HTTP/1.1
Accept: // list of media types the client will accept in the response
Accept-Language: // preferred language of the response, here Simplified Chinese; otherwise the server's default is used
Accept-Encoding: // content codings the client will accept in the response, here gzip and deflate
User-Agent: // identifies the user agent, that is, the browser sending the request; here Mozilla/4.0
Host: www.csna.cn\r\n // host name of the target server
Connection: keep-alive\r\n // asks the server to use a persistent connection
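What the analyzer's decoder does with a request message can be approximated in a few lines. The sketch below is a simplified parser, not a complete HTTP implementation. The `parse_request` function is a name chosen for this example, and the sample message is reconstructed from the field list above: the `/` request URI and the `Accept`, `Accept-Language`, and `Accept-Encoding` values are assumed, since the article does not show them.

```python
def parse_request(raw: str):
    """Split a raw HTTP request into method, URI, version, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")  # a blank line ends the headers
    lines = head.split("\r\n")
    method, uri, version = lines[0].split(" ", 2)  # the request line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, uri, version, headers, body


# Sample values marked "assumed" are illustrative, not taken from the capture.
raw_request = (
    "GET / HTTP/1.1\r\n"                    # URI "/" assumed
    "Accept: */*\r\n"                       # assumed
    "Accept-Language: zh-cn\r\n"            # assumed tag for Simplified Chinese
    "Accept-Encoding: gzip, deflate\r\n"    # assumed
    "User-Agent: Mozilla/4.0\r\n"
    "Host: www.csna.cn\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

method, uri, version, headers, body = parse_request(raw_request)
print(method, uri, version)
print(headers["host"], headers["connection"])
```

Storing header names in lower case makes lookups independent of how the client capitalized them, which is why `HOST` and `Host` name the same field.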
3. Send Response Information
Packet 6 is the server's response, as shown in Figure 5.
(Figure 5 Response Message)
Decode analysis:
After processing the client's request, the server sends a response message back to the client.
Figure 5 HTTP Response Message
HTTP Version: HTTP/1.1 // the server uses HTTP/1.1
HTTP Status: 200 // the request succeeded; the requested information is carried in the response message
Date: // date and time at which the server generated and sent the response message
Server: // the message was generated by an Apache/2.0.52 server
X-Powered-By: // the page is a dynamic page generated with PHP (the value gives the version)
Set-Cookie: // cookie the server asks the client to store and send back on later requests
Vary: // request header fields that influenced which response was selected (used by caches)
Content-Length: // length of the entity body in bytes
Connection: // tells the client to keep the connection open after this response is sent
Content-Type: // the entity body is an HTML document
Binary Data: // the entity body itself
Note: the status code in the server's response tells us the outcome of the request.
The first digit of the status code indicates the class of response. The classes are:
1×× informational (reserved in HTTP/1.0)
2×× success: the request was received, understood, and accepted
3×× redirection: further action is needed to complete the request
4×× client error
5×× server error
In the packet we caught, the status code is 200, indicating that the request was successfully accepted.
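Reading the status line and classifying the code by its first digit can be sketched like this. Both helpers, `parse_status_line` and `status_class`, are hypothetical names for this example, and the class labels follow the list above.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_status_line(line: str):
    """Split a status line such as 'HTTP/1.1 200 OK' into its three parts."""
    parts = line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""  # the reason phrase may be absent
    return version, code, reason


def status_class(code: int) -> str:
    """Map a three-digit status code to its class using the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


version, code, reason = parse_status_line("HTTP/1.1 200 OK")
print(version, code, reason, "->", status_class(code))  # HTTP/1.1 200 OK -> success
```

The same function places a code such as 404 in the client error class and 503 in the server error class.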
4. Close the connection
The last four packets (52, 53, 54, and 55) close the connection. Establishing a connection takes a three-way handshake, but terminating one takes four steps. This is because a TCP connection is full-duplex, so each direction must be closed separately; the four-step close is really two separate closes, one per direction. It is not covered in detail here.
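As a rough picture of why the close takes four segments, the exchange can be modeled in the same style as a handshake. This is a simplified sketch under stated assumptions: `four_way_close` is a hypothetical helper, the sequence numbers are made up, no data is sent between the segments, and the common case where the second and third segments are merged into one is ignored.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def four_way_close(active_seq: int, passive_seq: int) -> list:
    """Return the four segments of a TCP close, one FIN/ACK pair per direction."""
    nxt = lambda n: (n + 1) % SEQ_SPACE  # a FIN consumes one sequence number
    return [
        # Direction 1: the side closing first sends FIN, the peer acknowledges it.
        {"from": "active", "flags": "FIN,ACK", "seq": active_seq, "ack": passive_seq},
        {"from": "passive", "flags": "ACK", "seq": passive_seq, "ack": nxt(active_seq)},
        # Direction 2: the peer sends its own FIN, the first side acknowledges it.
        {"from": "passive", "flags": "FIN,ACK", "seq": passive_seq, "ack": nxt(active_seq)},
        {"from": "active", "flags": "ACK", "seq": nxt(active_seq), "ack": nxt(passive_seq)},
    ]


for segment in four_way_close(active_seq=3000, passive_seq=9000):
    print(segment)
```

Each direction needs its own FIN and its own ACK, which is where the count of four comes from.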
VI. Summary
After this walkthrough, I think we have learned a good deal about the HTTP protocol. You may notice that what the analysis software shows in the packet decodes differs a little from the textbook description and carries more information; real traffic is simply more varied than the theory. Here I only wanted to share a way of learning. The TCP/IP protocol family is complicated and a good learning method matters, so I hope you will share your own methods too. Let's keep exchanging ideas.