Interpreting HTTP packets: [Abstract] This article describes the HTTP packet format, the protocol content, and related processing methods. It is divided into three sections: I. Hypertext Transfer Protocol and HTTP packets; II. Socket and ServerSocket; III. Reading HTTP packets.
I. Hypertext Transfer Protocol and HTTP packet
HTTP is used to send and receive messages over the Internet. It is a request-response protocol: a client sends a request and the server returns a response to it. All requests and responses are HTTP packets. HTTP runs over reliable TCP connections, and the default port is 80. The first version was HTTP/0.9, which was followed by HTTP/1.0 and then HTTP/1.1. HTTP/1.1, the version covered here, was defined by RFC 2616 (since superseded by newer RFCs).
In HTTP, the session between client and server is always initiated by the client, which establishes a connection and sends an HTTP request packet. The server does not actively contact the client or ask for a connection to it. Either the browser or the server can close the connection at any time. For example, you can click the browser's "Stop" button at any time to interrupt a file download and close the HTTP connection to the web server.
1. HTTP request packet
An HTTP request packet (GET, POST, or another request method) is composed of three parts: method-URI-protocol/version, request headers, and request body. The following is an example of an HTTP request packet (GET):
GET /index.jsp HTTP/1.1
Accept-Language: zh-CN
Connection: keep-alive
Host: 192.168.0.106
Content-Length: 37

username=new_andy&password=new_andy
The first line of the request packet is method-URI-protocol/version:
GET is the request method. According to the HTTP standard, a request can use one of several request methods. HTTP/1.1 supports seven request methods: GET, POST, HEAD, OPTIONS, PUT, DELETE, and TRACE. The most common are GET and POST.
/index.jsp is the URI. The URI specifies the network resource to be accessed.
HTTP/1.1 is the protocol and protocol version.
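As an illustration of the three parts just described, the following is a minimal sketch that splits a request line into method, URI, and protocol/version. The class name RequestLine and its fields are hypothetical and are not part of the article's code; the sketch assumes the parts are separated by single spaces.

```java
/** Hypothetical helper (not from the article): holds the three parts of a request line. */
final class RequestLine {
    final String method;
    final String uri;
    final String version;

    private RequestLine(String method, String uri, String version) {
        this.method = method;
        this.uri = uri;
        this.version = version;
    }

    /** Splits a line such as "GET /index.jsp HTTP/1.1" on single spaces. */
    static RequestLine parse(String line) {
        String[] parts = line.trim().split(" ");
        if (parts.length != 3 || !parts[2].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        RequestLine r = parse("GET /index.jsp HTTP/1.1");
        System.out.println(r.method + " | " + r.uri + " | " + r.version);
    }
}
```

A line without the separating spaces, such as "GET/index.jsp HTTP/1.1", is rejected with an IllegalArgumentException in this sketch.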
The last line, username=new_andy&password=new_andy, is the body. The body is separated from the HTTP headers by an empty line (\r\n). One point needs explaining here: Content-Length gives the length of the body in bytes. Some packets do not state the body length in the header and instead specify Transfer-Encoding: chunked. For details on how the length of a chunked body is determined, see RFC 2616, section 3.6.1.
The request headers also contain a lot of useful information about the client environment and the request body, which is not described here.
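To make the chunked case more concrete, here is a minimal sketch of decoding a chunked body: each chunk starts with a hexadecimal size line, is followed by that many bytes and a CRLF, and a size of 0 ends the body. The class ChunkedBodySketch and its methods are hypothetical names used only for illustration; the sketch ignores chunk extensions and discards any trailer headers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ChunkedBodySketch {

    /** Reads one line up to LF and returns it without CR/LF. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') break;
            if (b != '\r') line.write(b);
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Decodes a chunked body; the stream must be positioned just after the header. */
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semicolon = sizeLine.indexOf(';'); // drop chunk extensions, if any
            if (semicolon != -1) sizeLine = sizeLine.substring(0, semicolon);
            int size = Integer.parseInt(sizeLine.trim(), 16); // the size is hexadecimal
            if (size == 0) break; // last chunk
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) { // one read may deliver only part of the chunk
                int n = in.read(chunk, off, size - off);
                if (n == -1) throw new IOException("Stream ended inside a chunk");
                off += n;
            }
            body.write(chunk, 0, size);
            readLine(in); // the CRLF that follows the chunk data
        }
        while (!readLine(in).isEmpty()) {
            // skip optional trailer headers up to the final empty line
        }
        return body.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(decode(new ByteArrayInputStream(wire)), StandardCharsets.US_ASCII));
    }
}
```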
2. HTTP response packet
Like an HTTP request packet, an HTTP response packet consists of three parts: protocol-status code-description, response headers, and response body. The following is an example of an HTTP response:
HTTP/1.1 200 OK
Server: Microsoft-IIS/4.0
Date: Mon, 3 Jan 2005 13:13:33 GMT
Content-Type: text/html
Last-Modified: Mon, 11 Jan 2004 13:23:42 GMT
Content-Length: 90

<html>
<head>
<title>example of interpreting HTTP packets</title>
</head>
<body>
Hello world!
</body>
</html>
The first line of the HTTP response packet is similar to the first line of the HTTP request. It indicates that the protocol used is HTTP/1.1 and that the server processed the request with status code 200 (OK).
Like the request headers, the response headers contain a lot of useful information, such as the server type, date and time, content type, and content length. The response body is the HTML page returned by the server. The response headers and body are also separated by an empty line (CRLF).
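The structure of a response can be sketched in code as follows. This is a minimal illustration, assuming the whole response is already available as one string; the class ResponseSketch and its field names are hypothetical and not part of the article's code.

```java
import java.util.Map;
import java.util.TreeMap;

final class ResponseSketch {
    final int statusCode;
    final String reason;
    // header names are case-insensitive, so look them up ignoring case
    final Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    final String body;

    ResponseSketch(String raw) {
        int split = raw.indexOf("\r\n\r\n"); // the empty line between headers and body
        if (split == -1) {
            throw new IllegalArgumentException("No empty line after the headers");
        }
        String[] lines = raw.substring(0, split).split("\r\n");
        String[] status = lines[0].split(" ", 3); // protocol, status code, description
        if (status.length < 2) {
            throw new IllegalArgumentException("Malformed status line: " + lines[0]);
        }
        statusCode = Integer.parseInt(status[1]);
        reason = status.length == 3 ? status[2] : "";
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(), lines[i].substring(colon + 1).trim());
            }
        }
        body = raw.substring(split + 4);
    }

    public static void main(String[] args) {
        ResponseSketch r = new ResponseSketch(
                "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 12\r\n\r\nHello world!");
        System.out.println(r.statusCode + " " + r.reason + " " + r.headers + " " + r.body);
    }
}
```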
II. Socket and ServerSocket
In Java, a communication endpoint is represented by the java.net.Socket class (client) or the java.net.ServerSocket class (server). An application sends data to the network, or reads data from it, through the endpoint. Applications on two different machines communicate by sending and receiving byte streams over a network connection. To send an HTTP packet to another application, you must first know the IP address of the other party and the port number of its communication endpoint.
The Socket class represents the client: an endpoint created temporarily when connecting to a remote server application.
The ServerSocket class represents the server. Once started, it waits for connection requests from clients. When a request arrives, ServerSocket creates a Socket instance to handle communication with that client. A server application cannot know when a client will attempt to connect, so the server must remain in the waiting state.
ServerSocket provides four constructors. One commonly used form is:
public ServerSocket(int port, int backlog, InetAddress bindingAddress);
Parameters: port specifies the port on which the server listens for clients;
backlog is the maximum queue length for pending connection requests. Once the queue is full, the server endpoint rejects further client connection requests.
bindingAddress is an instance of java.net.InetAddress that specifies the local IP address to bind to.
After the ServerSocket instance is created, call its accept method to wait for incoming connection requests. The accept method returns only when a connection request arrives. The return value is an instance of the Socket class, which can then be used to communicate with the client application.
The Socket class has many constructors. A commonly used one is:
public Socket(String host, int port);
The host parameter is the host name (domain name or IP address) of the remote machine, and port is the port number of the remote application.
After a Socket instance is successfully created, we can use it to send and receive data as byte streams. The data is generally an HTTP packet.
To send a byte stream, first call the getOutputStream method of the Socket class to obtain a java.io.OutputStream object. To receive a byte stream from the other end of the connection, first call the getInputStream method of the Socket class to obtain a java.io.InputStream object.
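The interplay of ServerSocket, accept, and the two stream methods can be tried out in a single process. The following is a minimal sketch under a few assumptions that are not from the article: the class name LoopbackSketch is hypothetical, port 0 is used so that the operating system picks a free port, and the server is bound to the loopback address only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class LoopbackSketch {

    /** Sends message from a client Socket and returns what the accepted Socket read. */
    static String roundTrip(String message) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // port 0: any free port; backlog 50; listen on the loopback address only
        try (ServerSocket server = new ServerSocket(0, 50, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) { // the pending connection is already queued
            OutputStream out = client.getOutputStream();
            out.write(message.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            client.shutdownOutput(); // tells the server side that no more data follows

            InputStream in = accepted.getInputStream();
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) { // read until end of stream
                received.write(buffer, 0, n);
            }
            return new String(received.toByteArray(), StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.print(roundTrip("GET /index.jsp HTTP/1.1\r\n\r\n"));
    }
}
```

A real server would call accept in a loop and hand each returned Socket to its own thread instead of handling one connection inline.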
The following code snippet creates a socket for communication with the local HTTP server (127.0.0.1 is the IP address of the local host), sends an HTTP request packet, and prepares to receive the server's response.
// requires java.io.InputStream, java.io.OutputStream and java.net.Socket
Socket socket = new Socket("127.0.0.1", 80); // the port is an int, not a string
OutputStream os = socket.getOutputStream();
InputStream ins = socket.getInputStream();
StringBuffer sb = new StringBuffer();
sb.append("GET /index.jsp HTTP/1.1\r\n"); // note that \r\n is a carriage return plus line feed
sb.append("Accept-Language: zh-CN\r\n");
sb.append("Connection: keep-alive\r\n");
sb.append("Host: 192.168.0.106\r\n");
sb.append("Content-Length: 37\r\n");
sb.append("\r\n"); // the empty line that ends the headers
sb.append("username=new_andy&password=new_andy\r\n"); // body: 35 bytes plus CRLF = 37 bytes
// Send the HTTP request packet to the web server
os.write(sb.toString().getBytes());
os.flush();
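The snippet above hard-codes the Content-Length value. As a hedged alternative, the length can be computed from the encoded body so that the header and the body cannot drift apart. The sketch below is illustrative only: the class RequestBuilderSketch and the method buildPost are hypothetical names, and the use of POST with a Content-Type of application/x-www-form-urlencoded is an assumption, not something the article prescribes.

```java
import java.nio.charset.StandardCharsets;

final class RequestBuilderSketch {

    /** Builds a POST packet whose Content-Length is derived from the body bytes. */
    static byte[] buildPost(String host, String uri, String form) {
        byte[] body = form.getBytes(StandardCharsets.UTF_8);
        String head = "POST " + uri + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "\r\n"; // empty line that ends the headers
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);
        byte[] packet = new byte[headBytes.length + body.length];
        System.arraycopy(headBytes, 0, packet, 0, headBytes.length);
        System.arraycopy(body, 0, packet, headBytes.length, body.length);
        return packet;
    }

    public static void main(String[] args) {
        byte[] packet = buildPost("192.168.0.106", "/index.jsp",
                "username=new_andy&password=new_andy");
        System.out.print(new String(packet, StandardCharsets.UTF_8));
    }
}
```

The returned array can be passed directly to the write method of the socket's OutputStream.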
The general structure of the server code is as follows:
while (!shutdown) {
    Socket socket = null;
    try {
        socket = serverSocket.accept(); // wait for a client to connect and send an HTTP request packet
        // create a thread to process the HTTP request packet
        RequestThread request = new RequestThread(socket);
        request.start();
        if (shutdown) System.exit(0);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
The RequestThread thread analyzes the HTTP request packet and generates an HTTP response packet on the server based on its content. It works with the two streams of the socket:
InputStream input = socket.getInputStream(); // the HTTP request packet is read from this byte stream
OutputStream output = socket.getOutputStream(); // the HTTP response packet is written to this byte stream
The next section describes how to analyze HTTP packets.
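The article does not show the RequestThread class itself. The following is only a guess at its general shape, under stated assumptions: the class name RequestThreadSketch is hypothetical, the request is read just up to the empty line that ends the headers, and the response is a fixed plain-text body rather than anything derived from the request.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class RequestThreadSketch extends Thread {
    private static final byte[] END = {13, 10, 13, 10}; // \r\n\r\n ends the header
    private final Socket socket;

    RequestThreadSketch(Socket socket) {
        this.socket = socket;
    }

    @Override
    public void run() {
        try (Socket s = socket) { // close the connection when done
            InputStream input = s.getInputStream();
            OutputStream output = s.getOutputStream();

            // consume the request header byte by byte until \r\n\r\n is seen
            int matched = 0;
            int b;
            while (matched < END.length && (b = input.read()) != -1) {
                matched = (b == END[matched]) ? matched + 1 : (b == '\r' ? 1 : 0);
            }

            // fixed response; a real handler would build it from the request
            byte[] body = "Hello world!".getBytes(StandardCharsets.US_ASCII);
            String head = "HTTP/1.1 200 OK\r\n"
                    + "Content-Type: text/plain\r\n"
                    + "Content-Length: " + body.length + "\r\n"
                    + "Connection: close\r\n"
                    + "\r\n";
            output.write(head.getBytes(StandardCharsets.US_ASCII));
            output.write(body);
            output.flush();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```

In the server loop it would be started in the same way as RequestThread, for example with new RequestThreadSketch(serverSocket.accept()).start().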
III. Reading HTTP packets
The following SocketRequest class reads HTTP packets.
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

public class SocketRequest { // reads data from the InputStream of the specified socket
    private InputStream input;
    private String uri;
    private StringBuffer request = new StringBuffer(); // holds everything read so far
    private int content_length = 0; // length of the packet body
    private boolean bePost = false;
    private boolean beHttpResponse = false;
    private boolean beChucked = false;
    private boolean beGet = false;
    private byte crlf13 = (byte) 13; // '\r'
    private byte crlf10 = (byte) 10; // '\n'

    public SocketRequest(InputStream input) {
        this.input = input;
    }

    public SocketRequest(Socket socket) throws IOException {
        this.input = socket.getInputStream();
    }
    public void readData() { // parse the InputStream data
        readHeader(); // header
        if (beChucked) { // the body is chunked
            int chuckSize = 0;
            while ((chuckSize = getChuckSize()) > 0) { // there may be several chunks
                readLenData(chuckSize + 2); // chunk data plus its trailing CRLF
            }
            readLenData(2); // the final CRLF
        }
        if (content_length > 0) {
            readLenData(content_length); // read the fixed-length body
        }
        uri = ""; // parseUri(new String(request));
    }
    private void readLenData(int size) { // read exactly size bytes
        int readed = 0; // number of bytes read so far
        try {
            while (readed < size) {
                int available = size - readed; // remaining number of bytes
                if (available > 2048) {
                    available = 2048; // read at most 2048 bytes at a time
                }
                byte[] buffer = new byte[available];
                // read blocks until some data arrives and may return fewer bytes than requested
                int reading = input.read(buffer);
                if (reading == -1) {
                    break; // the stream ended before size bytes arrived
                }
                request.append(new String(buffer, 0, reading)); // append the bytes read
                readed += reading;
            }
        } catch (IOException e) {
            System.out.println("read readLenData error!");
        }
    }
    private void readHeader() { // read the header and determine the body size
        byte[] crlf = new byte[1];
        int crlfNum = 0; // number of consecutive CR/LF bytes; crlfNum == 4 marks the end of the header
        try {
            while (input.read(crlf) != -1) { // read the header byte by byte
                if (crlf[0] == crlf13 || crlf[0] == crlf10) {
                    crlfNum++;
                } else {
                    crlfNum = 0; // reset on any other byte
                }
                request.append(new String(crlf, 0, 1)); // append the byte read
                if (crlfNum == 4) {
                    break;
                }
            }
        } catch (IOException e) {
            System.out.println("read HTTP header error!");
            return;
        }
        String tempStr = new String(request).toUpperCase();
        // only the GET and POST methods are handled here
        String strMethod = tempStr.length() >= 4 ? tempStr.substring(0, 4) : tempStr;
        if (strMethod.equals("GET ")) { // note the trailing space
            beGet = true;
        } else if (strMethod.equals("POST")) {
            bePost = true;
            getContentLen_chucked(tempStr);
        } else {
            System.out.println("unsupported HTTP packet type"); // other types are not supported yet
        }
    }
    private void getContentLen_chucked(String tempStr) { // get Content-Length, or detect chunked encoding
        // tempStr has been converted to upper case, so the header names are matched in upper case
        String ss1 = "CONTENT-LENGTH:";
        String ss2 = "TRANSFER-ENCODING: CHUNKED";
        int clIndex = tempStr.indexOf(ss1);
        int chuckIndex = tempStr.indexOf(ss2); // chunked type
        byte[] requst = tempStr.getBytes();
        if (clIndex != -1) { // the value runs from the end of the header name to \r\n
            StringBuffer sb = new StringBuffer();
            for (int i = clIndex + ss1.length(); i < requst.length; i++) {
                if (requst[i] != (byte) 13 && requst[i] != (byte) 10) {
                    sb.append((char) requst[i]);
                } else {
                    break;
                }
            }
            content_length = Integer.parseInt(sb.toString().trim()); // size of the body
            // System.out.println("content_length = " + content_length);
        }
        if (chuckIndex != -1) {
            beChucked = true;
        }
    }
    private int getChuckSize() { // read the chunk-size line and return the chunk size
        byte[] crlf = new byte[1];
        StringBuffer sb1 = new StringBuffer();
        int crlfNum = 0; // number of consecutive CR/LF bytes; crlfNum == 2 marks the end of the chunk-size line
        try {
            while (input.read(crlf) != -1) { // read the chunk-size line byte by byte
                if (crlf[0] == crlf13 || crlf[0] == crlf10) {
                    crlfNum++;
                } else {
                    crlfNum = 0; // reset on any other byte
                }
                sb1.append((char) crlf[0]);
                request.append(new String(crlf, 0, 1)); // append the byte read
                if (crlfNum == 2) {
                    break;
                }
            }
        } catch (IOException e) {
            System.out.println("read HTTP packet error!");
            return 0;
        }
        String size = sb1.toString().trim();
        if (size.isEmpty()) {
            return 0; // the stream ended without a chunk-size line
        }
        return Integer.parseInt(size, 16); // the chunk size is hexadecimal
    }
    // use this method to check whether the HTTP packet is addressed to the target server
    private String parseUri(String requestString) {
        int index1, index2;
        index1 = requestString.indexOf(' ');
        if (index1 != -1) {
            index2 = requestString.indexOf(' ', index1 + 1);
            if (index2 > index1) {
                return requestString.substring(index1 + 1, index2);
            }
        }
        return null;
    }

    public String getData() {
        return request.toString();
    }
}
Using this class:
SocketRequest request = new SocketRequest(socket); // socket is the Socket instance returned by ServerSocket.accept()
request.readData(); // read the data
String data = request.getData(); // the complete packet as text
Why go to so much trouble to read the data? When data is sent over a socket connection, network latency is common. When the server starts receiving, it may be able to get only part of the data from the InputStream. If the data is processed before all of it has arrived, the result may be incomplete or wrong.
There are several methods for reading bytes from an InputStream.
The most commonly used are int read() and int read(byte[] b). Programmers often make mistakes with read(byte[]), because in a network environment the number of bytes actually read is not necessarily equal to the size of the array passed in.
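The pitfall can be reproduced without a network. In the minimal sketch below, the hypothetical helper trickle simulates a slow connection by handing out at most a few bytes per call, and the hypothetical readFully keeps calling read until the requested number of bytes has arrived. Both names are assumptions made for illustration and are not part of the article's code.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class ReadFullySketch {

    /** Loops until exactly size bytes have been read, or fails if the stream ends first. */
    static byte[] readFully(InputStream in, int size) throws IOException {
        byte[] buffer = new byte[size];
        int offset = 0;
        while (offset < size) {
            int n = in.read(buffer, offset, size - offset); // may return fewer bytes than asked for
            if (n == -1) {
                throw new EOFException("Expected " + size + " bytes, got " + offset);
            }
            offset += n;
        }
        return buffer;
    }

    /** Test double: delivers at most max bytes per read call, like a slow network. */
    static InputStream trickle(byte[] data, int max) {
        return new ByteArrayInputStream(data) {
            @Override
            public int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, max));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "0123456789".getBytes();
        int first = trickle(data, 3).read(new byte[10], 0, 10);
        System.out.println("single read returned " + first + " bytes");
        System.out.println("readFully returned " + readFully(trickle(data, 3), 10).length + " bytes");
    }
}
```

The standard library offers the same guarantee through DataInputStream.readFully, which may be preferable to a hand-written loop.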