HTTP protocol analysis for Android development
HTTP request model
1. Connect to the Web server. A client application (such as a Web browser) opens a socket to the Web server's HTTP port (80 by default).
Example: http://www.myweb.com:8080/index.html
In Java, this is equivalent to the following code:
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
Socket socket = new Socket("www.myweb.com", 8080);
InputStream in = socket.getInputStream();
OutputStream out = socket.getOutputStream();
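The mapping from the example URL to the socket parameters can be sketched as follows. This is only an illustration built on java.net.URI; the class name UrlPartsSketch and its helper methods are hypothetical, and the fallback to port 80 when the URL names no port simply mirrors the default mentioned above.

```java
import java.net.URI;

final class UrlPartsSketch {
    /** Host name to pass to the Socket constructor. */
    static String host(String url) {
        return URI.create(url).getHost();
    }

    /** Port to connect to; URI reports -1 when the URL names no port. */
    static int port(String url) {
        int p = URI.create(url).getPort();
        return p == -1 ? 80 : p;
    }

    /** Path that would appear as the request URI in the request line. */
    static String requestUri(String url) {
        String path = URI.create(url).getRawPath();
        return (path == null || path.isEmpty()) ? "/" : path;
    }

    public static void main(String[] args) {
        String url = "http://www.myweb.com:8080/index.html";
        System.out.println(host(url) + " " + port(url) + " " + requestUri(url));
    }
}
```

Here the host and port feed the socket, while the path is what the client later writes into the request line.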
2. Send an HTTP request over the connection. The client writes an ASCII text request line, followed by zero or more HTTP headers, one blank line, and any request data.
A request consists of four parts: the request line, the request headers, a blank line, and the request data.
1. Request line: the request line consists of three tokens: the request method, the request URI, and the HTTP version, separated by spaces.
Example: GET /index.html HTTP/1.1
The HTTP specification defines eight possible request methods:
GET: a simple request to retrieve the resource identified by the URI.
HEAD: the same as GET, except that the server returns only the status line and headers, not the requested document.
POST: a request for the server to accept the data that the client writes to its output stream.
PUT: a request for the server to store the request data as the new content of the specified URI.
DELETE: a request for the server to delete the resource named in the URI.
OPTIONS: a request for the list of request methods that the server supports.
TRACE: a request for the Web server to echo back the HTTP request and its headers.
CONNECT: a documented method name that HTTP/1.1 reserves for tunneling through a proxy.
2. Request headers: these consist of key/value pairs, one pair per line, with the key and value separated by a colon.
Request headers tell the server about the client's capabilities and identity. Typical request headers are:
User-Agent: the client's vendor and version
Accept: the list of content types that the client can recognize
Content-Length: the number of bytes of data appended to the request
3. Blank line: the last request header is followed by an empty line, sent as a carriage return and line feed (CRLF), which tells the server that no more headers follow.
4. Request data: data is usually sent with POST, and the Content-Type and Content-Length headers typically accompany it.
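The four parts of a request can be assembled in order, as in the following minimal sketch. The class name RequestBuilderSketch, the form-encoded body, and the header values are assumptions chosen for illustration, not part of any particular client library.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class RequestBuilderSketch {
    private static final String CRLF = "\r\n";

    /** Joins request line, headers, blank line, and body into one message. */
    static String build(String method, String uri, Map<String, String> headers, String body) {
        StringBuilder sb = new StringBuilder();
        // 1. Request line: method, URI, and version separated by spaces.
        sb.append(method).append(' ').append(uri).append(" HTTP/1.1").append(CRLF);
        // 2. Request headers: one "name: value" pair per line.
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append(CRLF);
        }
        // 3. Blank line: a bare CRLF ends the header section.
        sb.append(CRLF);
        // 4. Request data (empty for a typical GET).
        sb.append(body);
        return sb.toString();
    }

    public static void main(String[] args) {
        String body = "name=value"; // hypothetical form data
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", "www.myweb.com");
        headers.put("Content-Type", "application/x-www-form-urlencoded");
        headers.put("Content-Length",
                String.valueOf(body.getBytes(StandardCharsets.US_ASCII).length));
        System.out.print(build("POST", "/index.html", headers, body));
    }
}
```

The resulting string is what a client would write to the socket's output stream; Content-Length is computed from the encoded body so that the server knows where the request data ends.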
3. The server accepts the request and returns an HTTP response. The Web server parses the request and locates the specified resource, then writes a copy of the resource to the socket, where the client reads it.
A response consists of four parts: the status line, the response headers, a blank line, and the response data.
1. Status line: the status line consists of three tokens: the HTTP version, the response code, and the response description.
HTTP version: tells the client the highest HTTP version that the server understands.
Response code: a three-digit numeric code that indicates whether the request succeeded or failed and, if it failed, why.
Response description: a human-readable explanation of the response code.
Example: HTTP/1.1 200 OK
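HTTP response codes:
1xx: informational; the request was received and processing continues
2xx: success; the request was received, understood, and accepted
3xx: redirection; further action must be taken to complete the request
4xx: client error; the request contains a syntax error or cannot be fulfilled
5xx: server error; the server failed to fulfill an apparently valid request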
2. Response headers: like request headers, they describe the server's capabilities and identify details of the response data.
3. Blank line: the last response header is followed by an empty line, sent as a carriage return and line feed (CRLF), which indicates that no more headers follow.
4. Response data: the HTML document, image, or other resource itself.
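A response with these four parts can be taken apart in the same order. The following is a rough sketch, assuming the whole response is already available as text; the class name ResponseParserSketch is hypothetical, and a real client would read bytes from the socket and treat header names case-insensitively.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ResponseParserSketch {
    final String version;
    final int code;
    final String description;
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;

    ResponseParserSketch(String raw) {
        // The first blank line (CRLF CRLF) separates the headers from the data.
        int split = raw.indexOf("\r\n\r\n");
        if (split < 0) {
            throw new IllegalArgumentException("no blank line after headers");
        }
        String[] lines = raw.substring(0, split).split("\r\n");
        body = raw.substring(split + 4);

        // Status line: HTTP version, three-digit code, description.
        String[] status = lines[0].split(" ", 3);
        version = status[0];
        code = Integer.parseInt(status[1]);
        description = status.length > 2 ? status[2] : "";

        // Every remaining line is a "name: value" header pair.
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                        lines[i].substring(colon + 1).trim());
            }
        }
    }
}
```

For example, new ResponseParserSketch(text).code gives the numeric status, and headers.get("Content-Length") gives the declared size of the data.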
4. The server closes the connection, and the browser parses the response.
1. The browser first parses the status line and checks the status code to see whether the request succeeded.
2. It then parses each response header; the headers describe the HTML that follows, including how many bytes to expect.
3. It reads the response data (the HTML), formats it according to HTML syntax and semantics, and displays it in the browser window.
4. An HTML document may contain references to other resources that must be loaded. The browser recognizes these references and makes additional requests for those resources. This process may repeat many times.
5. Stateless connections. The HTTP model is stateless: when processing a request, the Web server does not remember earlier requests from the same client.
6. Examples
1. The browser sends the request: GET /index.html HTTP/1.1
Server Response
HTTP/1.1 200 OK
Date: Apr 11 2006 15:32:08 GMT
Server: Apache/2.0.46 (win32)
Content-Length: 119
Content-Type: text/html
2. The browser sends the request: GET /index.css HTTP/1.1
Server Response
HTTP/1.1 200 OK
Date: Apr 11 2006 15:32:08 GMT
Server: Apache/2.0.46 (win32)
Connection: Keep-alive, close
Content-Length: 70
Content-Type: text/plain
h3 {
  font-size: 20px;
  font-weight: bold;
  color: #005A9C;
}
3. The browser sends the request: GET http://www.bkjia.com/uploads/allimg/140921/041635H57-0.png HTTP/1.1
Server Response
HTTP/1.1 200 OK
Date: Apr 11 2006 15:32:08 GMT
Server: Apache/2.0.46 (win32)
Connection: Keep-alive, close
Content-Length: 1280
Content-Type: text/plain
{Binary image data follows}
(Appendix)
1. HTTP specification: the RFCs (Requests for Comments) published by the Internet Engineering Task Force (IETF) define Internet standards and are widely accepted across the Internet research and development community. Because they are standards documents, they are generally written in formal language, much like legislation.
2. RFC: once an RFC is published, it is assigned a number and is never changed. When a standard is revised, a new RFC is issued. As standards, RFCs are used widely across the Internet.
3. Several important RFCs for HTTP:
RFC 1945: the HTTP 1.0 specification
RFC 2068: the initial specification of HTTP 1.1
RFC 2616: the HTTP 1.1 standard
4. Resource identifier: URI (Uniform Resource Identifier)
HTTP reference
1. The HTTP response code consists of three decimal digits, which appear in the first line of the response sent by the HTTP server.
There are five types of response codes, represented by their first digit:
1. 1xx: informational; the request was received and processing continues.
2. 2xx: success; the request was received, understood, and accepted.
3. 3xx: redirection; further action must be taken to complete the request.
4. 4xx: client error; the request contains a syntax error or cannot be fulfilled.
5. 5xx: server error; the server failed to fulfill an apparently valid request.
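Because the class is carried by the first digit, a client can classify any response code with integer division. A minimal sketch follows; the class name StatusClassSketch and the returned labels are chosen here for illustration.

```java
final class StatusClassSketch {
    /** Maps a three-digit response code to its class, using the first digit. */
    static String classOf(int code) {
        switch (code / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default:
                throw new IllegalArgumentException("not an HTTP response code: " + code);
        }
    }

    public static void main(String[] args) {
        System.out.println(classOf(404));
    }
}
```

This lets a client handle an unfamiliar code (say, an unknown 4xx value) by falling back to the behavior of its class.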
The following table shows each response code and its meaning:
100 Continue
101 Switching Protocols
200 OK
201 Created
202 Accepted
203 Non-Authoritative Information
204 No Content
205 Reset Content
206 Partial Content
300 Multiple Choices
301 Moved Permanently
302 Found
303 See Other
304 Not Modified
305 Use Proxy
307 Temporary Redirect
400 Bad Request
401 Unauthorized
402 Payment Required
403 Forbidden
404 Not Found
405 Method Not Allowed
406 Not Acceptable
407 Proxy Authentication Required
408 Request Timeout
409 Conflict
410 Gone
411 Length Required
412 Precondition Failed
413 Request Entity Too Large
414 Request-URI Too Long
415 Unsupported Media Type
416 Requested Range Not Satisfiable
417 Expectation Failed
500 Internal Server Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout
505 HTTP Version Not Supported
2. HTTP headers are composed of key/value pairs. They describe the properties of the client or server, the resource being transferred, and the connection.
There are four types of headers:
1. General header: can be used in requests or responses, and relates to the transaction as a whole rather than to a specific resource.
2. Request header: lets the client pass information about itself and the form of response it wants.
3. Response header: lets the server pass information about itself and about the response.
4. Entity header: defines information about the resource being transferred; can be used in requests or responses.
Header format: header-name: header-value
The following table describes the header labels used in HTTP/1.1.
Accept defines the media types that the client can process, in order of priority.
Multiple types can be listed, separated by commas, and wildcards are allowed. Example: Accept: image/jpeg, image/png, */*
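How a server might test a media type against such a comma-separated list with wildcards can be sketched as follows. The class name AcceptSketch is hypothetical, and the sketch deliberately ignores any parameters after a semicolon, so it answers only whether a type is acceptable, not which type is preferred.

```java
final class AcceptSketch {
    /** Returns true if mediaType matches any entry of the Accept list. */
    static boolean accepts(String acceptHeader, String mediaType) {
        String[] wanted = mediaType.trim().split("/", 2);
        if (wanted.length != 2) {
            throw new IllegalArgumentException("not a media type: " + mediaType);
        }
        for (String item : acceptHeader.split(",")) {
            // Drop any parameters after ';' and surrounding spaces.
            String range = item.split(";", 2)[0].trim();
            String[] parts = range.split("/", 2);
            if (parts.length != 2) {
                continue; // skip malformed entries
            }
            boolean typeOk = parts[0].equals("*") || parts[0].equalsIgnoreCase(wanted[0]);
            boolean subtypeOk = parts[1].equals("*") || parts[1].equalsIgnoreCase(wanted[1]);
            if (typeOk && subtypeOk) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(accepts("image/jpeg, image/png, */*", "text/html"));
    }
}
```

With the example list above, every type is acceptable because of the trailing */* entry; without it, only the two image types would match.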
Accept-Charset defines the character sets that the client can process, in order of priority.
Multiple character sets can be listed, separated by commas, and wildcards are allowed. Example: Accept-Charset: iso-8859-1, *, utf-8
Accept-Encoding defines the encoding mechanisms that the client understands. Example: Accept-Encoding: gzip, compress
Accept-Language defines the list of natural languages that the client is willing to accept. Example: Accept-Language: en, de
Accept-Ranges is a response header that lets the server indicate that it accepts requests for part of a resource, given as an offset and length.
The value of this header is the unit in which request ranges are measured. Example: Accept-Ranges: bytes or Accept-Ranges: none
Age lets the server state how many seconds have elapsed since the response was generated.
This header is used mainly with cached responses. Example: Age: 30
Allow is a response header that lists the HTTP methods supported by the resource at the request URI. Example: Allow: GET, PUT
Authorization is a request header that carries the credentials required to access a resource (the authentication scheme and the encoded user ID and password).
Example: Authorization: Basic YXV0aG9yOnBoaWw=
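For the Basic scheme, the encoded part is the Base64 form of "user:password". The sketch below uses java.util.Base64; the class name BasicAuthSketch is hypothetical. The token in the example above decodes to author:phil, which the sketch reproduces.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuthSketch {
    /** Builds the value of an Authorization header for the Basic scheme. */
    static String headerValue(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    /** Recovers "user:password" from a Basic header value. */
    static String decode(String headerValue) {
        String token = headerValue.substring("Basic ".length()).trim();
        return new String(Base64.getDecoder().decode(token), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("Authorization: " + headerValue("author", "phil"));
    }
}
```

Base64 is an encoding, not encryption: anyone who sees the header can recover the password, which is why Basic credentials are normally sent only over an encrypted connection.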
Cache-Control is a general header used to define caching directives. Example: Cache-Control: max-age=30
Connection is a general header that indicates whether the socket connection should be kept open. Example: Connection: close or Connection: keep-alive
Content-Base is an entity header that defines the base URI against which relative URLs within the entity are resolved.
If no Content-Base header is defined, relative URLs are resolved against the Content-Location URI (if it is present and absolute) or against the request URI.
Example: Content-Base: http://www.myweb.com
Content-Encoding is a media type modifier that indicates how the entity is encoded. Example: Content-Encoding: gzip
Content-Language specifies the natural language of the data being sent. Example: Content-Language: en
Content-Length specifies the number of bytes of data in the request or response. Example: Content-Length: 382
Content-Location specifies the location (URI) of the resource contained in the request or response.
This URL also serves as the starting point for resolving relative URLs within the entity.
Example: Content-Location: http://www.myweb.com/news
Content-MD5 carries an MD5 digest of the entity, used as a checksum.
Both the sender and the receiver compute the MD5 digest, and the receiver compares its computed value with the value passed in this header.
Example: Content-MD5:
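The checksum comparison can be sketched with java.security.MessageDigest. This assumes the header value is the Base64 form of the 16-byte MD5 digest, as HTTP/1.1 specifies for Content-MD5; the class name ContentMd5Sketch and its methods are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

final class ContentMd5Sketch {
    /** Computes the Base64-encoded MD5 digest of the entity bytes. */
    static String digest(byte[] entity) {
        try {
            byte[] md5 = MessageDigest.getInstance("MD5").digest(entity);
            return Base64.getEncoder().encodeToString(md5);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Receiver side: recompute the digest and compare it with the header. */
    static boolean matches(byte[] entity, String headerValue) {
        return digest(entity).equals(headerValue.trim());
    }
}
```

The sender would place digest(body) in the header, and the receiver would call matches(body, headerValue) after reading the data. The check detects accidental corruption only; it does not protect against deliberate tampering.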
Content-Range is sent along with a partial entity. It indicates the byte offsets of the first and last bytes of the part being sent, and the total length of the entity.
Example: Content-Range: bytes 1001-2000/5000
Content-Type indicates the MIME type of the entity being sent or received. Example: Content-Type: text/html
Date gives the date and time at which the HTTP message was sent. Example: Date: Mon, 10 Apr 2006 18:42:51 GMT
ETag is an entity header that assigns a unique identifier to the resource being sent.
For a resource that can be requested through multiple URLs, the ETag can be used to determine whether the resource actually sent is the same one.
Example: ETag: "208f-419e-30f8dc99"
Expires specifies when the entity ceases to be valid. Example: Expires: Mon, 05 Dec 2008 12:00:00 GMT
From is a request header that gives the e-mail address of the human user who controls the user agent. Example: From: webmaster@myweb.com
Host specifies the host name of the requested resource. This field is mandatory for requests that use HTTP/1.1. Example: Host: www.myweb.com
If-Modified-Since is included with a GET request to make the request conditional on the date when the resource was last modified.
If this header is present and the resource has not been modified since the specified date, the server should return a 304 response code.
Example: If-Modified-Since: Mon, 10 Apr 2006 18:42:51 GMT
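The server-side decision for a conditional GET can be sketched as follows: 304 when the resource has not changed since the header date, 200 with the full resource otherwise, which is how If-Modified-Since behaves in HTTP/1.1. The class name ConditionalGetSketch is hypothetical, and the sketch assumes the header uses the RFC 1123 date format.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

final class ConditionalGetSketch {
    /**
     * Chooses the response code for a GET request.
     * ifModifiedSince is the raw header value, or null if the header is absent.
     */
    static int status(ZonedDateTime lastModified, String ifModifiedSince) {
        if (ifModifiedSince == null) {
            return 200; // unconditional request: send the resource
        }
        ZonedDateTime since = ZonedDateTime.parse(
                ifModifiedSince.trim(), DateTimeFormatter.RFC_1123_DATE_TIME);
        // Changed after the client's copy: send it again. Otherwise: 304.
        return lastModified.isAfter(since) ? 200 : 304;
    }
}
```

A 304 response carries no body, so the client keeps using its cached copy and the transfer is avoided. Note that the formatter rejects a date whose day name does not match the calendar date.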
If-Match is included with a request to specify one or more entity tags. The operation is performed only if the resource's ETag matches one of the tags in the list.
Example: If-Match: "208f-419e-308dc99"
If-None-Match is included with a request to specify one or more entity tags. The operation is performed only if the resource's ETag matches none of the tags in the list.
Example: If-None-Match: "208f-419e-308dc99"
If-Range specifies an entity tag for a resource of which the client already holds a copy. It must be used together with the Range header.
If the entity has not been modified since the client last retrieved it, the server sends only the specified range; otherwise, it sends the entire resource.
Example: Range: bytes=0-499 together with If-Range: "208f-419e-30f8dc99"
If-Unmodified-Since causes the server to return the requested entity only if it has not been modified since the specified date.
Example: If-Unmodified-Since: Mon, 10 Apr 2006 18:42:51 GMT
Last-Modified specifies the date and time when the requested resource was last modified. Example: Last-Modified: Mon, 10 Apr 2006 18:42:51 GMT
Location is used to redirect the requester to the new location of a resource that has moved.
It is used with status code 302 (moved temporarily) or 301 (moved permanently).
Example: Location: http://www2.myweb.com/index.jsp
Max-Forwards is a request header used with the TRACE method to specify the maximum number of proxies or gateways through which the request may be routed.
Each proxy or gateway should decrement the number before passing the request on. Example: Max-Forwards: 3
Pragma is a general header for sending implementation-specific information. Example: Pragma: no-cache
Proxy-Authenticate is similar to WWW-Authenticate, but requests authentication only from the next server in the request chain (the proxy).
Example: Proxy-Authenticate: Basic realm="admin"
Proxy-Authorization is similar to Authorization, but is not meant to be passed any further than the immediate server in the chain.
Example: Proxy-Authorization: Basic YXV0aG9yOnBoaWw=
Public lists the set of methods supported by the server. Example: Public: OPTIONS, MGET, MHEAD, GET, HEAD
Range specifies a unit of measurement and the offset range of the requested part of a resource. Example: Range: bytes=206-5513
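Resolving a Range header against the actual size of a resource can be sketched as follows. The sketch handles a single byte range only, including the open-ended form (bytes=500-) and the suffix form (bytes=-500) that HTTP/1.1 also allows; the class name RangeSketch and the choice to return null for an unsatisfiable range are assumptions made for illustration.

```java
final class RangeSketch {
    /**
     * Returns {first, last} byte offsets for a single-range header, or null
     * if the range cannot be satisfied (a server would then answer 416).
     */
    static long[] resolve(String header, long length) {
        String v = header.replace(" ", "");
        if (!v.startsWith("bytes=")) {
            throw new IllegalArgumentException("unsupported range unit: " + header);
        }
        String spec = v.substring("bytes=".length());
        int dash = spec.indexOf('-');
        if (dash < 0 || spec.indexOf(',') >= 0) {
            throw new IllegalArgumentException("only a single range is supported");
        }
        String a = spec.substring(0, dash);
        String b = spec.substring(dash + 1);
        long first;
        long last;
        if (a.isEmpty()) {
            // Suffix form: the last N bytes of the resource.
            first = Math.max(0, length - Long.parseLong(b));
            last = length - 1;
        } else {
            first = Long.parseLong(a);
            // An open end or an end past the resource is clamped to the last byte.
            last = b.isEmpty() ? length - 1 : Math.min(Long.parseLong(b), length - 1);
        }
        return (first > last || first >= length) ? null : new long[] {first, last};
    }
}
```

For a 10000-byte resource, the example header above resolves to offsets 206 through 5513, which the server would report back in a Content-Range header with a 206 response.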
Referer is a request header that identifies the resource from which the request originated. For an HTML form, it contains the address of the Web page that holds the form.
Example: Referer: http://www.myweb.com/news/search.html
Retry-After is a response header that the server sends together with status code 503 (service unavailable) to indicate how long the client should wait before retrying the request.
The time can be given as a date or as a number of seconds. Example: Retry-After: 18
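Because the value may be either a number of seconds or a date, a client has to try both forms. A minimal sketch follows, assuming dates use the RFC 1123 format; the class name RetryAfterSketch is hypothetical, and the current time is passed in as a parameter so that the calculation is easy to check.

```java
import java.time.Duration;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

final class RetryAfterSketch {
    /** Returns how long to wait before retrying, never a negative duration. */
    static Duration delay(String value, ZonedDateTime now) {
        String v = value.trim();
        if (v.matches("\\d+")) {
            // Numeric form: a number of seconds.
            return Duration.ofSeconds(Long.parseLong(v));
        }
        // Date form: wait until the given point in time.
        ZonedDateTime when = ZonedDateTime.parse(v, DateTimeFormatter.RFC_1123_DATE_TIME);
        Duration d = Duration.between(now, when);
        return d.isNegative() ? Duration.ZERO : d;
    }
}
```

A caller would typically pass ZonedDateTime.now() and sleep for the returned duration before sending the request again.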
Server is a response header that identifies the Web server software and its version number. Example: Server: Apache/2.0.46 (Win32)
Transfer-Encoding is a general header that indicates what type of transformation has been applied to the message body, which the recipient must reverse. Example: Transfer-Encoding: chunked
Upgrade lets the server specify a new protocol or protocol version; it is used with response code 101 (switching protocols).
Example: Upgrade: HTTP/2.0
User-Agent identifies the type of software (such as a Web browser) that generated the request.
Example: User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT; DigExt)
Vary is a response header that indicates that the response entity was selected from the available representations using server-driven negotiation. Example: Vary: *
Via is a general header that lists all the intermediate hosts and protocols used to satisfy the request. Example: Via: 1.0 fred.com, 1.1 wilma.com
Warning is a response header that provides supplementary information about the response status. Example: Warning: 99 www.myweb.com Piano needs tuning
WWW-Authenticate is a response header that prompts the user agent to supply a user name and password. It is used with status code 401 (unauthorized), and the user agent answers with an Authorization header.
Example: WWW-Authenticate: Basic realm="zxm.mgmt"