HTTP protocol parsing for Android development

Source: Internet
Author: User
Tags md5 digest ranges response code rfc

HTTP Request Model


First, connect to the Web server. A client application (such as a Web browser) opens a socket to the Web server's HTTP port (80 by default).


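Example: http://www.myweb.com:8080/index.html
In Java, connecting to this address is equivalent to the following code:
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

Socket socket = new Socket("www.myweb.com", 8080);  // host and port taken from the URL
InputStream in = socket.getInputStream();           // the response is read from here
OutputStream out = socket.getOutputStream();        // the request is written here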
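How the example URL maps to the host and port passed to the socket can be sketched as follows. This is only an illustration under stated assumptions: the class `HttpTarget` and its fields are hypothetical names, not part of the original article, and the sketch relies on `java.net.URI` reporting a missing port as -1.

```java
import java.net.URI;

/** Hypothetical helper: splits an http:// URL into the pieces a socket connection needs. */
final class HttpTarget {
    final String host;
    final int port;
    final String path;

    private HttpTarget(String host, int port, String path) {
        this.host = host;
        this.port = port;
        this.path = path;
    }

    static HttpTarget fromUrl(String url) {
        URI uri = URI.create(url);
        // URI.getPort() returns -1 when the URL names no port; HTTP then defaults to 80.
        int port = uri.getPort() == -1 ? 80 : uri.getPort();
        // An empty path stands for the root resource "/".
        String rawPath = uri.getRawPath();
        String path = (rawPath == null || rawPath.isEmpty()) ? "/" : rawPath;
        return new HttpTarget(uri.getHost(), port, path);
    }

    public static void main(String[] args) {
        HttpTarget target = fromUrl("http://www.myweb.com:8080/index.html");
        System.out.println(target.host + " " + target.port + " " + target.path);
    }
}
```

The host and port would go to the `Socket` constructor, while the path becomes the request URI in the request line.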


Second, send an HTTP request. Over this connection the client writes an ASCII request line, followed by zero or more HTTP headers, a blank line, and any data that accompanies the request.
A request consists of four parts: the request line, the request headers, a blank line, and the request data.
1. Request line: The request line consists of three tokens separated by spaces: the request method, the request URI, and the HTTP version.
Example: GET /index.html HTTP/1.1
The HTTP specification defines eight request methods:
GET: a simple request to retrieve the resource identified by the URI
HEAD: the same as GET, except that the server returns only the status line and headers, not the requested document
POST: asks the server to accept the data that the client writes to its output stream
PUT: asks the server to store the request data as the new content of the specified URI
DELETE: asks the server to delete the resource named by the URI
OPTIONS: asks which request methods the server supports
TRACE: asks the Web server to echo back the HTTP request and its headers
CONNECT: a documented method that is reserved for tunneling through a proxy
2. Request headers: Each header is a name/value pair on its own line, with the name and the value separated by a colon (:).
Request headers tell the server about the capabilities and identity of the client. Typical request headers are:
User-Agent: the client's vendor and version
Accept: the list of content types that the client recognizes
Content-Length: the number of bytes of data attached to the request
3. Blank line: The last request header is followed by a blank line. The client sends a carriage return and line feed (CRLF) to tell the server that no more headers follow.
4. Request data: Data is most commonly sent with POST, together with the Content-Type and Content-Length headers that describe it.
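The four parts listed above can be assembled into one message roughly as follows. This is a minimal sketch, not code from the original article: `HttpRequestBuilder` is a hypothetical helper, and the `/search` URI and the form body in `main` are made-up values for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that assembles the four parts of an HTTP/1.1 request. */
final class HttpRequestBuilder {
    private static final String CRLF = "\r\n";

    static byte[] build(String method, String uri, Map<String, String> headers, byte[] body) {
        StringBuilder head = new StringBuilder();
        // 1. Request line: method, URI and version separated by single spaces.
        head.append(method).append(' ').append(uri).append(" HTTP/1.1").append(CRLF);
        // 2. Request headers: one "Name: value" pair per line.
        for (Map.Entry<String, String> header : headers.entrySet()) {
            head.append(header.getKey()).append(": ").append(header.getValue()).append(CRLF);
        }
        if (body.length > 0) {
            // Content-Length tells the server how many body bytes follow the blank line.
            head.append("Content-Length: ").append(body.length).append(CRLF);
        }
        // 3. Blank line: a bare CRLF ends the header section.
        head.append(CRLF);
        byte[] headBytes = head.toString().getBytes(StandardCharsets.US_ASCII);
        // 4. Request data: appended unchanged after the blank line.
        byte[] message = new byte[headBytes.length + body.length];
        System.arraycopy(headBytes, 0, message, 0, headBytes.length);
        System.arraycopy(body, 0, message, headBytes.length, body.length);
        return message;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", "www.myweb.com");
        headers.put("Content-Type", "application/x-www-form-urlencoded");
        byte[] body = "q=http".getBytes(StandardCharsets.US_ASCII);   // made-up form data
        byte[] request = build("POST", "/search", headers, body);
        System.out.print(new String(request, StandardCharsets.US_ASCII));
    }
}
```

The resulting byte array is what a client would write to the socket's output stream.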


Third, the server accepts the request and returns an HTTP response. The Web server parses the request and locates the specified resource. It then writes a copy of the resource to the socket, where the client reads it.
A response consists of four parts: the status line, the response headers, a blank line, and the response data.
1. Status line: The status line consists of three tokens: the HTTP version, the response code, and the response description.
HTTP version: tells the client the highest HTTP version that the server understands.
Response code: a three-digit number that indicates whether the request succeeded and, if it failed, why.
Response description: a human-readable explanation of the response code.
Example: HTTP/1.1 200 OK
HTTP response codes:
1XX: Informational. The request was received and processing continues.
2XX: Success. The action was successfully received, understood, and accepted.
3XX: Redirection. Further action must be taken to complete the request.
4XX: Client error. The request contains bad syntax or cannot be fulfilled.
5XX: Server error. The server failed to fulfill an apparently valid request.
2. Response headers: Like request headers, they describe the capabilities of the server and identify details of the response data.
3. Blank line: The last response header is followed by a blank line. The server sends a carriage return and line feed (CRLF) to show that no more headers follow.
4. Response data: the requested resource itself, such as an HTML document or an image.
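The four parts of a response can be separated with a few string operations. The sketch below assumes the whole response is already in memory as a byte array; a real client reading from the socket would also need Content-Length (or a similar signal) to know where the body ends. `HttpResponseParts` is a hypothetical name chosen for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Hypothetical holder for the four parts of an HTTP response. */
final class HttpResponseParts {
    final String version;
    final int code;
    final String description;
    final List<String> headerLines;
    final byte[] body;

    private HttpResponseParts(String version, int code, String description,
                              List<String> headerLines, byte[] body) {
        this.version = version;
        this.code = code;
        this.description = description;
        this.headerLines = headerLines;
        this.body = body;
    }

    static HttpResponseParts parse(byte[] raw) {
        // ISO-8859-1 maps each byte to one char, so string indexes equal byte offsets.
        String text = new String(raw, StandardCharsets.ISO_8859_1);
        // The blank line is the first CRLF that directly follows another CRLF.
        int blank = text.indexOf("\r\n\r\n");
        if (blank < 0) {
            throw new IllegalArgumentException("no blank line after the headers");
        }
        String[] lines = text.substring(0, blank).split("\r\n");
        // Status line: version, code, description (the description may contain spaces).
        String[] status = lines[0].split(" ", 3);
        if (status.length < 2) {
            throw new IllegalArgumentException("malformed status line: " + lines[0]);
        }
        String description = status.length == 3 ? status[2] : "";
        List<String> headerLines = Arrays.asList(lines).subList(1, lines.length);
        // Everything after the blank line is response data.
        byte[] body = Arrays.copyOfRange(raw, blank + 4, raw.length);
        return new HttpResponseParts(status[0], Integer.parseInt(status[1]),
                description, headerLines, body);
    }

    public static void main(String[] args) {
        byte[] raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<HTML></HTML>"
                .getBytes(StandardCharsets.ISO_8859_1);
        HttpResponseParts response = parse(raw);
        System.out.println(response.code + " " + response.description
                + ", " + response.body.length + " body bytes");
    }
}
```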


Fourth, the server closes the connection and the browser parses the response.
1. The browser first parses the status line and checks the status code to see whether the request succeeded.
2. It then parses each response header. The headers describe the bytes of HTML that follow.
3. It reads the response data (the HTML), formats it according to the syntax and semantics of HTML, and displays it in the browser window.
4. An HTML document may reference other resources that must be loaded. The browser recognizes these references and issues additional requests for them, so the process repeats several times.
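The last step, finding the references that trigger further requests, might be approximated like this. It is a deliberately simplified sketch: a real browser uses a full HTML parser and distinguishes resources it must load from links it merely displays, whereas this hypothetical `ResourceReferences` helper just collects every double-quoted `href` or `src` value with a regular expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: collects href/src attribute values from an HTML string. */
final class ResourceReferences {
    // Matches href="..." or src="...", allowing spaces around '='. Simplified on purpose.
    private static final Pattern REFERENCE =
            Pattern.compile("\\b(?:href|src)\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    static List<String> find(String html) {
        List<String> found = new ArrayList<>();
        Matcher matcher = REFERENCE.matcher(html);
        while (matcher.find()) {
            found.add(matcher.group(1));   // group 1 is the text between the quotes
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<HEAD><link rel= \"stylesheet\" href= \"index.css\" ></HEAD>";
        for (String reference : find(html)) {
            System.out.println("would request: " + reference);
        }
    }
}
```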


Fifth, stateless connections. The HTTP model is stateless: when the Web server processes a request, it does not remember earlier requests from the same client.


Sixth, an example.
1. The browser makes a request:
GET /index.html HTTP/1.1
The server returns a response:
HTTP/1.1 200 OK
Date: Apr 2006 15:32:08 GMT
Server: Apache/2.0.46 (Win32)
Content-Length: 119
Content-Type: text/html


<HTML>
<HEAD>
<link rel="stylesheet" href="index.css">
</HEAD>
<BODY>

</BODY>
</HTML>


2. The browser makes a request:
GET /index.css HTTP/1.1
The server returns a response:
HTTP/1.1 200 OK
Date: Apr 2006 15:32:08 GMT
Server: Apache/2.0.46 (Win32)
Connection: Keep-Alive, close
Content-Length: 70
Content-Type: text/css


h3 {
font-size: 20px;
font-weight: bold;
color: #005A9C;
}


3. The browser makes a request:
GET /image/logo.png HTTP/1.1
The server returns a response:
HTTP/1.1 200 OK
Date: Apr 2006 15:32:08 GMT
Server: Apache/2.0.46 (Win32)
Connection: Keep-Alive, close
Content-Length: 1280
Content-Type: image/png


{Binary image data follows}




Appendix
1. HTTP specification: The Internet Engineering Task Force (IETF) publishes Internet standards as RFCs, which are widely accepted by the Internet research and development community. Because they are standards documents, they are written in formal language, much like legal texts.
2. RFC: Once an RFC is published, it is numbered and never changes. When a standard is revised, a new RFC is issued. As standards, RFCs are widely used on the Internet.
3. Several important RFCs for HTTP:
RFC 1945: HTTP 1.0 specification
RFC 2068: initial HTTP 1.1 specification
RFC 2616: HTTP 1.1 standard
4. Resource identifier: URI (Uniform Resource Identifier)




HTTP reference


HTTP codes: A response code consists of three decimal digits and appears in the first line of the response sent by the HTTP server.


Response codes fall into five classes, identified by their first digit:
1. 1XX: Informational. The request was received and processing continues.
2. 2XX: Success. The action was successfully received, understood, and accepted.
3. 3XX: Redirection. Further action must be taken to complete the request.
4. 4XX: Client error. The request contains bad syntax or cannot be fulfilled.
5. 5XX: Server error. The server failed to fulfill an apparently valid request.
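Because the class is determined by the first digit alone, classifying a code takes one integer division. The following is a small sketch; `StatusClass` and its `Kind` enum are hypothetical names introduced here for illustration.

```java
/** Hypothetical helper that maps a response code to its class by the first digit. */
final class StatusClass {
    enum Kind { INFORMATIONAL, SUCCESS, REDIRECTION, CLIENT_ERROR, SERVER_ERROR }

    static Kind of(int code) {
        // For a three-digit code, integer division by 100 leaves only the first digit.
        switch (code / 100) {
            case 1: return Kind.INFORMATIONAL;
            case 2: return Kind.SUCCESS;
            case 3: return Kind.REDIRECTION;
            case 4: return Kind.CLIENT_ERROR;
            case 5: return Kind.SERVER_ERROR;
            default:
                throw new IllegalArgumentException("not an HTTP response code: " + code);
        }
    }

    public static void main(String[] args) {
        System.out.println(of(200));
        System.out.println(of(404));
    }
}
```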


The following table lists each response code and its meaning:
100 Continue
101 Switching Protocols
200 OK
201 Created
202 Accepted
203 Non-Authoritative Information
204 No Content
205 Reset Content
206 Partial Content
300 Multiple Choices
301 Moved Permanently
302 Found
303 See Other
304 Not Modified
305 Use Proxy
307 Temporary Redirect
400 Bad Request
401 Unauthorized
402 Payment Required
403 Forbidden
404 Not Found
405 Method Not Allowed
406 Not Acceptable
407 Proxy Authentication Required
408 Request Timeout
409 Conflict
410 Gone
411 Length Required
412 Precondition Failed
413 Request Entity Too Large
414 Request-URI Too Long
415 Unsupported Media Type
416 Requested Range Not Satisfiable
417 Expectation Failed
500 Internal Server Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout
505 HTTP Version Not Supported
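In Java and Android code, many of these codes are available as named constants on `java.net.HttpURLConnection`, which reads better than bare numbers. The sketch below shows one way a client might decide what to do next; the class `ResponseAction` and the action strings are hypothetical, and 307 is written as a literal.

```java
import java.net.HttpURLConnection;

/** Hypothetical helper: picks a client-side follow-up for a few common response codes. */
final class ResponseAction {
    static String forCode(int code) {
        switch (code) {
            case HttpURLConnection.HTTP_OK:            // 200
            case HttpURLConnection.HTTP_PARTIAL:       // 206
                return "read the body";
            case HttpURLConnection.HTTP_MOVED_PERM:    // 301
            case HttpURLConnection.HTTP_MOVED_TEMP:    // 302
            case HttpURLConnection.HTTP_SEE_OTHER:     // 303
            case 307:                                  // Temporary Redirect, as a literal
                return "follow the Location header";
            case HttpURLConnection.HTTP_NOT_MODIFIED:  // 304
                return "reuse the cached copy";
            case HttpURLConnection.HTTP_UNAUTHORIZED:  // 401
                return "retry with an Authorization header";
            case HttpURLConnection.HTTP_UNAVAILABLE:   // 503
                return "retry after the Retry-After delay";
            default:
                return "report the status to the caller";
        }
    }

    public static void main(String[] args) {
        System.out.println(forCode(302));
        System.out.println(forCode(404));
    }
}
```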


HTTP headers: A header consists of a name/value pair. Headers describe the properties of the client or server, the resource being transferred, and how the connection should be handled.


There are four types of headers:
1. General header: can be used in a request or a response, and relates to the transaction as a whole rather than to a specific resource.
2. Request header: lets the client pass information about itself and the form of response it wants.
3. Response header: lets the server pass information about itself and about the response.
4. Entity header: describes the resource being transferred. It can be used in a request or a response.


Header format: <name>: <value><CRLF>
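Given this format, header lines can be split at the first colon. The sketch below is an assumption-laden illustration rather than a complete parser: `HeaderLines` is a hypothetical name, the lines are expected to arrive without their trailing CRLF, names are treated as case-insensitive, and repeated headers are joined with commas (which suits list-valued headers such as Accept but not every header).

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: turns "Name: value" lines into a case-insensitive map. */
final class HeaderLines {
    static Map<String, String> parse(Iterable<String> lines) {
        // Header names are case-insensitive, so the map ignores case on lookup.
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String line : lines) {
            // Split at the first colon only; values such as URLs may contain more colons.
            int colon = line.indexOf(':');
            if (colon <= 0) {
                throw new IllegalArgumentException("not a header line: " + line);
            }
            String name = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            // A repeated header is folded into one comma-separated value.
            headers.merge(name, value, (first, second) -> first + ", " + second);
        }
        return headers;
    }

    public static void main(String[] args) {
        Map<String, String> headers = parse(Arrays.asList(
                "Content-Type: text/html",
                "content-length:119"));
        System.out.println(headers.get("Content-Length"));
    }
}
```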


The following table describes the headers used in HTTP/1.1:
Accept: defines the media types that the client can handle, listed by priority.
Several types can be given in a comma-separated list, and wildcards are allowed. Example: Accept: image/jpeg, image/png, */*
Accept-Charset: defines the character sets that the client can handle, listed by priority.
Several character sets can be given in a comma-separated list, and wildcards are allowed. Example: Accept-Charset: iso-8859-1, *, utf-8
Accept-Encoding: defines the encodings that the client understands. Example: Accept-Encoding: gzip, compress
Accept-Language: defines the list of natural languages that the client is willing to accept. Example: Accept-Language: en, de
Accept-Ranges: a response header with which the server indicates that it accepts requests for part of a resource at a given offset and length.
The value of the header is the unit in which ranges are measured. Examples: Accept-Ranges: bytes or Accept-Ranges: none
Age: lets the server state how many seconds have passed since it generated the response.
The header is used mainly with cached responses. Example: Age: 30
Allow: a response header that lists the HTTP methods supported by the resource at the request URI. Example: Allow: GET, PUT
Authorization: a request header that carries the credentials (such as an encoded user ID and password) needed to access a resource.
Example: Authorization: Basic YXV0aG9yOnBoaWw=
Cache-Control: a general header that defines caching directives. Example: Cache-Control: max-age=30
Connection: a general header that indicates whether the socket connection is kept open. Example: Connection: close or Connection: Keep-Alive
Content-Base: an entity header that defines a base URI for resolving relative URLs within the entity.
If no Content-Base header is defined, relative URLs are resolved against the Content-Location URI (if present and absolute) or against the request URI.
Example: Content-Base: http://www.myweb.com
Content-Encoding: a modifier of the media type that indicates how the entity is encoded. Example: Content-Encoding: gzip
Content-Language: specifies the natural language of the data in the entity. Example: Content-Language: en
Content-Length: specifies the length in bytes of the request or response data. Example: Content-Length: 382
Content-Location: specifies the location (URI) of the resource contained in the request or response.
If it is an absolute URL, it also serves as the starting point for resolving relative URLs in the entity.
Example: Content-Location: http://www.myweb.com/news
Content-MD5: an MD5 digest of the entity, used as a checksum.
Both the sender and the receiver compute the MD5 digest, and the receiver compares the value it computes with the value passed in this header.
Example: Content-MD5: <base64 of MD5 digest>
Content-Range: sent with a partial entity to give the low and high byte offsets of the part being sent, as well as the total length of the entity.
Example: Content-Range: bytes 1001-2000/5000
Content-Type: indicates the MIME type of the entity being sent or received. Example: Content-Type: text/html
Date: the date and time at which the HTTP message was sent. Example: Date: Mon, 10 Apr 18:42:51 GMT
ETag: an entity header that assigns a unique identifier to the resource being sent.
For resources that can be requested through several URLs, the ETag can be used to determine whether the resource actually being sent is the same one.
Example: ETag: "208f-419e-30f8dc99"
Expires: specifies when the entity expires. Example: Expires: Mon, 05 Dec 12:00:00 GMT
From: a request header that gives the e-mail address of the human user who controls the user agent. Example: From: [email protected]
Host: the host name of the resource being requested. This header is mandatory in HTTP/1.1 requests. Example: Host: www.myweb.com
If-Modified-Since: when included in a GET request, makes the request conditional on the last-modified date of the resource.
If this header is present and the resource has not been modified since the specified date, the server should return a 304 response code.
Example: If-Modified-Since: Mon, 10 Apr 18:42:51 GMT
If-Match: when included in a request, specifies one or more entity tags. The resource is sent only if its ETag matches one of the tags in the list.
Example: If-Match: "208f-419e-308dc99"
If-None-Match: when included in a request, specifies one or more entity tags. The operation is performed only if the resource's ETag matches none of the tags in the list.
Example: If-None-Match: "208f-419e-308dc99"
If-Range: specifies an entity tag for a resource of which the client already has a copy. It must be used together with the Range header.
If the entity has not been modified since the client last retrieved it, the server sends only the specified range; otherwise it sends the entire resource.
Example: Range: bytes=0-499<CRLF>If-Range: "208f-419e-30f8dc99"
If-Unmodified-Since: the entity is returned only if it has not been modified since the specified date.
Example: If-Unmodified-Since: Mon, 10 Apr 18:42:51 GMT
Last-Modified: specifies the date and time at which the requested resource was last modified. Example: Last-Modified: Mon, 10 Apr 18:42:51 GMT
Location: redirects the requester to another location for a resource that has moved.
It is used together with status code 302 (temporarily moved) or 301 (permanently moved).
Example: Location: http://www2.myweb.com/index.jsp
Max-Forwards: a request header used with the TRACE method to specify the maximum number of proxies or gateways through which the request may be routed.
Each proxy or gateway should decrement this number before passing the request on. Example: Max-Forwards: 3
Pragma: a general header that carries implementation-specific information. Example: Pragma: no-cache
Proxy-Authenticate: similar to WWW-Authenticate, but requests authentication only from the next server in the request chain (the proxy).
Example: Proxy-Authenticate: Basic realm="admin"
Proxy-Authorization: similar to Authorization, but is not meant to be passed any further than the next server in the chain.
Example: Proxy-Authorization: Basic YXV0aG9yOnBoaWw=
Public: lists the set of methods supported by the server. Example: Public: OPTIONS, MGET, MHEAD, GET, HEAD
Range: specifies a unit of measure and the offset range of the part of the resource being requested. Example: Range: bytes=206-5513
Referer: a request header that identifies the resource from which the request originated. For an HTML form, it contains the address of the Web page that holds the form.
Example: Referer: http://www.myweb.com/news/search.html
Retry-After: a response header that the server sends together with status code 503 (Service Unavailable) to indicate how long to wait before repeating the request.
The time can be a date or a number of seconds. Example: Retry-After: 18
Server: a header that identifies the Web server software and its version number. Example: Server: Apache/2.0.46 (Win32)
Transfer-Encoding: a general header that indicates the type of transformation applied to the message body, which the recipient must reverse. Example: Transfer-Encoding: chunked
Upgrade: lets the server specify a new protocol or a new protocol version. It is used together with response code 101 (Switching Protocols).
Example: Upgrade: HTTP/2.0
User-Agent: identifies the software that generated the request (typically a Web browser).
Example: User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT; DigExt)
Vary: a response header that indicates that the response entity was selected from the available representations by server-driven negotiation. Example: Vary: *
Via: a general header that lists all the intermediate hosts and protocols used to satisfy the request. Example: Via: 1.0 fred.com, 1.1 wilma.com
Warning: a response header that carries supplemental information about the response status. Example: Warning: 99 www.myweb.com Piano needs tuning
WWW-Authenticate: a response header that prompts the user agent to supply a user name and password. It is used together with status code 401 (Unauthorized), and the client answers it with an Authorization header.
Example: WWW-Authenticate: Basic realm="zxm.mgmt"
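A few of the header values in the table are computed rather than typed, so a short sketch may help. The helper below shows how a Basic Authorization value, a Content-MD5 value, and the numbers inside a Content-Range value might be produced or read. `HeaderValues` and its method names are hypothetical; the sketch assumes `java.util.Base64` (on Android, available from API level 26) and the RFC 2616 form `bytes first-last/total` for Content-Range.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helpers for three computed header values. */
final class HeaderValues {
    private static final Pattern CONTENT_RANGE = Pattern.compile("bytes (\\d+)-(\\d+)/(\\d+)");

    /** Authorization value for the Basic scheme: base64 of "user:password". */
    static String basicAuthorization(String user, String password) {
        byte[] pair = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(pair);
    }

    /** Content-MD5 value: base64 of the 16-byte MD5 digest of the entity body. */
    static String contentMd5(byte[] body) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(body);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Reads "bytes first-last/total" and returns {first, last, total}. */
    static long[] parseContentRange(String value) {
        Matcher matcher = CONTENT_RANGE.matcher(value.trim());
        if (!matcher.matches()) {
            throw new IllegalArgumentException("unsupported Content-Range: " + value);
        }
        return new long[] {
                Long.parseLong(matcher.group(1)),
                Long.parseLong(matcher.group(2)),
                Long.parseLong(matcher.group(3))
        };
    }

    public static void main(String[] args) {
        System.out.println("Authorization: " + basicAuthorization("author", "phil"));
        System.out.println("Content-MD5: " + contentMd5("hello".getBytes(StandardCharsets.UTF_8)));
        long[] range = parseContentRange("bytes 1001-2000/5000");
        System.out.println((range[1] - range[0] + 1) + " of " + range[2] + " bytes");
    }
}
```

A receiver would run `contentMd5` over the body it actually received and compare the result with the header value, as the Content-MD5 entry describes.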
