HTTP protocol (2)--------Network programming

Source: Internet
Author: User
Tags response code send cookies time and date browser cache

1. HTTP request Format

Anyone who has done socket programming knows that splitting a message into a header and a body is a very common way to design a communication protocol: the header tells the other side what the message is for, and the body carries the data needed to do it. HTTP follows the same pattern. Every HTTP message is divided into an HTTP header and an HTTP body; the header is required and the body is optional. When we open a web page, right-click it, and choose "View Source", the HTML we see is the HTTP message body. The message header can be inspected with the browser's developer tools or with a plug-in such as Firebug for Firefox or HttpWatch for IE.

A client asks the server for a resource by sending an HTTP request. The request is a block of data made up of three parts: the request line, the request headers, and the request body.

Request line: Method URI Protocol/Version

Request headers

Request body

The following is the data for an HTTP request:

POST /index.php HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:10.0.2) Gecko/20100101 Firefox/10.0.2
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-cn,zh;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Referer: http://localhost/
Content-Length: 25
Content-Type: application/x-www-form-urlencoded

username=aa&password=1234

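As a rough illustration of this three-part layout, the sketch below splits a raw request into its request line, headers, and body. The function name parse_http_request is hypothetical, the request is an abbreviated version of the one above, and the code ignores edge cases such as repeated header fields.

```python
def parse_http_request(raw: bytes):
    """Split a raw HTTP request into (method, uri, version, headers, body)."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # The request line has three parts separated by single spaces.
    method, uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Field names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers, body


raw_request = (
    b"POST /index.php HTTP/1.1\r\n"
    b"Host: localhost\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 25\r\n"
    b"\r\n"
    b"username=aa&password=1234"
)

method, uri, version, headers, body = parse_http_request(raw_request)
print(method, uri, version)
print(headers["content-length"], len(body))
```

The Content-Length value and the actual length of the body should agree, which is a quick sanity check when building requests by hand.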
1. Request line: Method URI Protocol/Version

The first line of a request has the form "Method URI Protocol/Version". The three parts are separated by single spaces, and the line ends with a carriage return and line feed (CRLF). For example:

POST /index.php HTTP/1.1

In the line above, "POST" is the request method, "/index.php" is the URI, and "HTTP/1.1" is the protocol and its version. The HTTP standard defines several request methods; HTTP/1.1 defines eight: GET, POST, HEAD, OPTIONS, PUT, DELETE, TRACE, and CONNECT. In Internet applications, the most common are GET and POST. The URI identifies the network resource to be accessed; it is usually a path relative to the root of the server and therefore begins with "/". The last part declares the version of HTTP used for the communication.

Request method

In the HTTP protocol, the request method indicates how the resource identified by the Request-URI is to be accessed. The request methods defined by HTTP/1.1 are shown in the following table:

Request methods in HTTP/1.1:

Method    Role
GET       Requests the resource identified by the Request-URI
POST      Asks the server to accept the entity enclosed in the request as subordinate to the resource identified by the Request-URI in the Request-Line
HEAD      Requests only the response message header for the resource identified by the Request-URI
PUT       Asks the server to store the enclosed resource, using the Request-URI as its identifier
DELETE    Asks the server to delete the resource identified by the Request-URI
TRACE     Asks the server to echo back the request it received; mainly used for testing or diagnostics
CONNECT   Reserved for use with a proxy that can switch to acting as a tunnel
OPTIONS   Queries the capabilities of the server, or the options and requirements associated with a resource

We focus on three methods: GET, POST, and HEAD.

(1) GET

The GET method retrieves the resource identified by the Request-URI. Its common form is:

GET Request-URI HTTP/1.1

GET is the default HTTP request method. For example, when we enter a URL directly in the browser's address bar, the browser uses GET to fetch the resource from the server.

Form data can also be submitted with GET. The data is only lightly encoded and is sent to the server as part of the URL, so submitting form data with GET carries a security risk. For example:
http://localhost/login.php?username=aa&password=1234

From the URL above it is easy to read exactly what the form submitted. In addition, because the data submitted with GET is part of the URL, the amount of data cannot be too large.
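To make the point concrete, here is a minimal Python sketch of how form fields end up in a GET URL and how anyone who sees the URL can read them back. It uses only the standard urllib.parse module; the form values are the ones from the example above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Form fields are encoded as name=value pairs joined with "&".
form = {"username": "aa", "password": "1234"}
url = "http://localhost/login.php?" + urlencode(form)
print(url)

# Everything after "?" is the query string; it can be decoded by anyone
# who sees the URL (address bar, browser history, server logs).
query = urlsplit(url).query
print(parse_qs(query))
```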

Browsers limit the length of a URL. Commonly quoted limits for several browsers are (unit: characters):

IE: 2083
Firefox: 65536
Chrome: 8182
Safari: 80000
Opera: 190000

(2) POST

The POST method is an alternative to GET. It is mainly used to submit form data to the web server, especially large amounts of data. The submitted data follows the request headers, after the blank line (two consecutive CRLF sequences) that ends them. In the request shown earlier, the posted form data is:

username=aa&password=1234

POST overcomes some of the drawbacks of GET. The data is not sent as part of the URL but in the body of the request, so it does not appear in the address bar and is not subject to URL length limits. For this reason, and out of respect for user privacy, forms are usually submitted with POST. Note that the body is still sent in clear text unless the connection itself is encrypted (HTTPS).
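The following sketch assembles such a POST request by hand so the position of the form data is visible. The helper name build_post_request is hypothetical, and only the headers needed for the illustration are included.

```python
from urllib.parse import urlencode


def build_post_request(host: str, path: str, form: dict) -> bytes:
    """Build a minimal form-encoded POST request as raw bytes."""
    body = urlencode(form).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # the blank line that ends the headers
    )
    return head.encode("ascii") + body


request = build_post_request(
    "localhost", "/index.php", {"username": "aa", "password": "1234"}
)
print(request.decode("ascii"))
```

The URL in the request line stays the same no matter what the form contains; only the body and the Content-Length value change.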

From a programming point of view (for example in a CGI program), data submitted with GET is available in the QUERY_STRING environment variable, while data submitted with POST is read from the standard input stream.
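A minimal sketch of this, assuming a CGI-style environment: the function read_form_data is hypothetical, and REQUEST_METHOD and CONTENT_LENGTH are standard CGI variables that the text above does not mention. A dictionary and an in-memory stream stand in for the real environment and standard input so the example can run anywhere.

```python
import io
from urllib.parse import parse_qs


def read_form_data(environ: dict, stdin) -> dict:
    """Return submitted form fields for a CGI-style GET or POST request."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        # POST: the data arrives on standard input; CONTENT_LENGTH says how much.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii")
    else:
        # GET: the data is the part of the URL after "?".
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "username=aa&password=1234"}
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "25"}

print(read_form_data(get_env, io.BytesIO(b"")))
print(read_form_data(post_env, io.BytesIO(b"username=aa&password=1234")))
```

In a real CGI script the arguments would be os.environ and sys.stdin.buffer.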

The GET and POST methods differ as follows:

1. On the client, GET submits data through the URL, so the data is visible in the URL; POST places the data in the body of the HTTP message.

2. The amount of data that GET can submit is limited (because browsers limit the length of a URL); POST has no such limit.

3. Security. As mentioned in (1), with GET the parameters are shown in the address bar, while with POST they are not. So GET is acceptable for non-sensitive data, and data that contains sensitive information is better sent with POST.

4. The server reads the values differently. In PHP, for example, values submitted with GET are read from $_GET, and values submitted with POST are read from $_POST.

(3) HEAD

The HEAD method is almost identical to GET, except that the server returns only the message header, not the content itself. The header information in the response to a HEAD request is the same as it would be for a GET request. HEAD therefore lets you obtain information about the resource identified by the Request-URI without transferring the whole resource. It is often used to test whether a hyperlink is valid, whether it can be reached, and whether the resource has been updated recently.

Note that in an HTML document, get and post may be written in either upper or lower case, but in the HTTP protocol the method names GET and POST must be uppercase.

2. Request Header

Each header field consists of three parts: a field name, a colon (:), and a field value. Field names are case-insensitive, any amount of whitespace may precede the field value, and a header field can be continued over multiple lines by starting each continuation line with at least one space or tab.
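These rules can be sketched in a few lines of Python. The function parse_header_lines is hypothetical and handles only what is described above: case-insensitive names, optional whitespace before the value, and continuation lines. Later revisions of the HTTP specification deprecate multi-line header fields, so treat the continuation handling as historical.

```python
def parse_header_lines(lines):
    """Parse header lines into a dict keyed by lower-cased field name."""
    headers = {}
    last = None
    for line in lines:
        if line[:1] in (" ", "\t") and last is not None:
            # A line starting with a space or tab continues the previous field.
            headers[last] += " " + line.strip()
            continue
        name, _, value = line.partition(":")
        last = name.strip().lower()      # field names are case-insensitive
        headers[last] = value.strip()    # leading whitespace is not significant
    return headers


example = [
    "HOST:     localhost",
    "User-Agent: Mozilla/5.0",
    "\t(Windows NT 5.1; rv:10.0.2)",
    "accept-language:zh-cn",
]
parsed = parse_header_lines(example)
for name, value in parsed.items():
    print(name, "=>", value)
```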

The most common request headers for HTTP are as follows:

Transport header field

Connection:

Role: Indicates whether a persistent connection is required.

If the server sees the value "keep-alive" here, or sees that the request uses HTTP/1.1 (where persistent connections are the default), it can keep the connection open. When a page contains many elements (such as applets or images), this significantly reduces the time needed to download them. To do this, the server needs to send a Content-Length header in its response. The simplest implementation is to write the content to a ByteArrayOutputStream first and compute its size before actually sending it.

Example: Connection: keep-alive. After a web page has been loaded, the TCP connection used to transfer HTTP data between client and server is not closed; if the client requests another page from this server, it reuses the established connection.

Example: Connection: close. After a request completes, the TCP connection used to transfer HTTP data between client and server is closed; the client must establish a new TCP connection when it sends its next request.
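The decision a server makes here can be sketched as a small function. The name is_persistent is hypothetical, and the headers dictionary is assumed to use lower-cased field names; real servers also consider timeouts and request limits.

```python
def is_persistent(version: str, headers: dict) -> bool:
    """Decide whether the connection should stay open after this request."""
    connection = headers.get("connection", "").strip().lower()
    if connection == "close":
        return False          # the client asked to close the connection
    if connection == "keep-alive":
        return True           # the client asked to keep it open
    # No explicit wish: HTTP/1.1 defaults to persistent connections.
    return version == "HTTP/1.1"


print(is_persistent("HTTP/1.1", {}))
print(is_persistent("HTTP/1.1", {"connection": "close"}))
print(is_persistent("HTTP/1.0", {"connection": "Keep-Alive"}))
print(is_persistent("HTTP/1.0", {}))
```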

Host (this header field is required in HTTP/1.1 requests)

The Host request header field specifies the Internet host and port number of the requested resource. It is normally extracted from the HTTP URL.

For example, for the URL http://localhost/index.html, the request message sent by the browser contains the following Host header field:
Host: localhost

The default port number 80 is used here. If port 8080 is specified, the field becomes: Host: localhost:8080
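A short sketch of how the Host value can be derived from a URL, assuming plain http URLs whose default port is 80. The helper name host_header is hypothetical.

```python
from urllib.parse import urlsplit


def host_header(url: str) -> str:
    """Build the Host header line for an http:// URL."""
    parts = urlsplit(url)
    if parts.port is None or parts.port == 80:
        # The default port is left out of the header.
        return f"Host: {parts.hostname}"
    return f"Host: {parts.hostname}:{parts.port}"


print(host_header("http://localhost/index.html"))
print(host_header("http://localhost:8080/index.html"))
```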

Client header fields

Accept:

Role: The media types (MIME types) that the browser can accept.

Example: Accept: text/html means the browser accepts responses of type text/html, which is what we usually call an HTML document. If the server cannot return data of type text/html, it should return a 406 error (Not Acceptable).

The wildcard * stands for any type. For example, Accept: */* means the browser can handle all types (browsers generally include this in what they send to the server).

Accept-Encoding:

Role: The browser declares which content encodings it can receive. This normally means compression: whether compression is supported and which methods (gzip, deflate). Note that this is not the character encoding.

Example: Accept-Encoding: gzip, deflate. The server can return a gzip- or deflate-encoded HTML page to a browser that supports these encodings. In many cases this cuts download time by a factor of 5 to 10 and also saves bandwidth.
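The sketch below shows the idea with Python's standard gzip module: the server compresses the body only if the client listed gzip. The function encode_body is hypothetical, the sample HTML is made up, and quality values such as q=0 are ignored for brevity. The actual saving depends heavily on the content.

```python
import gzip


def encode_body(body: bytes, accept_encoding: str):
    """Return (body, content_encoding), compressing only if gzip is offered."""
    offered = [
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    ]
    if "gzip" in offered:
        return gzip.compress(body), "gzip"
    return body, None  # send the body unchanged


# Repetitive markup compresses well; real pages vary.
html = b"<p>Hello http!</p>\n" * 500
encoded, content_encoding = encode_body(html, "gzip, deflate")
print(len(html), len(encoded), content_encoding)
```

When the body is compressed, the server would announce it with a Content-Encoding: gzip response header so that the browser knows to decompress it.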

Accept-Language:

Role: The browser declares which languages it accepts.

The difference between a language and a character set: Chinese is a language, and Chinese can be represented in several character sets, such as Big5, GB2312, and GBK.

Example: Accept-Language: zh-cn. If this header field is not set in the request, the server assumes that every language is acceptable to the client.

User-Agent:

Role: Tells the HTTP server the name and version of the operating system and browser the client is using.

When we visit a forum we often see a welcome message that lists the name and version of our operating system and browser, which many people find rather magical. In fact, the server application simply reads this information from the User-Agent request header field, which lets the client tell the server about its operating system, browser, and other properties.

Example: User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; CIBA; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; InfoPath.2; .NET4.0E)

Accept-Charset:

Role: The browser declares which character sets it accepts, such as gb2312 and utf-8 (when we say charset we usually also mean the corresponding character encoding scheme).

Example: Accept-Charset: iso-8859-1,gb2312. If this field is not set in the request, any character set is acceptable by default.

Authorization:

Role: Carries authorization credentials. It is usually sent in reply to a WWW-Authenticate header received from the server.

The Authorization request header field is used to prove that the client has permission to view a resource. When a browser requests a page and receives a 401 (Unauthorized) response from the server, it can send a new request containing the Authorization header field and ask the server to validate it.

Cookie/login header field

Cookie:

Role: One of the most important headers; it sends the value of the cookie to the HTTP server.

Entity header fields

Content-Length:

Role: The length of the data sent to the HTTP server, that is, the length of the request message body in bytes.

Example: Content-Length: 38

Content-Type:

Role: The media type of the request message body.

Example: Content-Type: application/x-www-form-urlencoded

Miscellaneous header field

Referer:

Role: Gives the server the context of the request by telling it which page the link was followed from. For example, if a visitor follows a link from my home page to a friend's site, his server can use the HTTP Referer to count how many users arrive each day by clicking the link on my page.

Example: Referer: http://translate.google.cn/?hl=zh-cn&tab=wT

Cache header fields

If-Modified-Since:

Role: Sends the last modification time of the browser's cached copy to the server, which compares it with the last modification time of the actual file. If the times match, the server returns 304 and the client uses its local cached file directly. If they differ, the server returns 200 and the new file contents; the client then discards the old file, caches the new one, and displays it in the browser.

Example: If-Modified-Since: Thu, 09:07:57 GMT
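The comparison can be sketched with the date helpers in Python's standard email.utils module. The function conditional_status and the modification time are hypothetical. As an assumption that differs slightly from the description above, the sketch returns 304 whenever the file is not newer than the client's copy, rather than only when the two times are exactly equal.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(if_modified_since: str, last_modified: datetime) -> int:
    """Return 304 if the client's copy is still current, otherwise 200."""
    client_time = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304
    return 200


# Hypothetical modification time of the file on the server.
file_mtime = datetime(2012, 3, 1, 9, 7, 57, tzinfo=timezone.utc)
# The value the browser would send back after caching that version.
header_value = format_datetime(file_mtime, usegmt=True)

print(header_value)
print(conditional_status(header_value, file_mtime))
print(conditional_status(header_value, file_mtime + timedelta(minutes=1)))
```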

If-None-Match:

Role: If-None-Match works together with ETag. The server adds an ETag to the HTTP response; when the user requests the resource again, the browser puts that ETag value into the If-None-Match header of the request. If the server finds that the resource's ETag has not changed (the resource has not been updated), it returns a 304 status telling the client to use its local cached file. Otherwise it returns a 200 status with the new resource and its new ETag. This mechanism improves the performance of a website.

Example: If-None-Match: "03f2b33c0bfcc1:0"
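A minimal sketch of this exchange follows. The functions make_etag and respond are hypothetical, and deriving the ETag from a SHA-1 hash of the content is only one possible choice; servers are free to generate ETags in other ways.

```python
import hashlib


def make_etag(content: bytes) -> str:
    """Derive a quoted ETag from the content (one possible scheme)."""
    return '"' + hashlib.sha1(content).hexdigest() + '"'


def respond(content: bytes, if_none_match=None):
    """Return (status, headers, body) for a possibly conditional request."""
    etag = make_etag(content)
    if if_none_match == etag:
        # The client's copy is current: no body is sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, content


# First request: no validator yet, so the full resource is returned.
status, headers, body = respond(b"version 1")
print(status, headers["ETag"])

# Second request: the browser echoes the ETag in If-None-Match.
print(respond(b"version 1", headers["ETag"])[0])

# The resource changed on the server, so the stored ETag no longer matches.
print(respond(b"version 2", headers["ETag"])[0])
```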

Pragma:

Role: Prevents the page from being cached. In HTTP/1.1 it has the same effect as Cache-Control: no-cache.

Pragma has only one usage: Pragma: no-cache

Note: HTTP/1.0 implements only Pragma: no-cache; it does not implement Cache-Control.

Cache-Control:

Role: A very important header. It specifies the caching rules that requests and responses must follow. The main directives have the following meanings:

Cache-Control: public    The content may be stored by any cache.

Cache-Control: private   The content may be stored only in a private cache.

Cache-Control: no-cache  A cached copy must not be used without first revalidating it with the server.

2. HTTP response format

After receiving and interpreting the request message, the server returns an HTTP response message. Like a request, an HTTP response consists of three parts: the status line, the message headers, and the response body. For example:

HTTP/1.1 200 OK
Date: Sun, 08:12:54 GMT
Server: Apache/2.2.8 (Win32) PHP/5.2.5
X-Powered-By: PHP/5.2.5
Set-Cookie: PHPSESSID=C0HUQ7PDKMM5GG6OSOE3MGJMM3; path=/
Expires: Thu, 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Content-Length: 4393
Keep-Alive: timeout=5, max=100
Connection: keep-alive
Content-Type: text/html; charset=utf-8

1. Status line

The status line consists of the protocol version, a numeric status code, and the corresponding status description. The elements are separated by single spaces, and the line ends with a carriage return and line feed, in the following format:

HTTP-Version Status-Code Reason-Phrase CRLF

HTTP-Version is the version of the HTTP protocol used by the server, Status-Code is the response code sent back by the server, Reason-Phrase is the text description of the status code, and CRLF is a carriage return followed by a line feed. For example:

HTTP/1.1 200 OK (CRLF)
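Splitting a status line into these elements takes only a couple of lines. The function parse_status_line below is a hypothetical helper; it splits on the first two spaces only, because the reason phrase may itself contain spaces.

```python
def parse_status_line(line: str):
    """Split a status line into (version, status_code, reason_phrase)."""
    # Strip the trailing CRLF, then split on the first two spaces only.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


print(parse_status_line("HTTP/1.1 200 OK\r\n"))
print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```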

Status codes and status descriptions

The status code consists of three digits and indicates whether the request was understood and satisfied; the status description is a short text explanation of the code. The first digit defines the class of the response, and the last two digits have no categorizing role. The first digit has five possible values:

    • 1XX: Informational. The request has been received and processing continues.
    • 2XX: Success. The request has been successfully received, understood, and accepted.
    • 3XX: Redirection. Further action is required to complete the request.
    • 4XX: Client error. The request has a syntax error or cannot be fulfilled.
    • 5XX: Server error. The server failed to fulfill a valid request.

Common status codes and their descriptions:
200 OK // the client request succeeded
400 Bad Request // the client request has a syntax error and cannot be understood by the server
401 Unauthorized // the request is not authorized; this status code must be used together with the WWW-Authenticate header field
403 Forbidden // the server received the request but refuses to serve it
404 Not Found // the requested resource does not exist, e.g. a wrong URL was entered
500 Internal Server Error // the server encountered an unexpected error
503 Service Unavailable // the server cannot currently process the request and may recover after some time
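Python's standard library already knows these codes, which makes it easy to check a code's class and reason phrase. In the sketch below, the CLASSES table and the describe function are hypothetical names; HTTPStatus comes from the standard http module.

```python
from http import HTTPStatus

# The first digit of the status code selects the response class.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe(code: int) -> str:
    """Return 'code reason-phrase (class)' for a standard status code."""
    return f"{code} {HTTPStatus(code).phrase} ({CLASSES[code // 100]})"


for code in (200, 400, 401, 403, 404, 500, 503):
    print(describe(code))
```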

2. Response body

The response body is the content of the resource returned by the server. It must be separated from the response headers by a blank line. For example:

<html>
<head>
<title>HTTP Response Example</title>
</head>
<body>
Hello http!
</body>
</html>
3. Response header information

The most common HTTP response headers are as follows:

Cache header fields

Date:

Role: The date and time at which the message was generated, that is, the current time in GMT.

Example: Date: Sun, 08:12:54 GMT

Expires:

Role: States when the document should be considered expired and no longer served from cache. Until that time the browser may use its local cached copy.

Example: Expires: Thu, 1981 08:52:00 GMT

Vary:

Role: Lists the request header fields that the server used to choose this response, so that a cache reuses the stored response only for requests with matching values of those fields.

Example: Vary: Accept-Encoding

Cookie/login header fields

P3P:

Role: Used when setting cookies across domains; it resolves the problem of an iframe accessing cookies across domains.

Example: P3P: CP=CURa ADMa DEVa PSAo PSDo OUR BUS UNI PUR INT DEM STA PRE COM NAV OTC NOI DSP COR

Set-Cookie:

Role: A very important header that sends a cookie to the client browser. Each cookie that is written produces one Set-Cookie header.

Example: Set-Cookie: PHPSESSID=C0HUQ7PDKMM5GG6OSOE3MGJMM3; path=/
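The round trip from Set-Cookie in a response to Cookie in the next request can be sketched with Python's standard http.cookies module. The session value is the one from the example above; the way the Cookie line is assembled here is a simplification that ignores path, domain, and expiry matching.

```python
from http.cookies import SimpleCookie

# Parse the value of a Set-Cookie header received from the server.
jar = SimpleCookie()
jar.load("PHPSESSID=C0HUQ7PDKMM5GG6OSOE3MGJMM3; path=/")

session = jar["PHPSESSID"]
print(session.value)    # the cookie value
print(session["path"])  # the path attribute

# On later requests the browser sends back only the name=value pairs.
cookie_header = "Cookie: " + "; ".join(
    f"{name}={morsel.value}" for name, morsel in jar.items()
)
print(cookie_header)
```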

Entity header fields

These describe attributes of the entity content, including its type, length, compression method, last modification time, validity, and so on.

ETag:

Role: Used together with If-None-Match (see the If-None-Match entry in the request header section).

Example: ETag: "03f2b33c0bfcc1:0"

Last-Modified:

Role: Indicates the date and time at which the resource was last modified (see the If-Modified-Since entry in the request header section).

Example: Last-Modified: Wed, Dec 09:09:10 GMT

Content-Type:

Role: The web server tells the browser the type and character set of the object in the response.

Examples:

Content-Type: text/html; charset=utf-8

Content-Type: text/html;charset=gb2312

Content-Type: image/jpeg
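A client has to separate the media type from the charset parameter before it can decode the body. One way to sketch this in Python is to borrow the header parser from the standard email package, which uses the same parameter syntax; the helper name parse_content_type is hypothetical.

```python
from email.message import Message


def parse_content_type(value: str):
    """Return (media_type, charset) for a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = value
    # charset is None when the header carries no charset parameter.
    return msg.get_content_type(), msg.get_content_charset()


for value in (
    "text/html; charset=utf-8",
    "text/html;charset=gb2312",
    "image/jpeg",
):
    print(parse_content_type(value))
```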

Content-Length:

Role: The length of the entity body in bytes, expressed as a decimal number. When Content-Length is used for a response, the server first buffers all of the data and then sends it to the client in one go.

Example: Content-Length: 19847

Content-Encoding:

Role: The encoding applied to the document, which in practice usually means compression.

The web server states which compression method (gzip, deflate) it used to compress the object in the response. Compressing documents with gzip can significantly reduce the download time of HTML documents.

Example: Content-Encoding: gzip

Content-Language:

Role: The web server tells the browser the language of the object in the response.

Example: Content-Language: da

Miscellaneous header fields

Server:

Role: Identifies the software of the HTTP server.

Example: Server: Apache/2.2.8 (Win32) PHP/5.2.5

X-Powered-By:

Role: Indicates the technology the site was developed with.

Example: X-Powered-By: PHP/5.2.5

Transport header field

Connection:

Example: Connection: keep-alive. After a web page has been loaded, the TCP connection used to transfer HTTP data between client and server is not closed; if the client requests another page from this server, it reuses the established connection.

Example: Connection: close. After a request completes, the TCP connection used to transfer HTTP data between client and server is closed; the client must establish a new TCP connection when it sends its next request.

Location header field

Location:

Role: Redirects the client to a new location; its value is the new URL.

Location is sent with redirect responses such as 301 and 302.

Stateless HTTP versus Connection: keep-alive

Stateless means that the protocol has no memory of previous transactions, so the server does not know what state the client is in. In other words, opening a web page on a server has no connection with the pages you opened on that server before.

HTTP is a stateless protocol that runs over a connection-oriented transport. Stateless does not mean that HTTP cannot keep a TCP connection open, and it certainly does not mean that HTTP uses UDP (which is connectionless).

Since HTTP/1.1, keep-alive is enabled by default. In short, after a web page has been loaded, the TCP connection used to transfer HTTP data between client and server is not closed; if the client requests another page from the same server, it reuses the established connection.

Keep-alive does not hold the connection open forever. It has a timeout that can be configured in the server software (such as Apache).

3. Browser cache

Browser cache: covers the caching of page HTML as well as resources such as images, JS, and CSS. The browser cache works by saving page data on the hard disk of the user's local computer.

1, the advantages of caching:

1) The server appears to respond faster: requests are served from the cache (closer to the client) rather than from the origin server, so the process takes less time.

2) Reduced network bandwidth consumption: when a cached copy is reused, less bandwidth is consumed; customers save on bandwidth costs, bandwidth demand grows more slowly, and it becomes easier to manage.

1. How the cache works

Whether a page is cached is determined by HTTP headers: some sent by the browser in the request and some sent by the server in the response. The main ones are Pragma: no-cache, Cache-Control, Expires, Last-Modified, and If-Modified-Since. Of these, Pragma: no-cache is defined by HTTP/1.0 and Cache-Control by HTTP/1.1.

The mechanism can be divided into three main steps:

    1. First request: the browser requests the resource, and the server's response carries the Expires, Cache-Control, and Last-Modified/ETag headers. The browser stores the resource together with these values.
    2. Subsequent request: when the browser requests the resource again, the request carries If-Modified-Since (the stored Last-Modified value) and/or If-None-Match (the stored ETag).
    3. The server compares the resource's current Last-Modified/ETag with the If-Modified-Since/If-None-Match values in the request to decide whether an update is needed. If the resource has not changed, the client does not need to download it again and the server returns a 304 response.

Cache-related HTTP headers

Expires: Sets the expiration time of the page, in GMT.

Cache-Control: Gives finer-grained control over how the content is cached.

Last-Modified: The time the requested object was last modified, used to decide whether a cached copy is out of date. It is usually generated from the file's timestamp.

ETag: A validation value for the resource in the response, which identifies it uniquely on the server. An ETag is a token associated with a web resource; it serves much the same purpose as Last-Modified and is commonly used together with it to make the server's decision more accurate.

Date: The server's time.

If-Modified-Since: The last modification time of the resource as known to the client, to be compared with Last-Modified on the server.

If-None-Match: The validation value of the resource as known to the client, to be compared with the ETag.

Main parameters of Cache-Control
Cache-Control: private/public: A public response may be cached and shared among multiple users. A private response may be stored only in a private cache and must not be shared among users.
Cache-Control: no-cache: Do not use a cached copy without first revalidating it with the server.
Cache-Control: max-age=x: How long the response may be cached, in seconds.
Cache-Control: must-revalidate: Once the cached page is out of date, it must be revalidated with the server before it is used.
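The sketch below parses a Cache-Control value into its directives and applies a deliberately simplified freshness rule. The function names parse_cache_control and is_fresh are hypothetical, and a real cache also takes Expires, the Date header, and heuristics into account.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control value into {directive: argument or True}."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') if arg else True
    return directives


def is_fresh(directives: dict, age_seconds: int) -> bool:
    """Simplified rule: may a cached copy of this age be used as-is?"""
    if "no-cache" in directives or "no-store" in directives:
        return False  # must go back to the server first
    max_age = directives.get("max-age")
    if not isinstance(max_age, str):
        return False  # no lifetime given: treat as stale in this sketch
    return age_seconds < int(max_age)


policy = parse_cache_control("public, max-age=3600")
print(policy)
print(is_fresh(policy, 60))
print(is_fresh(policy, 7200))
print(is_fresh(parse_cache_control("no-cache"), 0))
```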

2, caching of images, CSS, JS, and Flash

This is done mainly through server configuration. If you use the Apache server, you can use the mod_expires module.

To compile the mod_expires module:

cd /root/httpd-2.2.3/modules/metadata

/usr/local/apache/bin/apxs -i -a -c mod_expires.c    # compile and install

Edit httpd.conf and add the following:

<IfModule mod_expires.c>
ExpiresActive On
ExpiresDefault "access plus 1 month"
ExpiresByType text/html "access plus 1 month"
ExpiresByType text/css "access plus 1 month"
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType image/jpg "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType application/x-shockwave-flash "access plus 1 month"
ExpiresByType application/x-javascript "access plus 1 month"
#ExpiresByType video/x-flv "access plus 1 month"
</IfModule>

Explanation: the first line (ExpiresActive On) turns the feature on.

The second line sets the default expiration time to one month.

The remaining lines set the cache time for individual resource types.
