HTTP Header parsing

Source: Internet
Author: User
Tags http etag ranges

Article Directory

1. Web server associated with the HTTP protocol

2. HTTP Header

Web server associated with the HTTP protocol

Before looking at HTTP headers, it helps to understand how Web servers work together with HTTP.

Implementing multiple domain names with a single host

HTTP/1.1 allows a single Web server to host multiple domain names. Even with only one physical server, the virtual host feature (also known as a virtual server) makes it look as though several servers exist.

Virtual hosting is a technique for running multiple Web sites or services on a single host or host cluster, each under its own domain name. See the Wikipedia article on virtual hosting for details.

Deploying several domains on one server raises a problem, however. One physical server usually means one IP address, so once DNS has resolved the domain name to that address, the server receiving the request still has to work out which domain the client wants.

There are two ways to solve this. The first is for the client to name the requested host in the Host header field of every request. The second is to assign several IP addresses to the single server, one per service.
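The first approach, selecting a site by the Host field, can be sketched roughly as follows. This is a minimal illustration, not how any particular server implements it: the host names, directory paths, and the function name select_site are all made up for the example, and IPv6 literal hosts are not handled.

```python
# Hypothetical mapping from host name to site root; all names are illustrative.
VIRTUAL_HOSTS = {
    "www.example.com": "/srv/www/example",
    "blog.example.com": "/srv/www/blog",
}
DEFAULT_SITE = "/srv/www/default"


def select_site(raw_request: bytes) -> str:
    """Return the site root for the Host named in a raw HTTP/1.1 request."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower()
            if ":" in host:  # drop an optional port such as ":8080"
                host = host.rsplit(":", 1)[0]
            return VIRTUAL_HOSTS.get(host, DEFAULT_SITE)
    return DEFAULT_SITE
```

Both sites share one IP address here; only the Host value decides which directory is served.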

Programs that forward communication data: proxies

A proxy server sits between the client and the server. It receives requests from the client and forwards them to the server, and it receives responses from the server and forwards them to the client. Packet-capture tools popular with front-end engineers, such as Fiddler and Charles, capture traffic by acting as a proxy.

The basic behavior of a proxy is to receive a request from the client and forward it, without changing the request URI, to the next server on the way to the resource. The server that holds the resource is called the origin server (source server), and its response passes back through the proxy before reaching the client. Each proxy that a message passes through appends its own information to the Via header field. Without that record, there would be no way to tell which proxies handled the message.
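How a proxy might append itself to the Via field can be sketched like this. It is only an illustration under simple assumptions: headers are held in a plain dict keyed by the exact name "Via", and the function name add_via and the proxy names are invented for the example.

```python
from typing import Dict


def add_via(headers: Dict[str, str], proxy_name: str, version: str = "1.1") -> Dict[str, str]:
    """Return a copy of headers with this proxy appended to the Via field."""
    forwarded = dict(headers)  # leave the caller's headers untouched
    entry = f"{version} {proxy_name}"
    existing = forwarded.get("Via")
    # Each hop adds its entry to the end, so the field reads in travel order.
    forwarded["Via"] = f"{existing}, {entry}" if existing else entry
    return forwarded
```

After two hops the field would read, for example, Via: 1.1 proxy-a, 1.1 proxy-b.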

In general, using a proxy server has the following benefits:

1. Reduce server network bandwidth consumption through caching

2. Access control for specific websites (deciding which sites may or may not reach the server, which enables access filtering)

Proxies can be classified along two axes. One is whether the proxy caches responses (a caching proxy). The other is whether it modifies messages (a proxy that forwards messages unmodified is called a transparent proxy). See the Wikipedia article on proxy servers for details.

Caching resources

The cache mentioned above is a copy of a resource stored on a proxy server or on the client's local disk. Serving unexpired copies from the proxy or the browser reduces the number of requests that reach the origin server, which saves both traffic and communication time.

The advantage of a caching proxy is that the same resource does not have to be fetched from the origin server over and over. The client gets the resource from the browser cache or the proxy, and the origin server does not have to process the same request more than once.

Cached resources can expire, whether they are held by the browser or by a proxy. While a cached copy has not expired it can be served directly. Once it has expired, a proxy fetches the updated resource from the origin server again. A browser holding an expired copy does not simply download the resource again; it sends a conditional GET request (using the If-Modified-Since request field, built from the Last-Modified value of the earlier response).

To summarize:

1. A Web server can host multiple domain names, either by having clients send the Host field to name the requested host or by using multiple IP addresses to separate the services.

2. The basic behavior of a proxy server is to forward the client's request toward the origin server and relay the response back. Responses can be cached by a proxy or by the browser, which avoids wasting bandwidth on repeated requests for the same resource.

HTTP Header

Request and response messages share three groups of header fields: general header fields, entity header fields, and other header fields. Header fields used only in requests are called request header fields, and those used only in responses are called response header fields. HTTP/1.1 defines 47 header fields.

Here's a quick explanation of each field.

HTTP/1.1 General Header Fields

General header fields are headers used by both request messages and response messages.

Cache-Control

The Cache-Control field controls how caches behave. It carries one or more directives separated by commas, and some directives take an optional parameter. Cache-Control can appear in both requests and responses.


public: a response directive. It states explicitly that the cached response may be served to other users as well.

private: a response directive. The response is intended for one specific user. A cache may return it only to that user and must not return it for requests from other users.

no-cache: prevents expired resources from being served from a cache. In a request, no-cache means the client will not accept a cached response that has not been revalidated, so the proxy must forward the request to the origin server. In a response, no-cache does not forbid storing the response; it means a cache must revalidate the stored copy with the origin server before every reuse.

no-store: forbids caches from storing any part of the request or the response. Use this directive, not no-cache, when a response must never be cached.

s-maxage: works like max-age but applies only to shared caches such as public proxy servers; it has no effect on a cache that serves a single user. When s-maxage is present, a shared cache ignores both the Expires header field and the max-age directive. For example, with Cache-Control: s-maxage=600 (seconds), a shared cache may return the stored response as long as its age does not exceed 10 minutes.

max-age: written as Cache-Control: max-age=600 (seconds). In a request, max-age means the client accepts a cached response whose age does not exceed the specified time. A value of 0 means the cache has to forward the request to the origin server.

In a response, max-age is the maximum time the resource may be kept and served from the cache. During that time the cache does not need to revalidate the resource with the origin server.

When an HTTP/1.1 cache receives both max-age and an Expires field, max-age takes precedence and Expires is ignored.

min-fresh: asks the cache to return only a response that will remain fresh for at least the specified time. With Cache-Control: min-fresh=60 (seconds), a cached response is acceptable only if it will still be fresh 60 seconds from now; a response that expires sooner cannot be returned.

max-stale: indicates that the client will accept a cached response even after it has expired. If a value is given, the response may be past its expiration by no more than that many seconds. If no value is given, the client accepts the response however long ago it expired.

only-if-cached: the client wants a response only if the cache already holds the target resource. The cache must not reload the response from the origin server or revalidate it. If the cache has no usable stored response, it returns status code 504 Gateway Timeout.

must-revalidate: once the cached response has expired, the cache must revalidate it with the origin server before returning it. If the proxy cannot reach the origin server, it returns status code 504 Gateway Timeout to the client. A max-stale directive in the request is ignored when must-revalidate is present.

proxy-revalidate: has the same meaning as must-revalidate but applies only to shared caches such as proxy servers.

no-transform: caches must not change the media type of the entity body, whether in a request or in a response.
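To make the directive syntax concrete, here is a small sketch that splits a Cache-Control value into its directives. It is a simplified illustration rather than a complete parser: the function name parse_cache_control is made up, and quoted parameters that themselves contain commas are not handled.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: parameter or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):  # directives are comma-separated
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive; the parameter is optional.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives
```

For example, parse_cache_control("public, max-age=600") yields {"public": None, "max-age": "600"}, which a cache could then consult when deciding whether a stored response is usable.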

Connection

The Connection field has two functions:

Controlling header fields that are not forwarded beyond the proxy: the format is Connection: <name of the header field not to forward>. In a client request or a server response, the Connection field lists header fields that are meant for the current connection only, and a proxy removes them before forwarding the message.

Managing persistent connections: for example Connection: keep-alive. In HTTP/1.1 connections are persistent by default, so the client and server can exchange many HTTP messages over a single TCP connection. The connection stays open until one side explicitly asks to close it, for example with Connection: close.
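The first function can be illustrated with a short sketch of what a proxy does before forwarding a message. The function name strip_hop_by_hop is invented for this example, and headers are assumed to be held in a plain dict.

```python
from typing import Dict


def strip_hop_by_hop(headers: Dict[str, str]) -> Dict[str, str]:
    """Drop Connection and every header field that Connection names."""
    connection = ""
    for name, value in headers.items():
        if name.lower() == "connection":
            connection = value
    # Tokens such as "keep-alive" or "Upgrade" name fields to remove.
    listed = {token.strip().lower() for token in connection.split(",") if token.strip()}
    listed.add("connection")  # Connection itself is never forwarded
    return {n: v for n, v in headers.items() if n.lower() not in listed}
```

Given Connection: Upgrade together with an Upgrade field, the forwarded message would contain neither of the two.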

Pragma

This header field exists only for backward compatibility with HTTP/1.0. Its form is Pragma: no-cache. Although it is a general header field, it is defined only for requests sent by the client, where it asks every intermediate server not to return a cached response.

Pragma: no-cache has the same effect as the no-cache directive of Cache-Control. Because a client cannot know which HTTP version every intermediate server supports, requests commonly carry both fields: Cache-Control: no-cache and Pragma: no-cache.

Trailer

The Trailer field announces in advance which header fields are recorded after the message body. It is used mainly with HTTP/1.1 chunked transfer encoding.

Transfer-Encoding

The Transfer-Encoding field specifies the encoding used to transfer the message body. In HTTP/1.1 it is used mainly for chunked transfer encoding.

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Connection: keep-alive

cf0    <-- hexadecimal (3312 in decimal)
...3312 bytes of chunk data...
392    <-- hexadecimal (914 in decimal)
...914 bytes of chunk data...

In the example above, the Transfer-Encoding field says that the body uses chunked transfer encoding, and the body is split into two chunks of 3312 bytes and 914 bytes.
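Reading such a body back can be sketched as follows. This is a minimal decoder written for illustration: the function name decode_chunked is made up, the whole body is assumed to be in memory already, and trailer fields after the final chunk are ignored.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble a chunked message body into the original bytes."""
    decoded = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size line is hexadecimal; anything after ";" is a chunk extension.
        size_field = body[pos:line_end].split(b";", 1)[0]
        size = int(size_field.decode("ascii").strip(), 16)
        pos = line_end + 2
        if size == 0:  # a zero-length chunk marks the end of the body
            break
        decoded += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(decoded)
```

The size lines are why the example shows cf0 and 392: int("cf0", 16) is 3312 and int("392", 16) is 914.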

Upgrade

The Upgrade field asks whether communication can switch to a newer version of HTTP or to a different protocol. It is used to start a WebSocket session: during normal HTTP communication the client sends an Upgrade request to switch from HTTP to the WebSocket protocol. The server returns status code 101 Switching Protocols to indicate that the switch succeeded, after which full-duplex, bidirectional communication over WebSocket is possible. Readers unfamiliar with WebSocket can refer to the article "WebSocket Protocol resolution".

Via

The Via field tracks the path that request and response messages take between the client and the server. When a message passes through a proxy server or gateway, that server appends its own information to the Via field before forwarding the message. Via is usually used together with the Max-Forwards field; see the article on Max-Forwards for an explanation of that field.

Request Header Fields

Accept

The Accept field tells the server which media types the user agent can handle and their relative priority. Media types are written as type/subtype, and several can be listed at once, separated by commas. A weight can be attached with ;q=, ranging from 0 to 1; a media type without q has the default weight 1.0.

Accept: application/json;q=1.0, text/plain;q=0.8, */*;q=0.7
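How a server might read the weights can be sketched as below. The function name parse_accept is invented for this illustration, media-type parameters other than q are ignored, and an unparseable q value is treated as 0, which is a choice made for the example.

```python
from typing import List, Tuple


def parse_accept(value: str) -> List[Tuple[str, float]]:
    """Return (media type, q) pairs, highest priority first."""
    ranges = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(val)
                except ValueError:
                    q = 0.0
        ranges.append((media_type, q))
    # sorted() is stable, so equal weights keep their original order.
    return sorted(ranges, key=lambda r: r[1], reverse=True)
```

The same q-value scheme applies to Accept-Charset, Accept-Encoding, and Accept-Language, so a similar routine would work for those fields.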

Accept-Charset

The Accept-Charset field tells the server which character sets the user agent supports and their relative priority. Several character sets can be listed at once and, as with Accept, a q value expresses the relative priority.

Accept-Encoding

The Accept-Encoding field tells the server which content encodings the user agent supports and their relative priority. Content encodings include gzip, compress, deflate, and identity (the default, meaning no compression).

Accept-Language

The Accept-Language field tells the server which natural languages (Chinese or English, for example) the user agent prefers and their relative priority. Several languages can be listed at once.

Accept-Language: zh-CN,zh;q=0.9,en;q=0.8

Authorization

The Authorization field carries the user agent's credentials to the server. Typically, a user agent that wants to authenticate with the server adds the Authorization field to its next request after receiving a 401 response.

Host

The Host field tells the server the Internet host name and port number of the requested resource. Before a request is sent, DNS resolves the domain name to an IP address. If several domain names (virtual hosts) are deployed on that IP address, the server cannot tell from the address alone which domain the request is for, so the Host field must name the requested host explicitly.

If-None-Match

The If-None-Match field is used together with ETag. The server processes the request normally when the If-None-Match value differs from the resource's current ETag value. If the values are the same, the server returns 304 Not Modified.

In typical usage, when a URL is requested, the Web server returns the resource along with its ETag value in the HTTP response header.

ETag: "686897696a7c876b7e"

The client can then decide whether to cache the resource together with its ETag. When it later requests the same URL, it sends the saved ETag in the If-None-Match field.

If-None-Match: "686897696a7c876b7e"

On receiving this request, the server compares the client's ETag with the ETag of the current version of the resource. If the values match, the resource has not changed, and the server sends back a very short response with status 304 Not Modified. The 304 status tells the client that its cached version is up to date and should be used.

If the values do not match, the resource has probably changed, and the server returns a full response (200 OK) including the content of the resource, just as if ETag were not in use. The client can then replace its cached version with the newly returned resource and the new ETag.
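The exchange above can be sketched from the server's side as follows. This is an illustrative outline only: the function name conditional_get is made up, headers are a plain dict, and ETags are compared as exact strings, which is simpler than what the HTTP specification prescribes for weak ETags.

```python
from typing import Dict, Tuple


def conditional_get(request_headers: Dict[str, str], current_etag: str,
                    body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    raw = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in raw.split(",")]
    if "*" in candidates or current_etag in candidates:
        # The client's copy is current: short response, no body.
        return 304, {"ETag": current_etag}, b""
    # No match (or no If-None-Match at all): send the full resource.
    return 200, {"ETag": current_etag}, body
```

A client that sent If-None-Match: "686897696a7c876b7e" while the resource still has that ETag would get 304 and keep using its cached copy.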

If-Modified-Since

The If-Modified-Since field is used with the Last-Modified field of the response header. If the resource's Last-Modified time is later than the If-Modified-Since value, the resource has been updated and the server returns 200 OK. If the Last-Modified time is not later, the resource has not been updated and the server returns 304 Not Modified. When a request contains both fields, If-Modified-Since is ignored unless the server does not support If-None-Match. If-Modified-Since is used to confirm that a local copy held by a proxy server or client is still valid.
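The date comparison can be sketched with the standard library's HTTP date parser. The function name not_modified_since is invented for this illustration, and last_modified is assumed to be a timezone-aware datetime supplied by the server.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def not_modified_since(if_modified_since: str, last_modified: datetime) -> bool:
    """True when the server may answer 304 Not Modified."""
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False  # unparseable date: ignore the condition
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    return last_modified.replace(microsecond=0) <= since
```

When the function returns False the server sends the full 200 OK response instead.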

If-Range

The If-Range field is used together with the Range field. Its value is either an ETag or a date. If that value matches the ETag or the last modification time of the requested resource, the server handles the request as a range request and returns the requested bytes along with a Content-Range field describing them. Otherwise, the server returns the entire resource.
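The decision described here can be written down in a few lines. The sketch below handles only the ETag form of If-Range (the date form is left out), and the function name choose_status is made up for the example.

```python
from typing import Optional


def choose_status(range_header: Optional[str], if_range: Optional[str],
                  current_etag: str) -> int:
    """Return 206 for a partial response or 200 for the whole resource."""
    if range_header is None:
        return 200  # no Range field: an ordinary request
    if if_range is None or if_range == current_etag:
        return 206  # the client's copy is current, send only the range
    return 200      # the resource changed: resend everything
```

Returning 200 in the last case saves the client a round trip: it gets the new version at once instead of an error followed by a second request.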

Proxy-Authorization

Proxy-Authorization: Basic dFDGADdjgjadfDSFJ5

After receiving an authentication challenge from a proxy server, the client sends a request containing this header field to give the proxy the credentials it requires.
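For the Basic scheme shown in the example, the value after "Basic" is the Base64 encoding of the user name and password joined by a colon. A minimal sketch, with a made-up function name and the well-known sample credentials from the Basic authentication specification:

```python
import base64


def basic_credentials(username: str, password: str) -> str:
    """Build the value of an Authorization or Proxy-Authorization field."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"
```

Base64 is an encoding, not encryption, so Basic credentials are readable by anyone who can see the request unless the connection itself is protected.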

Referer

The Referer field tells the server the URI of the resource from which the request originated, for example the page that contained the link.

Response Header Fields

Accept-Ranges

The Accept-Ranges field tells the client whether the server can handle range requests, that is, requests for only part of a resource. Two values are possible: bytes when range requests are supported, and none when they are not.

ETag

The server assigns an ETag value to each resource and must change that value whenever the resource is updated. The ETag field is typically used together with If-None-Match. When the ETag value matches the If-None-Match value, the requested resource has not changed and the server returns 304 Not Modified; if the values do not match, it returns 200 OK. ETags are either strong or weak. A weak ETag is marked by the prefix W/ at the beginning of the value:

"123456789"    -- a strong ETag validator
W/"123456789"  -- a weak ETag validator

See the Wikipedia article on HTTP ETag for the differences.
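One practical difference shows up when two ETags are compared. The sketch below follows the strong and weak comparison rules of the HTTP conditional request specification; the function names are made up for the example.

```python
def is_weak(etag: str) -> bool:
    """A weak ETag starts with the W/ prefix."""
    return etag.startswith("W/")


def opaque(etag: str) -> str:
    """Return the quoted part of the ETag without any W/ prefix."""
    return etag[2:] if is_weak(etag) else etag


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: neither tag is weak and both are identical."""
    return not is_weak(a) and not is_weak(b) and a == b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: the quoted parts are identical, prefix ignored."""
    return opaque(a) == opaque(b)
```

With the two values above, weak_match('W/"123456789"', '"123456789"') is true while strong_match on the same pair is false.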

Proxy-Authenticate && WWW-Authenticate

The Proxy-Authenticate field tells the client which authentication the proxy server requires. It is usually paired with the Proxy-Authorization field.

The WWW-Authenticate field is used for HTTP access authentication with the server. It is typically paired with the Authorization field.

Entity Header Fields

Entity header fields are headers used in the entity part of request and response messages. They add information about the entity, such as when the content was last updated.

Allow

Its form is Allow: GET, POST. The Allow field tells the client which HTTP methods the requested resource supports. When the server receives a request with a method the resource does not support, it responds with status code 405 Method Not Allowed and lists the supported methods in the Allow field.
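A server-side check along these lines might look like the following sketch. The names ALLOWED_METHODS and check_method are invented for the example, and the tuple of methods simply mirrors the Allow: GET, POST form shown above.

```python
from typing import Dict, Tuple

# Hypothetical set of methods this resource supports.
ALLOWED_METHODS = ("GET", "POST")


def check_method(method: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for the given request method."""
    if method in ALLOWED_METHODS:  # method names are case-sensitive
        return 200, {}
    # Tell the client which methods would have been accepted.
    return 405, {"Allow": ", ".join(ALLOWED_METHODS)}
```

A DELETE request against this resource would therefore receive 405 with Allow: GET, POST.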

Content-Encoding

This field tells the client which content encoding the server applied to the entity body. Four content encodings are mainly used: gzip, compress, deflate, and identity.

Content-Language && Content-Length

Content-Language tells the client which natural language the entity body uses. Content-Length tells the client the size of the entity body in bytes.

Content-Range && Content-Type

Content-Range is sent in response to a range request and tells the client which part of the entity is being returned. The value gives, in bytes, the range being sent and the size of the whole entity, for example Content-Range: bytes 5001-9999/10000. Byte positions start at 0, so the last byte of a 10000-byte entity is at position 9999.
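The relationship between a request's Range field and the Content-Range value of the response can be sketched as follows. This is a simplified illustration with a made-up function name: only a single byte range is handled, and None stands for "cannot be satisfied".

```python
from typing import Optional


def content_range(range_header: str, total: int) -> Optional[str]:
    """Turn 'bytes=first-last', 'bytes=first-' or 'bytes=-suffix' into a
    Content-Range value for an entity of total bytes."""
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        return None  # other units and multiple ranges are not handled here
    first_s, sep, last_s = spec.strip().partition("-")
    if not sep:
        return None
    try:
        if first_s == "":  # suffix form: the last N bytes
            length = int(last_s)
            if length <= 0:
                return None
            first, last = max(total - length, 0), total - 1
        else:
            first = int(first_s)
            last = int(last_s) if last_s else total - 1
    except ValueError:
        return None
    last = min(last, total - 1)  # positions are zero-based
    if first < 0 or first > last:
        return None
    return f"bytes {first}-{last}/{total}"
```

For a 10000-byte entity, a request with Range: bytes=-500 asks for the final 500 bytes, which this sketch reports as bytes 9500-9999/10000.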

Content-Type tells the client the media type of the entity body. As with the Accept field, the value is written as type/subtype.

Expires

The Expires field tells the client when the resource expires. A proxy server that receives a response with an Expires field caches the resource and returns the cached copy for requests made before the specified time. After that time, the proxy forwards requests to the origin server. To keep proxies from caching a resource, set Expires to the same value as the Date field. In the browser, when a cached resource has expired, the browser does not simply download the resource again; it first sends a conditional request (If-Modified-Since, based on the earlier Last-Modified value).

When a response contains both Expires and the max-age directive of Cache-Control, max-age takes precedence.
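The precedence rule can be sketched as a small function that works out how long a response may be served from cache. The function name freshness_lifetime is made up, headers are a plain dict, and treating an unparseable Expires value as already expired is a choice made for this example.

```python
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def freshness_lifetime(headers: Dict[str, str]) -> Optional[float]:
    """Seconds the response stays fresh, or None if nothing specifies it."""
    # max-age wins whenever it is present.
    for part in headers.get("Cache-Control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name.lower() == "max-age" and arg.isdigit():
            return float(arg)
    # Otherwise fall back to Expires minus Date.
    expires, date = headers.get("Expires"), headers.get("Date")
    if expires and date:
        try:
            delta = parsedate_to_datetime(expires) - parsedate_to_datetime(date)
        except (TypeError, ValueError):
            return 0.0  # invalid dates: treat the response as expired
        return max(delta.total_seconds(), 0.0)
    return None
```

Setting Expires equal to Date gives a lifetime of 0 seconds, which is why that combination keeps caches from reusing the response.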

Header fields for cookies

HTTP is a stateless protocol, so cookies are used alongside it to manage user state. For a description of cookies, see the article "Front-End storage solutions".

Resources

1. "Graphic http"

2. MDN Web Docs

3. Wikipedia
