HTTP Headers parsing

Last Update:2016-10-23 Source: Internet

Author: User


What are HTTP headers, and what do they contain? Requesting the Douban Books home page with requests.get() returns the following r.headers:

>>> import requests
>>> r = requests.get('https://book.douban.com/')
>>> r.headers
{'X-Powered-By-ADS': 'chn-shads-4-12',
 'X-Xss-Protection': '1; mode=block',
 'X-DAE-App': 'book',
 'X-Content-Type-Options': 'nosniff',
 'Content-Encoding': 'gzip',
 'Transfer-Encoding': 'chunked',
 'Set-Cookie': 'bid=zt-mrsxmmx0; expires=Mon, 23-Oct-17 03:11:40 GMT; domain=.douban.com; path=/, __ads_session=2yy49z4ezghqoiuo9ga=; domain=.douban.com; path=/',
 'Expires': 'Sun, 1 Jan 2006 01:00:00 GMT',
 'Vary': 'Accept-Encoding',
 'X-DAE-Node': 'nain3',
 'Server': 'adsserver/44619',
 'X-Douban-Mobileapp': '0',
 'Connection': 'keep-alive',
 'Pragma': 'no-cache',
 'Cache-Control': 'must-revalidate, no-cache, private',
 'Date': 'Sun, 23 Oct 2016 03:11:40 GMT',
 'Strict-Transport-Security': 'max-age=15552000;',
 'X-Douban-Newbid': 'zt-mrsxmmx0',
 'Content-Type': 'text/html; charset=utf-8'}

'Content-Type': 'text/html; charset=utf-8'

This is the MIME type of the document; the browser uses it to decide how to parse the document. For example, an HTML page returns:

'text/html; charset=utf-8'

The 'text' before the slash is the document type, and the 'html' after the slash is the subtype.

For a picture the value is 'image/jpeg', which indicates that the target is an image, specifically a JPEG image.

>>> p = requests.get('http://pic33.nipic.com/20130916/3420027_192919547000_2.jpg')
>>> p.headers['content-type']
'image/jpeg'

For a PDF document the value is 'application/pdf':

>>> import requests
>>> s = requests.get('http://www.em-consulte.com/showarticlefile/738819/main.pdf')
>>> s.headers['content-type']
'application/pdf'

These three are only examples; many other MIME types exist.
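The type/subtype split described above can be sketched with the standard library alone. This is a minimal illustration, not how requests or a browser does it; the helper name parse_content_type is hypothetical.

```python
from email.message import Message


def parse_content_type(value):
    """Split a Content-Type value into (type, subtype, charset).

    Hypothetical helper: it reuses the standard library's header parser
    instead of splitting the string by hand.
    """
    msg = Message()
    msg["Content-Type"] = value
    return (
        msg.get_content_maintype(),   # the part before the slash
        msg.get_content_subtype(),    # the part after the slash
        msg.get_content_charset(),    # None when no charset parameter is given
    )


html = parse_content_type("text/html; charset=utf-8")
jpeg = parse_content_type("image/jpeg")
pdf = parse_content_type("application/pdf")
print(html)
print(jpeg)
print(pdf)
```

The three values are the ones shown in this section, so the sketch needs no network access.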

'Cache-Control': 'must-revalidate, no-cache, private'

The Cache-Control field specifies directives that every caching mechanism along the request/response chain must obey. Common values are private, no-cache, max-age, and must-revalidate.

Cache directive	Description
public	The response may be stored by any cache (both the client and proxy servers may cache it)
private	The response may be stored only in a private cache (the client may cache it, shared proxy servers may not)
no-cache	The cache must confirm with the server that the stored response has not changed before using it to satisfy a request for the same URL
must-revalidate	Once the cached content has expired, it must be revalidated with the server before it is used again
max-age=N	The cached content expires after N seconds
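As a rough sketch of how a client might read these directives, the function below splits a Cache-Control value into a dictionary. The name parse_cache_control is hypothetical, and the parsing is deliberately simplified.

```python
def parse_cache_control(value):
    """Parse a Cache-Control value into {directive: argument or None}.

    Simplified sketch: it splits on commas, so a quoted argument that
    itself contains a comma would not be handled correctly.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


# The value returned by the request in this article.
parsed = parse_cache_control("must-revalidate, no-cache, private")
# An illustrative value with an argument (not taken from the article).
with_age = parse_cache_control("public, max-age=3600")
print(parsed)
print(with_age)
```

Directives without an argument map to None, so a check such as "no-cache" in parsed is enough to test for their presence.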

Caching reduces server load and network congestion. The basic idea is to exploit the temporal locality of client access: a copy of the content a client requests is kept in the cache, so that when the same content is requested again it is taken directly from the cache and no new request is sent. The mechanism works well, but it can also cause problems: 1) the content the user gets on a repeat request may be out of date; 2) when the cache cannot serve the request (the copy is missing or has expired), the access takes longer than a direct request would.
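The idea, and the first problem it causes, can be illustrated with a toy in-memory cache. Everything here is a stand-in: TinyCache is a hypothetical class, and the origin dictionary plays the role of the server.

```python
class TinyCache:
    """Toy cache that keeps one copy per URL and never expires it."""

    def __init__(self, fetch):
        self._fetch = fetch      # function that performs the real request
        self._store = {}         # url -> cached body
        self.origin_hits = 0     # how many times the server was contacted

    def get(self, url):
        if url in self._store:
            return self._store[url]          # served from the cache
        self.origin_hits += 1
        body = self._fetch(url)              # cache miss: go to the server
        self._store[url] = body
        return body


origin = {"/page": "version 1"}              # stand-in for the server
cache = TinyCache(lambda url: origin[url])

first = cache.get("/page")                   # contacts the server
second = cache.get("/page")                  # answered from the cache
origin["/page"] = "version 2"                # the content changes on the server
third = cache.get("/page")                   # the cache still returns the old copy
print(first, second, third, cache.origin_hits)
```

Because this cache has no expiry rule at all, the third call returns outdated content, which is exactly the situation that headers such as Cache-Control and Expires are meant to control.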

'Expires': 'Sun, 1 Jan 2006 01:00:00 GMT'

Expires gives a date and time after which the response is considered stale. Until then, the data can be taken from the cache without another request.

Its drawback is that the value is a server-side time: if the client clock differs from the server clock, the computed expiry is off by that difference. Cache-Control's max-age, which is a relative duration, replaces Expires for this purpose.

Both indicate how long the current resource remains valid. When both are present, Cache-Control takes priority over Expires, and its settings are more fine-grained.
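The priority rule can be sketched as a small function that prefers max-age and falls back to Expires minus Date. This is an assumption-laden illustration: freshness_lifetime is a hypothetical name, the Date values are illustrative, and directives such as no-cache are not modeled.

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return the number of seconds a response stays fresh, or None if unknown."""
    h = {k.lower(): v for k, v in headers.items()}
    # max-age is checked first because Cache-Control has priority over Expires.
    for part in h.get("cache-control", "").split(","):
        name, sep, arg = part.strip().partition("=")
        if name.lower() == "max-age" and sep and arg.strip().isdigit():
            return int(arg.strip())
    # Fallback: Expires relative to the server's own Date header.
    if "expires" in h and "date" in h:
        delta = parsedate_to_datetime(h["expires"]) - parsedate_to_datetime(h["date"])
        return max(0, int(delta.total_seconds()))
    return None


date = "Sun, 23 Oct 2016 03:11:40 GMT"   # illustrative value
both = freshness_lifetime({
    "Cache-Control": "max-age=60",
    "Expires": "Sun, 23 Oct 2016 04:11:40 GMT",
    "Date": date,
})
expires_only = freshness_lifetime({
    "Expires": "Sun, 23 Oct 2016 04:11:40 GMT",
    "Date": date,
})
already_stale = freshness_lifetime({
    "Expires": "Sun, 1 Jan 2006 01:00:00 GMT",   # the Expires value from this article
    "Date": date,
})
print(both, expires_only, already_stale)
```

Subtracting Date from Expires uses two timestamps from the same server clock, which is one way to sidestep the client/server clock difference described above.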

[Figure: the process of the first request]

[Figure: the process of the second request]

'Content-Encoding': 'gzip'

1. The client sends an HTTP request to the server with Accept-Encoding: gzip, deflate in the request headers (this tells the server that the browser supports gzip compression)

2. After the server receives the request, it generates the original response, which has the original Content-Type and Content-Length

3. The server encodes the response body with gzip. The encoded response has Content-Type, a Content-Length that now reflects the compressed size, and Content-Encoding: gzip. The server then sends this response to the client

4. After the client receives the response, it decodes the body according to Content-Encoding: gzip to obtain the uncompressed response.
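The four steps above can be simulated without a network using the standard gzip module. The function names encode_response and decode_response are hypothetical, and the Accept-Encoding handling is simplified (quality values such as q=0.5 are ignored).

```python
import gzip


def encode_response(body, accept_encoding):
    """Server side (steps 2 and 3): compress the body if the client offered gzip."""
    headers = {"Content-Type": "text/html; charset=utf-8"}
    offered = [e.strip().lower() for e in accept_encoding.split(",")]
    if "gzip" in offered:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    # Content-Length describes the bytes actually sent on the wire.
    headers["Content-Length"] = str(len(body))
    return headers, body


def decode_response(headers, body):
    """Client side (step 4): undo the encoding named in Content-Encoding."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(body)
    return body


original = b"<html>" + b"hello " * 200 + b"</html>"

# Step 1: the client advertises what it can decode.
headers, wire_body = encode_response(original, "gzip, deflate")
decoded = decode_response(headers, wire_body)
print(headers["Content-Encoding"], len(original), len(wire_body))

# A client that does not offer gzip receives the body unchanged.
plain_headers, plain_body = encode_response(original, "identity")
```

Libraries such as requests perform the decoding step automatically, which is why r.text is readable even though the response was sent with Content-Encoding: gzip.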

'Set-Cookie': 'bid=zt-mrsxmmx0; expires=Mon, 23-Oct-17 03:11:40 GMT; domain=.douban.com; path=/, __ads_session=2yy49z4ezghqoiuo9ga=; domain=.douban.com; path=/'

A cookie is a piece of ASCII text that the web server sends to the client. Once a cookie is received, the browser stores its information as a name/value pair. After that, whenever a new document is requested from the same web server, the browser sends back the cookie it stored locally. Cookies were originally created so that a web server could track a client across multiple HTTP requests.
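As a sketch of that round trip, the standard http.cookies module can parse a single Set-Cookie value and rebuild the Cookie header a client would send back. Only the first cookie from the header above is used here, because the combined value that requests shows joins two cookies with a comma, which this simple approach does not split.

```python
from http.cookies import SimpleCookie

# The first cookie from the Set-Cookie header shown in this article.
set_cookie = (
    "bid=zt-mrsxmmx0; expires=Mon, 23-Oct-17 03:11:40 GMT; "
    "domain=.douban.com; path=/"
)

jar = SimpleCookie()
jar.load(set_cookie)          # what the browser stores after the response

morsel = jar["bid"]
print(morsel.value)           # the value half of the name/value pair
print(morsel["domain"], morsel["path"])

# What the browser sends back on the next request to the same site:
# only name=value pairs, without the attributes.
cookie_header = "; ".join(f"{name}={m.value}" for name, m in jar.items())
print("Cookie:", cookie_header)
```

The attributes (expires, domain, path) stay on the client and only decide when and where the cookie is sent; the server sees just the name and value again.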



This article is an English version of an article originally published in Chinese on aliyun.com.
