This article systematically describes HTTP headers in a concise and easy-to-understand manner.
What are HTTP headers?
HTTP stands for "Hypertext Transfer Protocol". This protocol is used throughout the World Wide Web. Most of the content you see in your browser is transmitted over HTTP, including this article.
HTTP headers are the core part of HTTP requests and responses. They carry information about the client browser, the requested page, the server, and so on.
Example
When you type a URL into the address bar of your browser, the browser sends an HTTP request similar to the following:
GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1
Host: net.tutsplus.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: PHPSESSID=r2t5uvjq435r4q7ib3vtdjq120
Pragma: no-cache
Cache-Control: no-cache
The first line is called the "request line". It describes the basic information of the request; the remaining lines are the HTTP headers.
After the request is complete, your browser may receive the following HTTP response:
HTTP/1.x 200 OK
Transfer-Encoding: chunked
Date: Sat, 28 Nov 2009 04:36:25 GMT
Server: LiteSpeed
Connection: close
X-Powered-By: W3 Total Cache/0.8
Pragma: public
Expires: Sat, 28 Nov 2009 05:36:25 GMT
Etag: "pub1259380237;gz"
Cache-Control: max-age=3600, public
Content-Type: text/html; charset=UTF-8
Last-Modified: Sat, 28 Nov 2009 03:50:37 GMT
X-Pingback: http://net.tutsplus.com/xmlrpc.php
Content-Encoding: gzip
Vary: Accept-Encoding, Cookie, User-Agent
<!-- ... rest of the html ... -->
The first line is called the "status line", and it is followed by the HTTP headers. After a blank line, the actual output begins (in this case, some HTML).
When you view the source of a page, you cannot see the HTTP headers, although they were transmitted to the browser together with everything you can see.
The browser also sends and receives HTTP requests for other resources, such as images, CSS files, and JS files.
Let's take a look at the details.
How to view HTTP headers
The following Firefox extensions can help you analyze HTTP headers:
1. Firebug
2. Live HTTP Headers
In PHP:
- getallheaders() returns the request headers. You can also use the $_SERVER array.
- headers_list() returns the response headers.
Later sections of this article show some examples in PHP.
HTTP request structure
The first line, known as the "request line", contains three parts:
- "Method" indicates the type of request. The most common methods are GET, POST, and HEAD.
- "Path" is the part of the URL after the host. For example, when you request "http://net.tutsplus.com/tutorials/other/top-20-mysql-best-practices/", the path is "/tutorials/other/top-20-mysql-best-practices/".
- "Protocol" contains "HTTP" and the version number. Modern browsers use 1.1.
Each of the remaining lines contains a "Name: Value" pair. They carry various information about the request and your browser. For example, "User-Agent" identifies your browser version and the operating system you use, and "Accept-Encoding" tells the server that your browser can accept compressed output such as gzip.
Most of these headers are optional. HTTP requests can even be simplified as follows:
GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1
Host: net.tutsplus.com
And you can still receive valid responses from the server.
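To make this structure concrete, here is a minimal sketch in PHP that splits a raw request like the one above into its request line and headers. The function name parse_http_request is hypothetical, and the sketch assumes well-formed input with one header per line; it is not a complete HTTP parser.

```php
<?php
// Hypothetical helper: split a raw HTTP request into its parts.
// Assumes well-formed input with one "Name: Value" header per line.
function parse_http_request(string $raw): array
{
    // Accept both CRLF and bare LF line endings.
    $lines = preg_split('/\r\n|\n/', trim($raw));

    // The request line has three space-separated parts.
    [$method, $path, $protocol] = explode(' ', array_shift($lines), 3);

    $headers = [];
    foreach ($lines as $line) {
        if ($line === '') {
            break; // a blank line ends the header section
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[trim($name)] = trim($value);
    }

    return compact('method', 'path', 'protocol', 'headers');
}

$request = parse_http_request(
    "GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1\r\n" .
    "Host: net.tutsplus.com\r\n"
);
echo $request['method'], ' ', $request['headers']['Host'], "\n";
```

The value is split on the first colon only, so header values that themselves contain colons (such as URLs) stay intact.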
Request type
The three most common request types are GET, POST, and HEAD. You are probably familiar with the first two from writing HTML forms.
GET: retrieve a document
Most HTML, images, JS, CSS, and so on are requested with the GET method. It is the main method used to retrieve documents.
For example, to retrieve the Nettuts+ article, the first line of the HTTP request looks like this:
GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1
Once the HTML has loaded, the browser sends further GET requests for the images, as shown below:
GET /wp-content/themes/tuts_theme/images/header_bg_tall.png HTTP/1.1
Forms can also be submitted using the GET method. The following is an example:
<form action="foo.php" method="GET">
    First Name: <input name="first_name" type="text" />
    Last Name: <input name="last_name" type="text" />
    <input name="action" type="submit" value="Submit" />
</form>
When the form is submitted, the HTTP request will look like this:
GET /foo.php?first_name=John&last_name=Doe&action=Submit HTTP/1.1
...
Each form input is appended to the query string and sent to the server.
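The query string in that request can be reproduced in PHP with the built-in http_build_query() function, and parse_str() reverses it. This is only a small sketch using the field values from the form example above.

```php
<?php
// Field values taken from the form example above.
$fields = [
    'first_name' => 'John',
    'last_name'  => 'Doe',
    'action'     => 'Submit',
];

// Build the URL-encoded query string that the browser appends to the path.
// The separator is passed explicitly so the result does not depend on php.ini.
$query = http_build_query($fields, '', '&');
$requestLine = 'GET /foo.php?' . $query . ' HTTP/1.1';
echo $requestLine, "\n";

// parse_str() decodes a query string back into an array.
parse_str($query, $decoded);
```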
POST: send data to the server
Although you can use the GET method to append data to the URL and send it to the server, in many cases POST is more appropriate. Sending large amounts of data through GET is impractical, because it is subject to certain limits, such as URL length.
It is common practice to use POST requests to send form data. Let's take the example above and convert it to the POST method:
<form action="foo.php" method="POST">
    First Name: <input name="first_name" type="text" />
    Last Name: <input name="last_name" type="text" />
    <input name="action" type="submit" value="Submit" />
</form>
The following HTTP request is created when you submit this form:
POST /foo.php HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://localhost/test.php
Content-Type: application/x-www-form-urlencoded
Content-Length: 43

first_name=John&last_name=Doe&action=Submit
There are three points to note:
- The path in the first line is now simply /foo.php, with no query string.
- The Content-Type and Content-Length headers have been added; they describe the data being sent.
- All the data is now sent after the headers, in the same format as a query string.
POST requests can also be sent via Ajax, by applications, with cURL, and so on. In addition, all file upload forms are required to use the POST method.
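The relationship between the body and the Content-Length header can be sketched in PHP as follows. The helper name build_post_request is hypothetical, and the sketch only assembles the raw text of a request; it does not send anything.

```php
<?php
// Hypothetical helper: assemble a raw POST request for URL-encoded form data.
function build_post_request(string $host, string $path, array $fields): string
{
    $body = http_build_query($fields, '', '&');

    return "POST $path HTTP/1.1\r\n"
        . "Host: $host\r\n"
        . "Content-Type: application/x-www-form-urlencoded\r\n"
        . 'Content-Length: ' . strlen($body) . "\r\n"
        . "\r\n"   // a blank line separates the headers from the body
        . $body;
}

$raw = build_post_request('localhost', '/foo.php', [
    'first_name' => 'John',
    'last_name'  => 'Doe',
    'action'     => 'Submit',
]);
echo $raw, "\n";
```

Content-Length is the byte length of the body, which is why strlen() is applied to the encoded string rather than to the original field values.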
HEAD: retrieve header information
HEAD is similar to GET, except that the server does not return the content in the HTTP response. Sending a HEAD request means that you are interested only in the HTTP headers, not in the document itself.
With this method, the browser can check whether a page has been modified, for caching purposes. It can also be used to check whether the requested document exists.
For example, if your website has many links, you can simply send HEAD requests to them to check for dead links. This is much faster than using GET.
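A dead-link check along these lines might look like the following PHP sketch. Both function names are hypothetical. It relies on the built-in get_headers() function with a stream context that switches the method to HEAD, which assumes allow_url_fopen is enabled and PHP 7.1 or later for the context argument.

```php
<?php
// Hypothetical helper: extract the numeric status code from a status line
// such as "HTTP/1.1 404 Not Found". Returns null if the line is malformed.
function status_code_from_line(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d(?:\.[\dx])?\s+(\d{3})\b#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: send a HEAD request and report whether the link looks dead.
// Only the first status line is inspected, so a redirect counts as alive.
function is_dead_link(string $url): bool
{
    $context = stream_context_create(['http' => ['method' => 'HEAD']]);
    $headers = @get_headers($url, false, $context);
    if ($headers === false) {
        return true; // no response at all
    }
    $code = status_code_from_line($headers[0]);
    return $code === null || $code >= 400;
}
```

Calling is_dead_link() with each URL on a page would then flag the links that answer with a 4xx or 5xx status, or do not answer at all.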
HTTP response structure
After the browser sends an HTTP request, the server answers with an HTTP response. Leaving the content aside, it looks like the response shown in the example at the beginning of this article: a status line followed by headers.
The first piece of information is the protocol, which servers currently report as HTTP/1.x or HTTP/1.1.
Next comes the status code, followed by a short message. Code 200 means that the request succeeded and the server will return the requested document after the headers.
We have all seen "404" pages. When you request a path that does not exist on the server, the server responds with 404 instead of 200.
The rest of the response consists of headers, just like in an HTTP request. They carry information about the server software, when the page or file was last modified, the MIME type, and so on.
Again, most of these headers are optional.
HTTP status codes
- 2xx codes indicate that the request was successful.
- 3xx codes indicate a redirection.
- 4xx codes indicate a problem with the request.
- 5xx codes indicate a problem with the server.
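Since the class of a status code is determined by its first digit, the grouping above can be sketched in a few lines of PHP. The function name status_class and the label strings are made up for this illustration.

```php
<?php
// Hypothetical helper: map a status code to its general class.
function status_class(int $code): string
{
    switch (intdiv($code, 100)) {
        case 2:  return 'success';
        case 3:  return 'redirection';
        case 4:  return 'client error';
        case 5:  return 'server error';
        default: return 'other';
    }
}

foreach ([200, 206, 301, 404, 500] as $code) {
    echo $code, ': ', status_class($code), "\n";
}
```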
200 OK
As mentioned above, 200 indicates that the request is successful.
206 Partial Content
If an application requests only a certain range of a file, 206 is returned.
This is typically used by download managers, for resumable downloads, or for multipart downloads.
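A client asks for part of a file with a Range request header such as "Range: bytes=500-999". The following PHP sketch shows how a server-side script might interpret such a value. The function name parse_byte_range is hypothetical, and the sketch handles only a single range.

```php
<?php
// Hypothetical helper: parse a single-range value such as "bytes=500-999"
// against a file of $size bytes. Returns [start, end] or null if unsatisfiable.
// Multi-range requests ("bytes=0-99,200-299") are not handled in this sketch.
function parse_byte_range(string $header, int $size): ?array
{
    if (!preg_match('/^bytes=(\d*)-(\d*)$/', trim($header), $m)) {
        return null;
    }
    [, $first, $last] = $m;

    if ($first === '' && $last === '') {
        return null;
    }
    if ($first === '') {
        // Suffix range: the last N bytes of the file.
        $start = max(0, $size - (int) $last);
        $end   = $size - 1;
    } else {
        $start = (int) $first;
        $end   = $last === '' ? $size - 1 : min((int) $last, $size - 1);
    }

    return ($start <= $end && $start < $size) ? [$start, $end] : null;
}

$range = parse_byte_range('bytes=500-999', 1000);
if ($range !== null) {
    // A 206 response describes the returned slice in a Content-Range header.
    printf("Content-Range: bytes %d-%d/%d\n", $range[0], $range[1], 1000);
}
```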
404 Not Found
This one is self-explanatory: the requested resource could not be found.
401 Unauthorized
Password-protected pages return this status. If you do not enter the correct credentials, the browser shows an authorization error page.
Note that this applies only to HTTP password-protected pages, the kind that make the browser pop up a login prompt.
403 Forbidden
If you do not have permission to access a page, 403 is returned. This usually happens when you try to open a folder that has no index page: if the server is not configured to list directory contents, you will see a 403 error.
There are other ways to restrict access as well. For example, you can block visitors by IP address with the help of an .htaccess file:
order allow,deny
deny from 192.168.44.201
deny from 224.39.163.12
deny from 172.16.7.92
allow from all
302 (or 307) Moved Temporarily and 301 Moved Permanently
These two statuses are used for browser redirection. For example, this happens when you use a URL shortening service such as bit.ly. It is also how such services know who clicked on their links.
Browsers handle 302 and 301 very similarly, but the two mean different things to search engine crawlers. For example, if your website is down for maintenance, you may redirect browsers to another address with a 302. The search engine crawler will continue to check your page in the future. If you redirect with a 301, however, you are telling the crawler that your website has moved to the new address permanently.
500 Internal Server Error
This code usually appears when a page script crashes. Unlike PHP, most CGI scripts do not output error messages directly to the browser. If a fatal error occurs, they simply send a 500 status code, and you then need to check the server error logs to troubleshoot the problem.
Complete list
You can find the complete HTTP status code description here.
HTTP headers in HTTP requests
Now let's look at some of the most common headers found in HTTP requests.
These headers can be found in PHP's $_SERVER array. You can also use the getallheaders() function to retrieve all headers at once.
Host
An HTTP request is sent to a specific IP address, but most servers can host multiple websites under the same IP address. The server therefore needs to know which domain name the browser is requesting resources from.
Host: rlog.cn
This is only the basic host name, including the domain name and sub-domain name.
In PHP, you can read it from $_SERVER['HTTP_HOST'] or $_SERVER['SERVER_NAME'].
User-Agent
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
This header can carry the following information:
- The browser name and version number.
- The operating system name and version number.
- Default language.
This is a common way for websites to collect information about their visitors. For example, you can detect whether a visitor is using a mobile phone and then decide whether to redirect them to a mobile version of the site that works well at low resolutions.
In PHP, you can get the User-Agent from $_SERVER['HTTP_USER_AGENT'].
if (strstr($_SERVER['HTTP_USER_AGENT'], 'MSIE 6')) {
    echo "Please stop using IE6!";
}
Accept-Language
Accept-Language: en-us,en;q=0.5
This header indicates the user's default language settings. If a website offers different language versions, it can use this information to redirect the browser.
It can carry multiple languages, separated by commas. The first one is the preferred language, and each of the others can carry a "q" value between 0 and 1 that indicates the user's preference for that language.
In PHP, this value is available in $_SERVER['HTTP_ACCEPT_LANGUAGE'].
if (substr($_SERVER['HTTP_ACCEPT_LANGUAGE'], 0, 2) == 'fr') {
    header('Location: http://french.mydomain.com');
}
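The snippet above looks only at the first two characters of the header. If you want to take the "q" values into account, a sketch like the following could be used. The function name parse_accept_language is hypothetical, and the parsing is deliberately simplified.

```php
<?php
// Hypothetical helper: return language tags ordered by preference.
function parse_accept_language(string $header): array
{
    $weights = [];
    foreach (explode(',', $header) as $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '') {
            continue;
        }
        $q = 1.0; // a missing q value means full preference
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^\s*q=([0-9.]+)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $weights[$tag] = $q;
    }
    // Highest q first. The order of tags with equal q is only
    // guaranteed to be preserved on PHP 8.0 and later.
    arsort($weights);
    return array_keys($weights);
}

print_r(parse_accept_language('en-us,en;q=0.5'));
```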
Accept-Encoding
Accept-Encoding: gzip,deflate
Most modern browsers support gzip compression and announce this to the server. The server can then send the HTML compressed, which can reduce its size by up to 80% and saves download time and bandwidth.
In PHP, this value is available in $_SERVER['HTTP_ACCEPT_ENCODING']. The ob_gzhandler() callback checks it automatically, so you do not need to inspect it yourself.
// enables output buffering
// and all output is compressed if the browser supports it
ob_start('ob_gzhandler');
If-Modified-Since
If a page is already cached in your browser, the next time you visit it the browser checks whether the document has been modified by sending a header like this:
If-Modified-Since: Sat, 28 Nov 2009 06:38:19 GMT
If the document has not been modified since that time, the server returns "304 Not Modified" with no content, and the browser loads the content from its cache.
In PHP, you can check for it with $_SERVER['HTTP_IF_MODIFIED_SINCE'].
// assume $last_modify_time is the time the output was last updated
// did the browser send an If-Modified-Since header?
if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])) {
    // if the browser's cached copy matches the modify time
    if ($last_modify_time == strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE'])) {
        // send a 304 header, and no content
        header("HTTP/1.1 304 Not Modified");
        exit;
    }
}
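The comparison itself can be isolated in a small function, which makes it easier to test without a web server. The following is a sketch under a few assumptions: the name is_not_modified is hypothetical, and it compares with <= rather than the == used in the snippet above, so a cached copy newer than the last modification also counts as fresh.

```php
<?php
// Hypothetical helper: decide whether a 304 response is appropriate.
// $ifModifiedSince is the raw header value, or null if it was not sent.
// $lastModified is a Unix timestamp of the last change to the output.
function is_not_modified(?string $ifModifiedSince, int $lastModified): bool
{
    if ($ifModifiedSince === null) {
        return false; // no conditional header was sent
    }
    $since = strtotime($ifModifiedSince);
    if ($since === false) {
        return false; // unparseable date: send the full response
    }
    // The cached copy is still valid if nothing changed after that time.
    return $lastModified <= $since;
}

// 1259380237 corresponds to Sat, 28 Nov 2009 03:50:37 GMT.
var_dump(is_not_modified('Sat, 28 Nov 2009 06:38:19 GMT', 1259380237));
```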
There is also an HTTP header called ETag, which can be used to check whether the cached copy is still current. We will explain it later.
Cookie
As the name suggests, it will send the cookie information stored in your browser to the server.
Cookie: PHPSESSID=r2t5uvjq435r4q7ib3vtdjq120; foo=bar
It is a group of name-value pairs separated by semicolons. The cookie can also contain the session ID.
In PHP, individual cookies are available in the $_COOKIE array. Session variables can be accessed directly through the $_SESSION array, and if you need the session ID, you can call session_id() instead of reading the cookie.
echo $_COOKIE['foo'];
// output: bar
echo $_COOKIE['PHPSESSID'];
// output: r2t5uvjq435r4q7ib3vtdjq120
session_start();
echo session_id();
// output: r2t5uvjq435r4q7ib3vtdjq120
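PHP fills $_COOKIE for you, but the format of the header is simple enough to parse by hand. The following sketch does so with a hypothetical function named parse_cookie_header, which can be handy when working with raw header strings, for example in logs.

```php
<?php
// Hypothetical helper: split a raw Cookie header value into name => value pairs.
function parse_cookie_header(string $header): array
{
    $cookies = [];
    foreach (explode(';', $header) as $pair) {
        // Split on the first "=" only, since values may contain "=" themselves.
        $parts = explode('=', trim($pair), 2);
        if (count($parts) === 2 && $parts[0] !== '') {
            $cookies[$parts[0]] = urldecode($parts[1]);
        }
    }
    return $cookies;
}

print_r(parse_cookie_header('PHPSESSID=r2t5uvjq435r4q7ib3vtdjq120; foo=bar'));
```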
Referer
As the name suggests, this header contains the referring URL.
For example, if I visit the Nettuts+ homepage and click a link, the browser sends this header:
Referer: http://net.tutsplus.com/
In PHP, you can get this value from $_SERVER['HTTP_REFERER'].
if (isset($_SERVER['HTTP_REFERER'])) {
    $url_info = parse_url($_SERVER['HTTP_REFERER']);
    // is the surfer coming from Google?
    if ($url_info['host'] == 'www.google.com') {
        parse_str($url_info['query'], $vars);
        echo "You searched on Google for this keyword: " . $vars['q'];
    }
}
// if the referring url was:
// http://www.google.com/search?source=ig&hl=en&rlz=&=&q=http+headers&aq=f&oq=&aqi=g-p1g9
// the output will be:
// You searched on Google for this keyword: http headers
You may have noticed that the word "referrer" is misspelled as "Referer". Unfortunately, it made it into the official HTTP specification that way and stuck.
Authorization
When a page requires authorization, the browser pops up a login window. After you enter your username and password, the browser sends another HTTP request, this time including a header like this:
Authorization: Basic bXl1c2VyOm15cGFzcw==
The data in the header is base64 encoded. For example, base64_decode('bXl1c2VyOm15cGFzcw==') returns 'myuser:mypass'.
In PHP, these values are available in $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'].
For more details, refer to the WWW-Authenticate section.
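To show how the header value breaks down, here is a small PHP sketch that decodes it manually. The function name parse_basic_auth is hypothetical; in a real PHP application the $_SERVER values mentioned above are normally used instead.

```php
<?php
// Hypothetical helper: decode the value of an "Authorization: Basic ..." header.
// Returns null for other schemes or malformed input.
function parse_basic_auth(string $value): ?array
{
    if (!preg_match('/^Basic\s+(\S+)$/i', trim($value), $m)) {
        return null;
    }
    $decoded = base64_decode($m[1], true); // strict mode rejects invalid input
    if ($decoded === false || strpos($decoded, ':') === false) {
        return null;
    }
    // Only the first colon separates the user name from the password.
    [$user, $password] = explode(':', $decoded, 2);
    return ['user' => $user, 'password' => $password];
}

print_r(parse_basic_auth('Basic bXl1c2VyOm15cGFzcw=='));
```

Because base64 is an encoding and not encryption, anyone who can read the request can recover the password, so Basic authentication is best combined with HTTPS.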
HTTP headers in HTTP responses
Now let's look at some of the most common headers found in HTTP responses.
In PHP, you can set response headers with the header() function. PHP already sends certain headers automatically, for example when loading the content or setting cookies. You can use the headers_list() function to see the headers that have been sent or are about to be sent, and the headers_sent() function to check whether the headers have already been sent.
Cache-Control
w3.org defines it as follows: "The Cache-Control general-header field is used to specify directives which must be obeyed by all caching mechanisms along the request/response chain." These "caching mechanisms" include gateways and proxies that your ISP may be using.
For example:
Cache-Control: max-age=3600, public
"public" means that the response may be cached by anyone. "max-age" specifies how many seconds the cached copy remains valid. Allowing your website to be cached reduces download time and bandwidth use and makes pages load faster in the browser.
Caching can also be prevented with the "no-cache" directive (strictly speaking, it requires a cache to revalidate the response with the server before reusing it):
Cache-Control: no-cache
For more details, see w3.org.
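The directives in this header are a comma-separated list, some with values and some without. A rough PHP sketch for reading them follows. The function name parse_cache_control is hypothetical, and quoted values that contain commas are not handled.

```php
<?php
// Hypothetical helper: parse a Cache-Control value into directive => value.
// Directives without a value (such as "public") map to true.
function parse_cache_control(string $value): array
{
    $directives = [];
    foreach (explode(',', $value) as $part) {
        $part = trim($part);
        if ($part === '') {
            continue;
        }
        $pieces = explode('=', $part, 2);
        $name = strtolower($pieces[0]);
        $directives[$name] = isset($pieces[1]) ? trim($pieces[1], '"') : true;
    }
    return $directives;
}

print_r(parse_cache_control('max-age=3600, public'));
```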
Content-Type
This header contains the "MIME type" of the document. The browser uses it to decide how to interpret the content. For example, an HTML page (or a PHP script with HTML output) may return something like this:
Content-Type: text/html; charset=UTF-8
"text" is the type and "html" is the subtype of the document. The header can also carry additional information, such as the charset.
If it is an image, a response like this will be sent:
Content-Type: image/gif
The browser can use the MIME type to decide whether to open the document with an external program or with one of its own extensions. The following example results in Adobe Reader being called:
Content-Type: application/pdf
When a file is requested directly, Apache can usually detect its MIME type and add the appropriate header. Most browsers also have some fault tolerance: if the header is missing or incorrect, they try to detect the MIME type themselves.
You can find a list of common mime-types here.
In PHP, you can use the finfo_file() function to detect the MIME type of a file.
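The type, subtype, and charset parts of a Content-Type value can be separated with a few string functions. The following is a sketch with a hypothetical function name, parse_content_type, and it ignores some edge cases such as quoted parameter values that contain semicolons.

```php
<?php
// Hypothetical helper: split a Content-Type value into type, subtype, and parameters.
function parse_content_type(string $value): array
{
    $parts = array_map('trim', explode(';', $value));
    // The first part is "type/subtype"; the rest are "name=value" parameters.
    $pieces = explode('/', strtolower(array_shift($parts)), 2);

    $params = [];
    foreach ($parts as $param) {
        $kv = explode('=', $param, 2);
        if (count($kv) === 2) {
            $params[strtolower(trim($kv[0]))] = trim($kv[1], " \"");
        }
    }

    return [
        'type'    => $pieces[0],
        'subtype' => $pieces[1] ?? '',
        'params'  => $params,
    ];
}

print_r(parse_content_type('text/html; charset=UTF-8'));
```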
Content-Disposition
This header tells the browser to open a file download window, instead of trying to parse the response content. For example:
Content-Disposition: attachment; filename="download.zip"
This causes the browser to display a file download dialog.
Note that the appropriate Content-Type header should be sent along with it:
Content-Type: application/zip
Content-Disposition: attachment; filename="download.zip"
Content-Length
When content is about to be transmitted to the browser, the server can use this header to indicate its size in bytes.
Content-Length: 89123
This information is especially useful for file downloads: it is how the browser knows the progress of a download.
For example, here is a dummy script that simulates a slow download:
// it's a zip file
header('Content-Type: application/zip');
// 1 million bytes (about 1 megabyte)
header('Content-Length: 1000000');
// load a download dialogue, and save it as download.zip
header('Content-Disposition: attachment; filename="download.zip"');
// 1000 times 1000 bytes of data
for ($i = 0; $i < 1000; $i++) {
    echo str_repeat(".", 1000);
    // sleep to slow down the download
    usleep(50000);
}
With the Content-Length header present, the browser knows the total size and can show the progress of the download.
Now let's comment out the Content-Length header:
// it's a zip file
header('Content-Type: application/zip');
// the browser won't know the size
// header('Content-Length: 1000000');
// load a download dialogue, and save it as download.zip
header('Content-Disposition: attachment; filename="download.zip"');
// 1000 times 1000 bytes of data
for ($i = 0; $i < 1000; $i++) {
    echo str_repeat(".", 1000);
    // sleep to slow down the download
    usleep(50000);
}
Now the browser can only tell you how much has been downloaded, not how large the download is in total, and the progress bar cannot show any progress.
Etag
This is another header used for caching. It looks like this:
Etag: "pub1259380237;gz"
The server may send this header with every file it serves. The value can be based on the last modification date, the file size, or a checksum of the file. The browser caches it together with the received document. The next time the browser requests the same file, it sends the following header in the HTTP request:
If-None-Match: "pub1259380237;gz"
If the ETag value of the requested document is still the same, the server sends a 304 status code instead of 200 and returns no content. The browser then loads the file from its cache.
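On the server side, this check comes down to comparing the If-None-Match value with the current ETag. A minimal PHP sketch follows. The function name etag_matches is hypothetical, and splitting on commas assumes that the ETag values themselves do not contain commas.

```php
<?php
// Hypothetical helper: check an If-None-Match value against the current ETag.
function etag_matches(string $ifNoneMatch, string $currentEtag): bool
{
    if (trim($ifNoneMatch) === '*') {
        return true; // "*" matches any existing representation
    }
    // Weak validators ("W/" prefix) are compared by their quoted part only.
    $normalize = function (string $tag): string {
        return preg_replace('#^W/#', '', trim($tag));
    };
    // The header may list several ETags separated by commas.
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        if ($normalize($candidate) === $normalize($currentEtag)) {
            return true;
        }
    }
    return false;
}

var_dump(etag_matches('"pub1259380237;gz"', '"pub1259380237;gz"'));
```

When the function returns true, a script could answer with header("HTTP/1.1 304 Not Modified") and exit, just as in the If-Modified-Since example.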
Last-Modified
As the name suggests, this header indicates the last modification time of the document in GMT format:
Last-Modified: Sat, 28 Nov 2009 03:50:37 GMT
$modify_time = filemtime($file);
header("Last-Modified: " . gmdate("D, d M Y H:i:s", $modify_time) . " GMT");
It provides another caching mechanism. The browser may send such a request:
If-Modified-Since: Sat, 28 Nov 2009 06:38:19 GMT
We already discussed this in the If-Modified-Since section.
Location
This header is used for redirection. If the response code is 301 or 302, the server must also send this header. For example, when you visit http://www.nettuts.com, the browser receives the following response:
HTTP/1.x 301 Moved Permanently
...
Location: http://net.tutsplus.com/
...
In PHP, you can redirect visitors like this:
header('Location: http://net.tutsplus.com/');
Status code 302 is sent by default. If you want to send status code 301, write it as follows:
header('Location: http://net.tutsplus.com/', true, 301);
Set-Cookie
When a website needs to set or update its cookie information, it uses the following header:
Set-Cookie: skin=noskin; path=/; domain=.amazon.com; expires=Sun, 29-Nov-2009 21:42:28 GMT
Set-Cookie: session-id=120-7333518-8165026; path=/; domain=.amazon.com; expires=Sat Feb 27 08:00:00 2010 GMT
Each cookie is sent as a separate header. Note that cookies set through JavaScript do not appear in the HTTP headers.
In PHP, you can set cookies with the setcookie() function, and PHP sends the appropriate HTTP headers.
setcookie("TestCookie", "foobar");
It sends the following header information:
Set-Cookie: TestCookie=foobar
If no expiration time is specified, the cookie will be deleted after the browser is closed.
WWW-Authenticate
A website may send this header to authenticate a user. When the browser sees this header in the response, it opens a login pop-up window.
WWW-Authenticate: Basic realm="Restricted Area"
The PHP manual contains a simple piece of code that demonstrates how to do this in PHP:
if (!isset($_SERVER['PHP_AUTH_USER'])) {
    header('WWW-Authenticate: Basic realm="My Realm"');
    header('HTTP/1.0 401 Unauthorized');
    echo 'Text to send if user hits Cancel button';
    exit;
} else {
    echo "<p>Hello {$_SERVER['PHP_AUTH_USER']}.</p>";
    echo "<p>You entered {$_SERVER['PHP_AUTH_PW']} as your password.</p>";
}
Content-Encoding
This header is usually set when the returned content is compressed.
Content-Encoding: gzip
In PHP, this header is set automatically when you use the ob_gzhandler() callback.
This entry was posted on Thursday, December 3rd, 2009.