In-depth understanding of cookies

Source: Internet
Author: User

HTTP cookies, usually just called "cookies", have been around for a long time but are still not fully understood. The first problem is a set of misconceptions, ranging from the belief that cookies are backdoors or viruses to simply not knowing how they work. The second problem is the lack of a consistent interface for working with cookies. Despite these problems, cookies are so important to web development that, were they to disappear without a replacement, many of our favorite web applications would become useless.

The origin of Cookies

One of the biggest problems in early web development was how to manage state. In short, the server had no way of knowing whether two requests came from the same browser. The approach at the time was to insert a token into the page when it was requested and have that token passed back to the server with the next request. This meant either using a form with a hidden field containing the token or passing the token in the URL's query string. Both approaches were manual and extremely error-prone.

Lou Montulli, then an employee of Netscape Communications, is credited with applying the concept of "magic cookies" to web communications in 1994. The problem he was trying to solve was the web shopping cart, and today every shopping site depends on one. His original specification provides some basic information on how cookies work. It was later formalized in RFC 2109 (the reference browsers follow when implementing cookies) and eventually in RFC 2965. Montulli was later granted a US patent for cookies. Netscape's browser supported cookies from its first version, and all current web browsers support them.

What is a cookie?

Quite simply, a cookie is a small text file stored by the browser on the user's machine. Cookies are plain text; they contain no executable code. A web page or server instructs the browser to store this information and then send it back with each subsequent request, according to a set of rules. The web server can then use this information to identify individual users. Most sites that require a login set a cookie once your credentials have been validated, and you are then free to browse all parts of the site as long as that cookie is present and valid. Once again, a cookie contains only data and is not harmful in itself.

Create a cookie

A web server specifies a cookie to be stored by sending an HTTP header called Set-Cookie. The format of the Set-Cookie header is the following string (the parts in square brackets are optional):

Set-Cookie: value[; expires=date][; domain=domain][; path=path][; secure]

The first part of the header, the value, is typically a string in the format name=value. Indeed, the original specification indicates that this is the format to use, but browsers do not validate cookie values against it. You can, in fact, specify a string without an equals sign and it will be stored just the same. Still, the usual practice is to specify the cookie value in name=value format (and most interfaces support only that format).
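As a rough illustration of the format above, the following sketch assembles a Set-Cookie header value from its parts. The function name buildSetCookie and the shape of its options object are assumptions made for this example; they do not come from any library or specification.

```javascript
// Hypothetical helper (not from any library): assembles a Set-Cookie header value.
function buildSetCookie(name, value, options = {}) {
  // The name=value pair always comes first.
  const parts = [`${name}=${value}`];
  // Each optional part is appended only when the caller supplied it.
  if (options.expires instanceof Date) {
    parts.push(`expires=${options.expires.toUTCString()}`);
  }
  if (options.domain) parts.push(`domain=${options.domain}`);
  if (options.path) parts.push(`path=${options.path}`);
  if (options.secure) parts.push("secure");
  // Options are separated by a semicolon and a space.
  return parts.join("; ");
}

const header = buildSetCookie("name", "Nicholas", {
  domain: "nczonline.net",
  path: "/blog",
});
console.log(`Set-Cookie: ${header}`);
// prints "Set-Cookie: name=Nicholas; domain=nczonline.net; path=/blog"
```

The sketch does no encoding or validation of the name and value; it only shows how the pieces are joined.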

When a cookie is present and the optional rules allow, the cookie value is sent to the server with each subsequent request. The value is sent in an HTTP header called Cookie that contains only the cookie value, with all of the options stripped off. For example:

Cookie: value

The options specified with Set-Cookie are for the browser's use only and cannot be retrieved by the server once they have been set. The cookie value is exactly the string that was specified with Set-Cookie; the browser does not parse or re-encode it. If several cookies apply to a given request, they are separated by a semicolon and a space, for example:

Cookie: value1; value2; name1=value1

Server-side frameworks typically provide functionality to parse cookies and make their values available programmatically.
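For readers curious what such parsing might look like, here is a minimal sketch that splits a Cookie header value into its individual cookies. The function name parseCookieHeader is hypothetical, and the sketch returns a list rather than a map because, as the example header shows, a value may have no name and names may repeat.

```javascript
// Hypothetical parser for the value of a Cookie request header.
function parseCookieHeader(headerValue) {
  const cookies = [];
  for (const piece of headerValue.split(";")) {
    const text = piece.trim();
    if (text === "") continue; // tolerate stray separators
    const eq = text.indexOf("=");
    if (eq === -1) {
      // No equals sign: the whole piece is treated as a value with no name.
      cookies.push({ name: "", value: text });
    } else {
      // Split on the first equals sign only, so values may contain "=".
      cookies.push({ name: text.slice(0, eq), value: text.slice(eq + 1) });
    }
  }
  return cookies;
}

console.log(parseCookieHeader("value1; value2; name1=value1"));
```

Real frameworks usually also decode the values; that step is left out here to keep the splitting logic visible.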

Cookie encoding

There is some confusion over the encoding of cookie values. The common belief is that cookie values must be URL-encoded, but this is a fallacy, even though URL-encoding a value is perfectly acceptable. The original specification indicates that only three kinds of characters must be encoded: semicolons, commas, and white space. It mentions that URL encoding may be used but stops short of requiring it. The RFC makes no mention of encoding at all. Even so, almost all implementations apply some form of URL encoding to cookie values. For the name=value format, the name and the value are usually encoded separately and the equals sign is left as is.
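A minimal sketch of that common practice, using the standard encodeURIComponent and decodeURIComponent functions: the name and value are encoded separately and the equals sign between them is left alone. The helper names are hypothetical, and decodeCookiePair assumes its input contains an equals sign.

```javascript
// Encode name and value separately; the "=" between them is not encoded.
function encodeCookiePair(name, value) {
  return `${encodeURIComponent(name)}=${encodeURIComponent(value)}`;
}

// Reverse the encoding. Assumes the pair contains at least one "=".
function decodeCookiePair(pair) {
  const eq = pair.indexOf("=");
  return {
    name: decodeURIComponent(pair.slice(0, eq)),
    value: decodeURIComponent(pair.slice(eq + 1)),
  };
}

console.log(encodeCookiePair("user name", "a;b, c"));
// prints "user%20name=a%3Bb%2C%20c"
```

Semicolons, commas, and spaces all come out escaped, which covers the three character types the original specification calls out.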

The expires option

Each option that follows the cookie value is separated by a semicolon and a space, and each one specifies a rule about when the cookie should be sent to the server. The first option is expires, which indicates when the cookie should no longer be sent to the server and may therefore be deleted by the browser. Its value is a date in the format Wdy, DD-Mon-YYYY HH:MM:SS GMT, for example:

Set-Cookie: name=Nicholas; expires=Sat, 02 May 2009 23:38:25 GMT

Without the expires option, a cookie lasts for a single session only. A session ends when the browser is closed, so a session cookie exists only while the browser remains open. This is why you often see a checkbox when signing in to a web application, asking whether your login information should be saved: if you choose yes, an expires option is appended to the login cookie. If the expires option is set to a date in the past, the cookie is deleted immediately.
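The "remember me" behavior described above could be sketched roughly as follows. The cookie name login, the 30-day lifetime, and both function names are assumptions chosen for the example; Date.prototype.toUTCString produces a date in the form used by the header example above.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Build an expires option a given number of days after `now`.
function expiresOption(daysFromNow, now = new Date()) {
  const when = new Date(now.getTime() + daysFromNow * DAY_MS);
  return `expires=${when.toUTCString()}`;
}

// Hypothetical login cookie: session-only unless the user asked to be remembered.
function loginCookie(token, rememberMe, now = new Date()) {
  const base = `login=${token}`;
  // Assumed lifetime of 30 days for a remembered login.
  return rememberMe ? `${base}; ${expiresOption(30, now)}` : base;
}

console.log(loginCookie("abc123", false)); // prints "login=abc123"
console.log(loginCookie("abc123", true));  // same pair plus an expires date 30 days out
```

Passing a negative number of days to expiresOption yields a date in the past, which is how a cookie is deleted.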

The domain option

The next option is domain, which indicates the domain or domains to which the cookie should be sent. By default, domain is set to the host name of the page that set the cookie. For example, the default domain for a cookie set on this site would be www.nczonline.net. The domain option is used to widen the set of domains to which a cookie is sent. For example:

Set-Cookie: name=Nicholas; domain=nczonline.net

Consider a large network such as Yahoo! that has many sites in the form name.yahoo.com (for example, my.yahoo.com and finance.yahoo.com). A single cookie can be sent to all of these sites by setting its domain option to yahoo.com. The browser performs a tail comparison (that is, it compares from the end of the string) between the domain value and the host name to which the request is being sent, and sends the Cookie header when they match.

The value of the domain option must be part of the host name that sends the Set-Cookie header. I couldn't, for example, set a cookie on google.com, because that would introduce a security issue. Invalid domain options are simply ignored.
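The tail comparison can be sketched in a few lines. The function name domainMatches is hypothetical. One detail goes beyond what the text says and is an assumption of this sketch: the match is required to fall on a dot boundary, so that a host such as notyahoo.com does not match yahoo.com.

```javascript
// Hypothetical tail comparison between a request host and a cookie's domain option.
function domainMatches(requestHost, cookieDomain) {
  const host = requestHost.toLowerCase();
  // Tolerate an older-style leading dot, as in ".yahoo.com".
  const domain = cookieDomain.toLowerCase().replace(/^\./, "");
  if (host === domain) return true;
  // Assumption: require a dot boundary, so "notyahoo.com" does not match "yahoo.com".
  return host.endsWith("." + domain);
}

console.log(domainMatches("my.yahoo.com", "yahoo.com"));      // prints true
console.log(domainMatches("finance.yahoo.com", "yahoo.com")); // prints true
console.log(domainMatches("www.google.com", "yahoo.com"));    // prints false
```

The same check, applied to the host that sends the Set-Cookie header, is one way to decide whether a requested domain option is valid or should be ignored.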

The path option

Another way to control when the Cookie header is sent is the path option. Similar to the domain option, path indicates a URL path that must exist in the requested resource before the Cookie header is sent. The comparison is made by matching the path value character by character against the start of the requested URL's path. If the characters match, the Cookie header is sent. For example:

Set-Cookie: name=Nicholas; path=/blog

In this example, the path option matches /blog, /blogroll, and so on; any path that begins with /blog is valid. Note that the path is compared only after the domain option has been verified. The default value of the path option is the path of the URL that sent the Set-Cookie header.
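The prefix comparison described here, applied only after the domain check, might look like the sketch below. Both function names and the shape of the cookie object are assumptions for illustration. The sketch follows the plain prefix rule as the article states it; later specifications such as RFC 6265 tighten the rule so that a match must end on a "/" boundary.

```javascript
// Plain prefix comparison, as described in the article.
function pathMatches(requestPath, cookiePath) {
  return requestPath.startsWith(cookiePath);
}

// Hypothetical check: the domain is verified first, and only then the path.
function cookieApplies(cookie, requestHost, requestPath) {
  const domainOk =
    requestHost === cookie.domain || requestHost.endsWith("." + cookie.domain);
  return domainOk && pathMatches(requestPath, cookie.path);
}

const blogCookie = { domain: "nczonline.net", path: "/blog" };
console.log(cookieApplies(blogCookie, "www.nczonline.net", "/blog/2009")); // prints true
console.log(cookieApplies(blogCookie, "www.nczonline.net", "/blogroll"));  // prints true
console.log(cookieApplies(blogCookie, "www.nczonline.net", "/about"));     // prints false
```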

The secure option

The last option is secure. Unlike the other options, it is just a flag and takes no value. A secure cookie is sent to the server only when the request is made using SSL and the HTTPS protocol. The idea is that the contents of the cookie are of high value and could be damaging if transmitted as clear text. For example:

Set-Cookie: name=Nicholas; secure

In reality, confidential or sensitive information should never be stored or transmitted in cookies, because the entire mechanism is inherently insecure. Note also that a cookie set over an HTTPS connection is not marked secure automatically; the secure flag has to be included explicitly in the Set-Cookie header.
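The rule for the secure flag reduces to a single check on the request's protocol. The sketch below uses the standard URL class; the function name mayTransmit and the cookie object shape are assumptions.

```javascript
// Hypothetical check: a secure cookie may only travel over HTTPS.
function mayTransmit(cookie, requestUrl) {
  const isHttps = new URL(requestUrl).protocol === "https:";
  // Cookies without the flag are sent over either protocol.
  return !cookie.secure || isHttps;
}

console.log(mayTransmit({ secure: true }, "http://www.nczonline.net/"));  // prints false
console.log(mayTransmit({ secure: true }, "https://www.nczonline.net/")); // prints true
```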

Cookie maintenance and lifecycle

Any number of options can be specified for a single cookie, and they may appear in any order. For example:

Set-Cookie: name=Nicholas; domain=nczonline.net; path=/blog

This cookie has four identifying characteristics: the cookie name, the domain, the path, and the secure flag. To change the value of this cookie later, another Set-Cookie header must be sent with the same cookie name, domain, and path. For example:

Set-Cookie: name=Greg; domain=nczonline.net; path=/blog

This overwrites the original cookie's value with the new one. Changing even one of these options, however, creates a completely different cookie, such as:

Set-Cookie: name=Nicholas; domain=nczonline.net; path=/

After this header is returned, there are two different cookies named "name". If you then visit a page under www.nczonline.net/blog, the following header is included in the request:

Cookie: name=Greg; name=Nicholas

There are two cookies named "name" in this header, and the one with the more specific path comes first. The cookie string is always ordered from the most specific domain-path combination to the least specific. Suppose I'm at www.nczonline.net/blog and set another cookie with default settings:

Set-Cookie: name=Mike

The returned header now becomes:

Cookie: name=Mike; name=Greg; name=Nicholas

Since the cookie with the value "Mike" uses the host name (www.nczonline.net) as its domain and the full path (/blog) as its path, it is more specific than the other two.
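The identity and ordering rules in this section can be modelled with a small in-memory cookie jar. This is only a sketch under stated assumptions: the function names are hypothetical, specificity is approximated by path length and then domain length, and every stored cookie is assumed to apply to the request (no domain or path filtering is done).

```javascript
// A cookie is identified by name, domain, path, and the secure flag.
function cookieKey(c) {
  return [c.name, c.domain, c.path, c.secure ? "secure" : ""].join("|");
}

// Storing a cookie with an existing key overwrites it; any other key adds a new cookie.
function storeCookie(jar, cookie) {
  jar.set(cookieKey(cookie), cookie);
}

// Assumption: "more specific" means a longer path, then a longer domain.
function buildCookieHeader(jar) {
  const sorted = [...jar.values()].sort(
    (a, b) => b.path.length - a.path.length || b.domain.length - a.domain.length
  );
  return sorted.map((c) => `${c.name}=${c.value}`).join("; ");
}

const jar = new Map();
storeCookie(jar, { name: "name", value: "Nicholas", domain: "nczonline.net", path: "/blog" });
storeCookie(jar, { name: "name", value: "Greg", domain: "nczonline.net", path: "/blog" }); // overwrites
storeCookie(jar, { name: "name", value: "Nicholas", domain: "nczonline.net", path: "/" });
storeCookie(jar, { name: "name", value: "Mike", domain: "www.nczonline.net", path: "/blog" });

console.log(`Cookie: ${buildCookieHeader(jar)}`);
// prints "Cookie: name=Mike; name=Greg; name=Nicholas"
```

Four Set-Cookie headers produce only three stored cookies, because the second one shares its identity with the first.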

Using expiration dates

When a cookie is created with an expiration date, that date is associated with the cookie identified by name-domain-path-secure. To change the expiration date of a cookie, you must specify exactly the same combination. When changing a cookie's value, you need not set the expiration date each time, because it is not part of the cookie's identifying information. For example:

Set-Cookie: name=Mike; expires=Sat, 03 May 2025 17:44:22 GMT

The expiration date of the cookie has now been set, so the next time I want to change the value of the cookie, I only need to use its name:

Set-Cookie: name=Matt

The expiration date of this cookie hasn't changed, since the identifying characteristics of the cookie are the same. In fact, the expiration date won't change until you change it explicitly. This means that a session cookie can become a persistent cookie (one that lasts across multiple sessions) within the same session, but the reverse isn't true. To turn a persistent cookie into a session cookie, you must delete the persistent cookie by setting its expiration date to a time in the past and then create a session cookie with the same name.
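The delete-then-recreate step might be sketched as follows. The function names are hypothetical, and the Unix epoch is used as the date in the past simply because it is easy to produce; any past date would do. The deletion header must carry the same name, domain, and path as the cookie being removed.

```javascript
// A date safely in the past: "Thu, 01 Jan 1970 00:00:00 GMT".
const PAST = new Date(0).toUTCString();

// Header value that deletes the cookie with this name, domain, and path.
function deletionHeader(name, domain, path) {
  const parts = [`${name}=`, `expires=${PAST}`];
  if (domain) parts.push(`domain=${domain}`);
  if (path) parts.push(`path=${path}`);
  return parts.join("; ");
}

// Two headers: delete the persistent cookie, then recreate it with no expires option.
function toSessionCookie(name, value, domain, path) {
  const fresh = [`${name}=${value}`];
  if (domain) fresh.push(`domain=${domain}`);
  if (path) fresh.push(`path=${path}`);
  return [deletionHeader(name, domain, path), fresh.join("; ")];
}

for (const value of toSessionCookie("name", "Matt")) {
  console.log(`Set-Cookie: ${value}`);
}
```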

Keep in mind that the expiration date is checked against the system time of the computer running the browser. There is no way to verify that this system time is in sync with the server's time, so errors can occur when the two differ.

Automatic cookie removal

Cookies are removed automatically by the browser for several reasons:

    • Session cookies are removed when the session ends (the browser is closed).
    • Persistent cookies are removed when their expiration date is reached.
    • If the browser's cookie limit is reached, cookies are removed to make room for newly created ones. See my other post on cookie restrictions.

Cookie management is important because any of these automatic removals can delete a cookie unintentionally.

Cookie restrictions

A number of restrictions are placed on cookies to prevent abuse and to protect both the browser and the server from detrimental effects. There are two types of restrictions: the number of cookies and the total cookie size. The original specification set a limit of 20 cookies per domain. Early browsers followed it, and the limit stood until Internet Explorer 7: in one of its updates, Microsoft raised the IE7 limit to 50 cookies. Opera limits the number of cookies to 30, while Safari and Chrome impose no limit on the number of cookies per domain.

The maximum total size of the cookies sent to the server has remained the same since the original specification: 4 KB. Anything beyond that limit is truncated and is not sent to the server.

Subcookies

To work around the limit on the number of cookies, developers came up with the idea of subcookies to increase the amount of storage available. Subcookies are name-value pairs stored within a cookie value, typically in a format similar to the following:

name=a=b&c=d&e=f&g=h

This approach allows a single cookie to hold multiple name-value pairs without going over the browser's cookie limit. The downside is that custom parsing is needed to extract the values, whereas the plain cookie format is much simpler. Some server-side frameworks have begun to support subcookie storage. The YUI Cookie utility, which I wrote, supports reading and writing subcookies from JavaScript.
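The custom parsing mentioned above is short. The sketch below reads and writes the a=b&c=d format shown earlier; the function names are hypothetical and this is not the YUI Cookie utility's API. It operates on the cookie value only, that is, the part after the cookie's own name and first equals sign.

```javascript
// Parse a cookie value of the form "a=b&c=d" into an object.
function parseSubcookies(cookieValue) {
  const result = {};
  for (const pair of cookieValue.split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const val = eq === -1 ? "" : pair.slice(eq + 1);
    result[decodeURIComponent(key)] = decodeURIComponent(val);
  }
  return result;
}

// Serialize an object back into the subcookie format.
function serializeSubcookies(values) {
  return Object.entries(values)
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join("&");
}

const subcookies = parseSubcookies("a=b&c=d&e=f&g=h");
console.log(subcookies.c); // prints "d"
console.log(`name=${serializeSubcookies(subcookies)}`);
// prints "name=a=b&c=d&e=f&g=h"
```

Encoding each key and value keeps a stray "&" or "=" inside a value from breaking the format.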

Cookies in JavaScript

You can create, manipulate, and remove cookies in JavaScript through the document.cookie property. This property acts like the Set-Cookie header when assigned to and like the Cookie header when read from. When creating a cookie, use a string in the same format that Set-Cookie expects:

document.cookie = "name=Nicholas; domain=nczonline.net; path=/";

Setting the value of document.cookie does not delete the cookies already stored for the page. It simply creates or modifies the cookie specified in the string. The next time a request is made to the server, these cookies are sent along with any others that were set via the Set-Cookie header. There is no perceivable difference between the two.

To retrieve cookie values in JavaScript, simply read from document.cookie. The returned string is in the same format as the Cookie header value, so multiple cookies are separated by a semicolon and a space. For example:

name1=Greg; name2=Nicholas

Because of this, you need to parse the cookie string manually to extract the actual cookie data. There are numerous resources describing how to parse cookies in JavaScript, including my book, Professional JavaScript, so I won't go into that here. It is often easier to use an existing JavaScript library, such as the YUI Cookie utility, to deal with cookies in JavaScript than to recreate these algorithms by hand.
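For readers who want a starting point anyway, a lookup by name could be sketched like this. The function name getCookie is hypothetical. It takes the cookie string as a parameter so that it can run outside a browser; in a page you would pass document.cookie. It also assumes names and values were URL-encoded when the cookie was written.

```javascript
// Return the decoded value of the first cookie with the given name, or null.
function getCookie(name, cookieString) {
  // Including "=" in the prefix stops "name" from matching "name1".
  const prefix = encodeURIComponent(name) + "=";
  for (const piece of cookieString.split("; ")) {
    if (piece.startsWith(prefix)) {
      return decodeURIComponent(piece.slice(prefix.length));
    }
  }
  return null;
}

// In a browser: getCookie("name2", document.cookie)
console.log(getCookie("name2", "name1=Greg; name2=Nicholas")); // prints "Nicholas"
console.log(getCookie("name3", "name1=Greg; name2=Nicholas")); // prints null
```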

The cookies returned by document.cookie follow the same access rules as cookies sent to the server. For a cookie to be accessible through JavaScript, the page must be in the same domain as the cookie, have the same path, and have the same security level.

Note: Once a cookie has been set through JavaScript, its options cannot be retrieved, so you have no way of knowing its domain, path, expiration date, or secure flag.

HTTP-only cookies

Microsoft introduced a new cookie option in Internet Explorer 6 SP1: HTTP-only cookies. The idea behind HTTP-only is to tell the browser that the cookie should never be accessible through JavaScript's document.cookie property. The feature was designed as a security measure to help prevent cross-site scripting (XSS) attacks, launched through JavaScript, from stealing cookies (I'll discuss security issues in another post, so this is enough for now). Today, Firefox 2.0.0.5+, Opera 9.5+, and Chrome also support HTTP-only cookies. Safari, as of version 3.2, does not.

To create an HTTP-only cookie, simply add the HttpOnly flag to your cookie:

Set-Cookie: name=Nicholas; HttpOnly

Once this flag is set, the cookie can no longer be accessed through document.cookie. Internet Explorer goes a step further and does not allow access to the header through the getAllResponseHeaders() or getResponseHeader() methods of XMLHttpRequest, while other browsers still permit it. Firefox fixed this vulnerability in 3.0.6, though various browser vulnerabilities remain; they are listed in the complete browser support list.

You cannot set HTTP-only cookies through JavaScript, which makes sense given that you cannot read them through JavaScript either.
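To make the effect of the flag concrete, the sketch below parses a Set-Cookie header value, records the secure and HttpOnly flags, and then filters out the cookies a script would not be allowed to see. The function names and the cookie object shape are assumptions, and real browsers apply many more rules than this.

```javascript
// Hypothetical parser for a Set-Cookie header value, including its flags.
function parseSetCookie(headerValue) {
  const [pair, ...attrs] = headerValue.split(";").map((s) => s.trim());
  const eq = pair.indexOf("=");
  const cookie = {
    name: eq === -1 ? "" : pair.slice(0, eq),
    value: eq === -1 ? pair : pair.slice(eq + 1),
    secure: false,
    httpOnly: false,
  };
  for (const attr of attrs) {
    if (attr === "") continue;
    const i = attr.indexOf("=");
    // Option names are compared case-insensitively.
    const key = (i === -1 ? attr : attr.slice(0, i)).toLowerCase();
    const val = i === -1 ? "" : attr.slice(i + 1);
    if (key === "secure") cookie.secure = true;
    else if (key === "httponly") cookie.httpOnly = true;
    else if (key === "domain" || key === "path" || key === "expires") cookie[key] = val;
  }
  return cookie;
}

// Only cookies without the HttpOnly flag would be exposed to scripts.
function scriptVisible(cookies) {
  return cookies.filter((c) => !c.httpOnly);
}

const stored = [
  parseSetCookie("name=Nicholas; HttpOnly"),
  parseSetCookie("theme=dark; path=/"),
];
console.log(scriptVisible(stored).map((c) => c.name)); // prints [ 'theme' ]
```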

Conclusion

There is still a lot to know and understand about cookies in order to use them effectively. It is remarkable that a technology created more than ten years ago is still in use in almost exactly the way it was first implemented. This article covers only the basics that everyone should know about browser cookies and is by no means a complete reference. Cookies remain a very important part of the web today, and managing them improperly can lead to all sorts of problems, from a poor user experience to security holes. I hope this write-up has shed some light on the wonders of cookies.

Translator's afterword:

This article dates from 2009 and may count as fairly old by now, but its explanation of cookies is very thorough. I have been studying cookies recently, read the article, and found it well worth reading, so I translated the original into Chinese, partly for my own reference. It explains in detail the origin of cookies, the problem they were meant to solve, the nature of a cookie and the function of each of its attributes, and how browsers implement and extend cookies. As the author says, this technology was created more than ten years ago, and every developer would do well to understand its basic facts and principles. The translation tries to stay true to the original, but my skill is limited and some sentences read awkwardly; I suggest reading it alongside the original for best results.

About the author: Nicholas C. Zakas is a senior front-end engineer who worked at Yahoo for nearly five years as a front-end technology lead. His book "JavaScript Advanced Program Design" is a high-quality work on front-end technology.

Note: When reproducing this article, please keep the link to the original author.

In addition: Cookies are present in both the server and the browser
