A complete analysis of Java cookies (the mechanisms and principles of sessions and cookies)

Source: Internet
Author: User



Abstract: Although the session mechanism has been used in web applications for a long time, many people still do not understand its nature and therefore cannot apply it correctly. This article discusses in detail how sessions work and answers frequently asked questions about using the session mechanism in Java web applications.
I. The term session
II. The HTTP protocol and state maintenance
III. Understanding the cookie mechanism
IV. Understanding the session mechanism
V. Understanding javax.servlet.http.HttpSession
VI. HttpSession FAQs
VII. Cross-application session sharing
VIII. Summary
I. The term session
In my experience, the word session is probably second only to transaction in how badly it is abused. More interesting still, transaction and session mean the same thing in some contexts.
Session is usually rendered in Chinese as a word meaning "conversation". Its original meaning is a series of actions or messages with a definite beginning and end; for example, everything from picking up the phone and dialing to hanging up can be called a session. Sometimes we see phrases such as "during a browser session, ...", where session is used in this original sense: the period from opening a browser window to closing it ①. The most confusing phrase is "during a user (client) session". It may refer to a series of user actions (usually actions related to one specific purpose, such as an online purchase from login through selecting goods to checkout and logout, which is sometimes also called a transaction); it may refer to just a single connection; or it may carry meaning ①. The difference can only be inferred from context ②.
However, when the word session is associated with a network protocol, it often implies "connection-oriented" and/or "stateful". "Connection-oriented" means that the two parties must establish a communication channel before they communicate, as with a phone call, where communication cannot start until the other party answers. The opposite is writing a letter: when you post it you cannot confirm that the recipient's address is correct, so the channel may never be established, yet for the sender the communication has already begun. "Stateful" means that one party can associate a series of messages with each other so that the messages can depend on one another; for example, a waiter recognizes a returning customer and remembers that the customer still owes the shop one dollar from last time. Expressions such as "a POP3 session" use the word in this sense ③.
In the era when web servers flourished, the meaning of session was extended again in the context of web development: it refers to a class of solutions for maintaining state between the client and the server ④. Sometimes session also refers to the storage structure used by such a solution, as in "save XXX in the session" ⑤. Because the languages used for web development all support this kind of solution to some degree, in the context of a specific language session is also used for that language's own solution; for example, javax.servlet.http.HttpSession in Java is often simply called session ⑥.
Since this confusion cannot be undone, the word session in this article also takes different meanings depending on context.
In this article, "browser session period" expresses meaning ①, "session mechanism" expresses meaning ④, "session" expresses meaning ⑤, and the specific "HttpSession" expresses meaning ⑥.
II. The HTTP protocol and state maintenance
The HTTP protocol itself is stateless, which matches its original purpose. The client simply asks the server to download some files, and neither side needs to record the other's previous behavior. Each request is independent, like the relationship between a customer and a vending machine or an ordinary (non-membership) supermarket.
But clever (or greedy?) people soon found that providing dynamic, on-demand information would make the web more useful, much like adding video on demand to cable TV. On one hand, this need pushed HTML to gradually add client-side features such as forms, scripts, and the DOM; on the other hand, the CGI specification appeared on the server side to respond to dynamic client requests; and HTTP, as the transport, gained features such as file upload and cookies. Of these, cookies were introduced to address the statelessness of HTTP. The session mechanism that came later is another solution for maintaining state between client and server.
Let us use a few examples to describe the differences and connections between the cookie and session mechanisms. A coffee shop I used to visit often offered a free cup for every five cups bought. Since there is little chance of buying five cups at once, some way is needed to record how many cups each customer has bought. Imagine the following solutions:
1. The shop's clerk is remarkable and remembers how much every customer has bought. As soon as a customer walks in, the clerk knows how to treat them. This corresponds to the protocol itself supporting state.
2. Give the customer a card that records the purchase count and usually carries an expiry date. On each purchase, if the customer shows the card, that purchase is linked with earlier and later ones. This corresponds to keeping state on the client.
3. Give the customer a membership card that records nothing except a card number. On each purchase, if the customer shows the card, the clerk looks up the record for that card number in the shop's books and adds the purchase to it. This corresponds to keeping state on the server.
Because the HTTP protocol is stateless, and for various reasons there is no wish to make it stateful, the latter two solutions became the realistic choices. Specifically, the cookie mechanism keeps state on the client, while the session mechanism keeps state on the server. We can also see that, because keeping state on the server still requires an identifier to be stored on the client, the session mechanism may rely on the cookie mechanism to store that identifier, although it does have other options.
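The difference between the last two solutions can be sketched in a few lines of Java. This only models the coffee-shop analogy; the class and method names (CoffeeShop, stampCard, buyWithMemberCard) are invented for the illustration and are not part of any API.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: models the coffee-shop analogy, not a real HTTP stack.
class CoffeeShop {
    // Server-side state: purchase counts keyed by membership card number.
    private final Map<String, Integer> ledger = new HashMap<>();

    // Cookie-like: the customer carries the count on the card.
    // The shop keeps nothing; it reads the card, adds one, and hands it back.
    static int stampCard(int cupsOnCard) {
        return cupsOnCard + 1;
    }

    // Session-like: the card carries only a number;
    // the count itself lives in the shop's ledger.
    int buyWithMemberCard(String cardNumber) {
        return ledger.merge(cardNumber, 1, Integer::sum);
    }

    public static void main(String[] args) {
        int card = stampCard(stampCard(0));
        System.out.println("cups recorded on the customer's card: " + card);

        CoffeeShop shop = new CoffeeShop();
        shop.buyWithMemberCard("M-001");
        System.out.println("cups recorded in the shop's ledger: "
                + shop.buyWithMemberCard("M-001"));
    }
}
```

In the first case the shop trusts whatever the card says; in the second it trusts only its own ledger, and the card number is just a key.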
III. Understanding the cookie mechanism
The basic principle of the cookie mechanism is as simple as the example above, but several questions remain: how the "membership card" is issued, what the "membership card" contains, and how the customer uses it.
The orthodox way of issuing a cookie is an extension of the HTTP protocol: the server adds a special line to the HTTP response header to tell the browser to create the corresponding cookie. However, pure client-side scripts such as JavaScript or VBScript can also create cookies.
Using a cookie is automatic: the browser sends it to the server in the background according to certain rules. The browser checks all stored cookies, and if a cookie's declared scope is greater than or equal to the location of the requested resource, the cookie is attached to the HTTP request header for that resource and sent to the server. In other words, a McDonald's membership card can be shown only in McDonald's restaurants; and if a particular branch also issues its own card, then in that branch you show the branch card in addition to the McDonald's card.
The content of a cookie mainly consists of a name, a value, an expiration time, a path, and a domain.
The domain can name a whole domain such as .google.com, which is like the sign of the parent company (say, Procter & Gamble), or a specific machine within a domain, such as www.google.com or froogle.google.com, which is like a single brand (say, Rejoice).
The path is the URL path that follows the domain name, such as / or /foo; it is like one particular Rejoice counter.
The path and the domain together make up the scope of the cookie.
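The scope check described above can be sketched as follows. This is a simplified illustration with invented names (CookieScope, inScope); real browsers follow the more detailed matching rules in the cookie specifications.

```java
// Simplified sketch of cookie scope matching; not a complete implementation
// of the rules browsers actually apply.
class CookieScope {
    // A domain such as ".google.com" covers the whole domain;
    // a domain such as "www.google.com" covers only that host.
    static boolean domainMatches(String cookieDomain, String host) {
        String d = cookieDomain.toLowerCase();
        String h = host.toLowerCase();
        if (d.startsWith(".")) {
            return h.endsWith(d) || h.equals(d.substring(1));
        }
        return h.equals(d);
    }

    // The cookie path must be a prefix of the requested path,
    // ending on a path-segment boundary ("/foo" covers "/foo/bar", not "/foobar").
    static boolean pathMatches(String cookiePath, String requestPath) {
        if (!requestPath.startsWith(cookiePath)) {
            return false;
        }
        return cookiePath.endsWith("/")
                || requestPath.length() == cookiePath.length()
                || requestPath.charAt(cookiePath.length()) == '/';
    }

    // The cookie is sent only when both the domain and the path cover the request.
    static boolean inScope(String cookieDomain, String cookiePath,
                           String host, String requestPath) {
        return domainMatches(cookieDomain, host) && pathMatches(cookiePath, requestPath);
    }
}
```

For example, a cookie scoped to .google.com and / would be sent to froogle.google.com/foo, while one scoped to www.google.com would not.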
If no expiration time is set, the cookie lives for the browser session period: it disappears when the browser window is closed. Such a cookie is called a session cookie. Session cookies are generally kept in memory rather than on disk, although this behavior is not standardized. If an expiration time is set, the browser saves the cookie to disk, and when the browser is closed and opened again the cookie remains valid until the expiration time has passed.
Cookies stored on disk can be shared between different browser processes, for example between two IE windows. Cookies held in memory are handled differently by different browsers. In IE, a window opened from an existing window with Ctrl-N (or from the File menu) shares memory cookies with the original window, whereas other newly started IE processes cannot share the memory cookies of windows that are already open. In Mozilla Firefox 0.8, all processes and tabs share the same cookies. In general, a window opened with JavaScript's window.open shares memory cookies with the original window. These differences in how browsers treat session cookies often cause considerable trouble for web developers who use the session mechanism.
The following is an example of a response header in which Google sets a cookie.
HTTP/1.1 302 Found
Location: http://www.google.com/intl/zh-cn/
Set-Cookie: Pref=id=0565f77e132de138:nw=1:TM=1098082649:LM=1098082649:S=kaeacfpo49ria_d8; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
Content-Type: text/html
This is part of an HTTP exchange captured with the HTTP sniffer HTTPLook.
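To make the parts of such a header concrete, here is a minimal hand-rolled sketch that splits a Set-Cookie value into the fields listed earlier. The class name SetCookieSketch is invented, the cookie value in the demo is shortened, and the parser validates nothing; it is an illustration, not how a browser parses cookies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified parser for a Set-Cookie header value.
// It only splits on ';' and on the first '=' of each field.
class SetCookieSketch {
    static Map<String, String> parse(String headerValue) {
        Map<String, String> parts = new LinkedHashMap<>();
        String[] fields = headerValue.split(";");
        for (int i = 0; i < fields.length; i++) {
            String field = fields[i].trim();
            if (field.isEmpty()) {
                continue;
            }
            int eq = field.indexOf('=');
            String key = eq < 0 ? field : field.substring(0, eq).trim();
            String value = eq < 0 ? "" : field.substring(eq + 1).trim();
            if (i == 0) {
                // The first pair is the cookie's own name and value.
                parts.put("name", key);
                parts.put("value", value);
            } else {
                // Later pairs are attributes; their names are case-insensitive.
                parts.put(key.toLowerCase(), value);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // Value shortened from the captured header for readability.
        System.out.println(parse(
                "Pref=id=0565f77e132de138; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com"));
    }
}
```

Note that the cookie value itself may contain further = signs, which is why only the first one in each field is treated as the separator.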
The browser automatically sends the cookie when it accesses Google resources again.
With Firefox, it is easy to inspect the stored cookie values.
Using HTTPLook together with Firefox makes it easy to see how cookies work.
IE can also be configured to ask before accepting cookies, in which case it displays a dialog box asking whether to accept each cookie.
IV. Understanding the session mechanism
The session mechanism is a server-side mechanism. The server stores information in a structure similar to a hash table (or perhaps an actual hash table).
When the program needs to create a session for a client request, the server first checks whether the request already contains a session identifier, called the session ID. If it does, a session has already been created for this client, and the server uses the session ID to look that session up (if the lookup fails, a new one may be created). If the request contains no session ID, the server creates a session for the client and generates a session ID associated with it. The value of the session ID should be a string that is never repeated and in which no pattern can easily be found that would allow it to be forged.
The session ID is returned to the client in the current response, to be stored there.
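The lookup-or-create step can be sketched with a plain hash table. This is a minimal illustration, not a servlet container; the names SessionTable, findOrCreate, and attributes are invented, and the 16-byte random ID is an assumption about what "hard to forge" might look like.

```java
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of a server-side session table.
class SessionTable {
    // session ID -> that session's attributes
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();

    // Generates an ID that is hard to guess: 16 random bytes written as hex.
    String newSessionId() {
        byte[] bytes = new byte[16];
        random.nextBytes(bytes);
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Returns the ID of the session to use for this request: the presented ID
    // if it is known to the server, otherwise the ID of a freshly created session.
    String findOrCreate(String presentedId) {
        if (presentedId != null && sessions.containsKey(presentedId)) {
            return presentedId;
        }
        String id = newSessionId();
        sessions.put(id, new ConcurrentHashMap<String, Object>());
        return id;
    }

    Map<String, Object> attributes(String id) {
        return sessions.get(id);
    }
}
```

A request carrying an unknown ID is treated like a request with no ID at all: the server never adopts an ID that it did not issue itself.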
A cookie can be used to store the session ID, so that the browser automatically presents the ID to the server, according to the cookie rules, throughout the interaction. The cookie's name is usually something like SESSIONID. For example, the cookie that WebLogic generates for web applications, JSESSIONID=byok3vjfd75apnrf7c2hmdnv6qzcebzwowibyenlerjq99zwpbng!-145788764, is named JSESSIONID.
Because cookies can be disabled by the user, there must be another mechanism for passing the session ID back to the server when cookies are disabled. A commonly used technique is URL rewriting, which appends the session ID directly to the URL. There are two ways of appending it. One is as extra information on the URL path, in the form
http://.../xxx;jsessionid=byok3vjfd75apnrf7c2hmdnv6qzcebzwowibyenlerjq99zwpbng!-145788764
The other is as a query string appended to the URL, in the form
http://.../xxx?jsessionid=byok3vjfd75apnrf7c2hmdnv6qzcebzwowibyenlerjq99zwpbng!-145788764
To the user the two forms are the same; servers simply parse them differently. The first form also helps to keep the session ID separate from ordinary program parameters.
To maintain state throughout the interaction, the session ID must be included in every path that the client might request.
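The two forms of URL rewriting can be sketched as plain string manipulation. In a real servlet application the container does this work (through HttpServletResponse.encodeURL); the class UrlRewriteSketch below is invented for illustration and does not escape the session ID or handle every URL shape.

```java
// Sketch of the two URL rewriting forms described above.
class UrlRewriteSketch {
    // Form 1: path parameter, inserted before any query string or fragment.
    static String asPathParameter(String url, String sessionId) {
        int cut = url.length();
        int q = url.indexOf('?');
        int h = url.indexOf('#');
        if (q >= 0) {
            cut = Math.min(cut, q);
        }
        if (h >= 0) {
            cut = Math.min(cut, h);
        }
        return url.substring(0, cut) + ";jsessionid=" + sessionId + url.substring(cut);
    }

    // Form 2: ordinary query-string parameter (fragments are ignored here).
    static String asQueryParameter(String url, String sessionId) {
        String separator = url.indexOf('?') >= 0 ? "&" : "?";
        return url + separator + "jsessionid=" + sessionId;
    }
}
```

With the first form, existing query parameters stay untouched after the session ID, which is what keeps the ID apart from the program's own parameters.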
Another technique is the hidden form field. The server automatically modifies each form, adding a hidden field so that the session ID is passed back to the server when the form is submitted; that is, a form is rewritten into one that additionally contains a hidden input carrying the session ID.
This technique is rarely used now; the very old iPlanet 6 (the predecessor of the SunONE application server), which I once worked with, used it.
In fact, this technique can simply be replaced by applying URL rewriting to the form's action.
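As a rough illustration of the hidden-field rewrite, the sketch below inserts a hidden input after each opening form tag. The class name, the field name jsessionid, and the regular-expression approach are assumptions made for this example; it is not how iPlanet implemented the feature, and it does not HTML-escape the session ID.

```java
import java.util.regex.Matcher;

// Sketch: add a hidden session-ID field right after each opening <form> tag.
class HiddenFieldSketch {
    static String addHiddenField(String html, String sessionId) {
        String field = "<input type=\"hidden\" name=\"jsessionid\" value=\""
                + sessionId + "\">";
        // (?i) makes the match case-insensitive; [^>]* skips the tag's attributes.
        // quoteReplacement protects any '$' or '\' characters in the session ID.
        return html.replaceAll("(?i)(<form\\b[^>]*>)",
                "$1" + Matcher.quoteReplacement(field));
    }

    public static void main(String[] args) {
        System.out.println(addHiddenField(
                "<form action=\"/xxx\"><input type=\"text\"></form>", "abc"));
    }
}
```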
When the session mechanism is discussed, one often hears the misconception that "the session disappears as soon as the browser is closed". Think of the membership card again: unless the customer asks the shop to cancel the card, the shop will not lightly delete the customer's records. The same is true of sessions: unless the program tells the server to delete a session, the server keeps it. Usually the program issues the command to delete the session when the user logs off. The browser, however, never notifies the server that it is about to close, so the server has no chance of knowing that the browser has been closed.
The misconception arises because most session mechanisms store the session ID in a session cookie. The session ID disappears when the browser is closed, so the original session cannot be found when the client connects to the server again. If the cookie set by the server is saved to disk, or the HTTP request header sent by the browser is rewritten by some means so that the original session ID reaches the server, the original session can still be found after the browser is reopened.
Precisely because closing the browser does not delete the session, the server is forced to give each session an expiration time. When the time since the client last used the session exceeds this limit, the server concludes that the client has stopped its activity and only then deletes the session to save storage space.
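The idle-timeout rule can be sketched as follows. The class and method names are invented for illustration, and the current time is passed in explicitly so the logic can be followed without waiting; a real server would use its own clock and a background task.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of idle-timeout expiry for sessions.
class SessionExpirySketch {
    // session ID -> time of the last request that carried it
    private final Map<String, Long> lastAccess = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    SessionExpirySketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a request carrying this session ID arrives.
    void touch(String sessionId, long nowMillis) {
        lastAccess.put(sessionId, nowMillis);
    }

    // Called periodically; removes sessions that have been idle longer than
    // the timeout and returns how many were removed.
    int sweep(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastAccess.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > timeoutMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    boolean isAlive(String sessionId) {
        return lastAccess.containsKey(sessionId);
    }
}
```

Closing the browser changes nothing in this table; a session goes away only when a sweep finds that it has been idle for too long.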
V. Understanding javax.servlet.http.HttpSession
HttpSession is the Java platform's specification for implementing the session mechanism. Because it is only an interface, the implementations of individual web application server vendors, beyond supporting the specification, still differ in small details that the specification does not cover.
Here we use BEA's WebLogic Server 8.1 as an example.
WebLogic Server provides a series of parameters to control its HttpSession implementation, including the option to use cookies, the option to use URL rewriting, session persistence settings, the session expiration time, and various cookie settings such as the cookie name, path, domain, and lifetime.
Generally, sessions are stored in memory, and when the server process is stopped or restarted the sessions in memory are lost. If session persistence is enabled, the server saves sessions to disk so that the information can still be used after the server process restarts. The persistence methods WebLogic Server supports include file, database, client-side cookie storage, and replication.
Replication is not persistent storage in the strict sense, because the session is actually kept in memory; the same information is simply copied to each server process in the cluster, so that even if one server process stops working, the session can still be obtained from another process.
The cookie lifetime setting determines whether the cookie the browser creates is a session cookie. A session cookie is used by default. If you are interested, you can use this setting to test the misconception mentioned in the section on the session mechanism.
The cookie path is a very important option for web applications, and WebLogic Server's default handling of it differs markedly from that of other servers. We will discuss this topic later.
For more information about session settings, see [5].
VI. HttpSession FAQs
(In this section, session carries a mixture of meanings ⑤ and ⑥.)
1. When is a session created?
A common misconception is that a session is created as soon as a client accesses the server. In fact, it is not created until some server-side code calls a statement such as HttpServletRequest.getSession(true). Note that if a JSP does not explicitly turn the session off with <%@ page session="false" %>, the following statement is added automatically when the JSP file is compiled into a servlet: HttpSession session = HttpServletRequest.getSession(true); this is also where the implicit session object in JSP comes from.
Because sessions consume memory, if you do not plan to use the session you should turn it off in all JSPs.
2. When is a session deleted?
Based on the previous discussion, a session is deleted when: a. the program calls HttpSession.invalidate(); or b. the time since the client last sent the session ID exceeds the session timeout setting; or c. the server process is stopped (for non-persistent sessions).
3. How can a session be deleted when the browser is closed?
Strictly speaking, this cannot be done. One workaround is to have JavaScript on every client page watch for the window closing (for example with a window.onunload handler) and then send a request to the server to delete the session. However, nothing can be done when the browser crashes or its process is forcibly killed.
4. What is HttpSessionListener about?
You can create such a listener to monitor session creation and destruction events, so that related work can be done when these events occur. Note that the listener is triggered by the creation and destruction of sessions, not the other way around. Similar listeners related to HttpSession include HttpSessionBindingListener, HttpSessionActivationListener, and HttpSessionAttributeListener.
5. Must the objects stored in the session be serializable?
Not necessarily. Objects need to be serializable only so that the session can be replicated in a cluster or persisted, or so that the server can temporarily swap the session out of memory when necessary.
If you place a non-serializable object in a WebLogic Server session, you will receive a warning on the console.
In one iPlanet version I have used, if a session contained a non-serializable object, an exception was thrown when the session was destroyed, which is strange.
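Why replication and persistence need serializable objects can be seen from a small sketch using standard Java serialization. The class names are invented, and Connection stands for any hypothetical attribute type that does not implement Serializable; how a particular server reacts (warning, exception) is server-specific, as noted above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.HashMap;

// Sketch: what happens when a session's attributes must be written out
// (for replication or persistence) and one of them is not Serializable.
class SerializableCheck {
    // Hypothetical attribute type that does NOT implement Serializable.
    static class Connection { }

    // Returns true if the whole attribute map can be serialized.
    static boolean canSerialize(HashMap<String, Object> attributes) {
        try (ObjectOutputStream out =
                     new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(attributes);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        HashMap<String, Object> attributes = new HashMap<>();
        attributes.put("name", "Neo");
        System.out.println("strings only: " + canSerialize(attributes));
        attributes.put("conn", new Connection());
        System.out.println("with Connection: " + canSerialize(attributes));
    }
}
```

As long as the session stays in one process's memory, nothing ever attempts this write, which is why the requirement only appears with clustering, persistence, or swapping.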
6. How can we properly cope with the possibility that the client disables cookies?
Use URL rewriting for all URLs, including hyperlinks, form actions, and redirect URLs. For details, see [6].
7. Will opening two browser windows to access the application use the same session or different sessions?
See the discussion of cookies in the section on the cookie mechanism. As far as the session is concerned, it recognizes only the ID, not the person. Therefore different browsers, different ways of opening a window, and different ways of storing cookies all affect the answer to this question.
8. How can session confusion caused by a user opening two browser windows be prevented?
This problem is similar to preventing a form from being submitted more than once, and it can be solved with a client-side token. Each time, the server generates a different ID, returns it to the client, and also saves it in the session. When the client submits the form it must send this ID back, and the program first checks whether the returned ID matches the value saved in the session; if it does not, the operation has already been submitted. See the presentation-tier patterns section of Core J2EE Patterns. Note that for windows opened with JavaScript's window.open, this ID is generally not set, or a separate ID is used, so that the main window does not become unusable. It is advisable not to perform modifying operations in windows opened with window.open, so that no ID needs to be set there.
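The token check described here can be sketched as follows. A plain Map stands in for the session, and the class name and the attribute key form.token are invented for this example; it is one possible shape of the idea, not the code from Core J2EE Patterns.

```java
import java.util.Map;
import java.util.UUID;

// Sketch of the one-time form token idea; the Map stands in for the session.
class FormTokenSketch {
    static final String KEY = "form.token"; // hypothetical attribute name

    // Called when the form is rendered: store a fresh token in the session
    // and return it so that it can be embedded in the form.
    static String issue(Map<String, Object> session) {
        String token = UUID.randomUUID().toString();
        session.put(KEY, token);
        return token;
    }

    // Called on submit: accept only if the token matches, then discard it
    // so that a second submission of the same form is rejected.
    static synchronized boolean accept(Map<String, Object> session, String submitted) {
        Object expected = session.get(KEY);
        if (expected == null || !expected.equals(submitted)) {
            return false;
        }
        session.remove(KEY);
        return true;
    }
}
```

A form rendered in a second window receives a newer token, so a stale form in the first window is rejected instead of silently overwriting data.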
9. Why must session.setValue be called again after changing a value in the session in WebLogic Server?
The main purpose is to tell WebLogic Server, in a cluster environment, that a value in the session has changed and that the new session value needs to be replicated to the other server processes.
10. Why has the session disappeared?
Setting aside normal session expiry, the chance that the server itself is at fault should be minimal, although I did encounter it on the Solaris version of iPlanet 6 SP1 with several patches applied. Browser plug-ins are the next most likely cause; I have also seen problems caused by the 3721 plug-in. In theory, a firewall or proxy server could also mishandle cookies.
Most occurrences of this problem are caused by program errors. The most common one is accessing another application from within one application. We will discuss this issue in the next section.

VII. Cross-application session sharing
This situation arises often: a large project is split into several smaller projects, and so that they do not interfere with one another, each is developed as a separate web application. Later it suddenly turns out that the small projects need to share some information, or that the session should be used to implement SSO (single sign-on) by keeping the logged-in user's information in the session. The most natural requirement is then that applications be able to access each other's sessions.
However, according to the servlet specification, the scope of a session should be limited to the current application, and different applications cannot access each other's sessions. Every application server complies with this in its observable behavior, but the implementation details differ, and so do the ways of sharing sessions across applications.
First, let us look at how Tomcat isolates sessions between web applications. Judging from the cookie paths Tomcat sets, it uses a different cookie path for each application, so different applications use different session IDs. Therefore, even when different applications are accessed from the same browser window, the session IDs sent to the server can differ.
From this we can infer that Tomcat roughly keeps a separate session table in memory for each application, each keyed by that application's own session IDs.
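A rough sketch of such a structure, assuming one session table per web application. This is only a guess at the shape, inferred from the cookie behavior; it is not Tomcat's actual code, and the class and method names are invented.

```java
import java.util.HashMap;
import java.util.Map;

// Guess at the shape only: one session table per web application.
class PerApplicationSessions {
    // context path -> (session ID -> session attributes)
    private final Map<String, Map<String, Map<String, Object>>> tables = new HashMap<>();

    // Looks up (or creates) the session with this ID inside one application's table.
    Map<String, Object> session(String contextPath, String sessionId) {
        return tables
                .computeIfAbsent(contextPath, k -> new HashMap<>())
                .computeIfAbsent(sessionId, k -> new HashMap<>());
    }
}
```

With this shape, a session ID issued by one application simply does not exist in another application's table.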
The iPlanet server I used in the past works the same way, and I expect SunONE does not differ much from iPlanet. For servers of this kind, the idea behind a solution is simple and not hard to put into practice: either let all applications share one session ID, or let an application obtain the session IDs of other applications.
iPlanet has a very simple way of sharing one session ID: set the cookie path of every application to / (actually it should be /NASApp, which serves as the root as far as applications are concerned).
Note that applications sharing a session should follow some programming conventions, such as prefixing session attribute names with the application name, so that setAttribute("name", "Neo") becomes setAttribute("app1.name", "Neo"). This prevents namespace conflicts in which applications overwrite each other's values.
Tomcat offers no such convenient option. In Tomcat version 3 there were still some ways of sharing sessions. For Tomcat version 4 and later, I have not yet found a simple method; one can only rely on third-party means such as files, databases, JMS, client-side cookies, URL parameters, or hidden fields.
Now let us look at how WebLogic Server handles sessions.
Observation shows that the cookie path WebLogic Server sets for all applications is /. Does this mean that sessions are shared by default in WebLogic Server? A small experiment shows that they are not: even though different applications use the same session, each application can access only the attributes that it set itself. This suggests that in WebLogic Server a single session ID may map to a separate attribute store for each application.
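One structure that would behave this way is sketched below: a single session ID, with a separate attribute map for each application. This is an inference from the experiment, not WebLogic Server's actual implementation, and the names are invented.

```java
import java.util.HashMap;
import java.util.Map;

// Guess at the shape only: one shared session ID, attributes kept per application.
class SharedIdSessions {
    // session ID -> (context path -> that application's attributes)
    private final Map<String, Map<String, Map<String, Object>>> sessions = new HashMap<>();

    // Returns the attribute map that one application sees for a given session ID.
    Map<String, Object> attributes(String sessionId, String contextPath) {
        return sessions
                .computeIfAbsent(sessionId, k -> new HashMap<>())
                .computeIfAbsent(contextPath, k -> new HashMap<>());
    }
}
```

Both applications receive the same cookie, yet a value stored by one of them is invisible to the other.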
With such a structure, the session sharing problem cannot be solved within the session mechanism itself. Besides relying on third-party means such as files, databases, JMS, client-side cookies, URL parameters, or hidden fields, there is a more convenient approach: put one application's session into the ServletContext, so that another application can obtain a reference to it from that ServletContext. The sample code is as follows.
Application A
context.setAttribute("Appa", session);
Application B
ServletContext contexta = context.getContext("/Appa");
HttpSession sessiona = (HttpSession) contexta.getAttribute("Appa");
It is worth noting that this usage is not portable, because according to the ServletContext Javadoc, an application server may return null from context.getContext("/Appa") for security reasons. The approach above works in WebLogic Server 8.1.
So why does WebLogic Server set the cookie path of all applications to /? It turns out that this is for SSO: all applications that share this session can share the authentication information. A simple experiment demonstrates this. Modify the weblogic.xml descriptor of the application that is logged in to first, changing its cookie path to /Appa; accessing another application then requires logging in again. Conversely, if you first access the application whose cookie path is / and then the one with the modified path, no login is prompted, but the logged-in user information is lost. Note that this experiment should use form authentication, because browsers and web servers handle basic authentication differently: the second authentication request is not carried out through the session. For details, see [7], section 14.8, Authorization. You can modify the example program to perform these experiments.
VIII. Summary
The session mechanism itself is not complex, but the flexibility of its implementation and configuration makes concrete situations complex and varied. This means we should not treat experience gained with a single browser or a single server as universal, but should always analyze the specific situation.