Java Theory and Practice: State Replication in the Web Tier

Most Web applications of any importance need to maintain session state, such as the contents of a user's shopping cart. How you manage and replicate state in a clustered server application has a significant impact on the application's scalability. Many J2SE and Java EE applications store state in the HttpSession provided by the Servlet API. This month, columnist Brian Goetz examines some of the options for state replication and how to use HttpSession most effectively to provide good scalability and performance.


Whether you are building a Java EE or a J2SE server application, chances are you are using servlets in some way, either directly through a presentation layer such as JSP technology, Velocity, or WebMacro, or through a servlet-based Web services implementation such as Axis or Glue. One of the most important functions provided by the Servlet API is session management: the authentication, expiration, and maintenance of per-user state through the HttpSession interface.





Session State


Almost every Web application has some session state, which can be as simple as remembering whether you are logged in, or a more detailed history of your session, such as the contents of a shopping cart, a cache of previous query results, or the complete response history for a 20-page dynamic questionnaire. Because the HTTP protocol itself is stateless, session state needs to be stored somewhere and associated with the browsing session in a way that makes it easy to retrieve the next time a page of the same Web application is requested. Fortunately, Java EE offers several ways to manage session state: it can be stored in the data tier, in the Web tier with the Servlet API's HttpSession interface, or in the Enterprise JavaBeans (EJB) tier with stateful session beans. It can even be stored on the client with cookies or hidden form fields. Unfortunately, improper session state management can cause serious performance problems.





If the application can store user state in HttpSession, that is usually better than the alternatives. Storing session state on the client with HTTP cookies or hidden form fields is a significant security risk: it exposes part of the application's internals to the untrusted client tier. (One early e-commerce Web site stored shopping cart contents, including prices, in hidden form fields, which was easily exploited so that anyone who understood HTML and HTTP could buy any item for $0.01.) In addition, using cookies or hidden form fields is messy, error-prone, and fragile (if the user disables cookies in the browser, a cookie-based approach stops working entirely).





The other ways to store server-side state in a Java EE application are to use stateful session beans or to store session state in a database. Although stateful session beans offer greater flexibility in session state management, it is still beneficial to store session state in the Web tier where possible. If the business objects are stateless, the application can usually be scaled by simply adding more Web servers, rather than adding both more Web servers and more EJB containers, which is generally cheaper and easier. Another benefit of using HttpSession to store session state is that the Servlet API provides an easy way to be notified when a session is invalidated. Storing session state in the database can be prohibitively expensive.




Although the Servlet specification does not require a servlet container to perform any kind of session replication or persistence, it suggests that state replication is an important part of the raison d'etre for servlets in the first place, and it imposes several requirements on containers that do replicate sessions. Session replication can provide a number of benefits: load balancing, scalability, fault tolerance, and high availability. Accordingly, most servlet containers support some form of HttpSession replication, but the mechanism, configuration, and timing of replication are implementation-dependent.





HttpSession API


Simply put, the HttpSession interface supports several methods that servlets, JSP pages, or other presentation-layer components can use to maintain session information across multiple HTTP requests. The session is bound to a specific user, but is shared across all the servlets of the Web application; it is not specific to one servlet. A useful way of thinking about a session is that it is like a Map that stores objects for the duration of the session: you store session attributes by name with setAttribute() and retrieve them with getAttribute(). The HttpSession interface also contains session lifecycle methods, such as invalidate(), which notifies the container that the session should be discarded. Listing 1 shows the most commonly used elements of the HttpSession interface:





Listing 1. HttpSession API





public interface HttpSession {
    Object getAttribute(String s);
    Enumeration getAttributeNames();
    void setAttribute(String s, Object o);
    void removeAttribute(String s);

    boolean isNew();
    void invalidate();
    void setMaxInactiveInterval(int i);
    int getMaxInactiveInterval();
    ...
}











In theory, session state can be fully replicated across the cluster, so that any node in the cluster can serve any request and a simple load balancer can route requests in round-robin fashion, avoiding failed hosts. However, such tight replication has a high performance cost, is difficult to implement, and runs into scalability problems as the cluster approaches a certain size.





A more common approach is to combine load balancing with session affinity: the load balancer associates sessions with connections and sends future requests in a session to the same server. Many hardware and software load balancers support this feature, and it means that the replicated session information is needed only when the primary host fails and the session has to fail over to another server.
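
The routing decision behind session affinity can be sketched in a few lines. This is only an illustration under stated assumptions, not how any particular load balancer works: the class StickyRouter, its method names, and the round-robin assignment of new sessions are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of session affinity: a session sticks to one server until that server fails. */
final class StickyRouter {
    private final List<String> servers;                            // live servers, in rotation order
    private final Map<String, String> affinity = new HashMap<>();  // session id -> assigned server
    private int next = 0;                                          // round-robin cursor for new sessions

    StickyRouter(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    /** Returns the server for this session, assigning one round-robin on first sight. */
    String route(String sessionId) {
        if (servers.isEmpty()) {
            throw new IllegalStateException("no live servers");
        }
        String server = affinity.get(sessionId);
        if (server == null || !servers.contains(server)) {
            // New session, or its server has failed: pick another server. After a
            // failure, the new server has to obtain the session from replicated state.
            server = servers.get(next % servers.size());
            next++;
            affinity.put(sessionId, server);
        }
        return server;
    }

    /** Removes a failed server; its sessions are reassigned on their next request. */
    void markFailed(String server) {
        servers.remove(server);
    }
}
```

Repeated calls to route() with the same session ID return the same server until markFailed() is called for it, which is the only point at which replicated state would be consulted.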





Replication Approaches


Replication offers several possible benefits, including availability, fault tolerance, and scalability. There are also a number of approaches to session replication; which one to choose depends on the size of the application cluster, the goals of replication, and the replication facilities supported by the servlet container. Replication has performance costs, including CPU cycles (serializing the objects stored in the session), network bandwidth (broadcasting updates), and, in disk-based schemes, the cost of writing to disk or to a database.





Almost all servlet containers replicate HttpSession by serializing the objects stored in it, so if you are creating a distributed application, you should ensure that only serializable objects are placed in the session. (Some containers have special handling for objects such as EJB references, transaction contexts, and other non-serializable Java EE object types.)
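
One way to catch non-serializable session contents early is to try serializing a value in a unit test before it ever reaches the session. The helper below is a minimal sketch using only standard Java serialization; the class name SessionSerializationCheck and its methods are hypothetical, and a real container may use its own mechanism for special object types.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

/** Hypothetical helper: checks whether a value survives standard Java serialization. */
final class SessionSerializationCheck {
    private SessionSerializationCheck() {}

    /** Returns the serialized size in bytes, or -1 if the object graph cannot be serialized. */
    static int serializedSize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return -1;
        }
        return bytes.size();
    }

    /** True if the whole object graph reachable from value is serializable. */
    static boolean isReplicable(Object value) {
        return serializedSize(value) >= 0;
    }
}
```

Note that the check covers the whole object graph: a serializable list that holds a non-serializable element is rejected as well.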





JDBC-Based Replication

One approach to session replication is to serialize the session contents and write them to a database. This approach is fairly straightforward, and it has the advantage that not only can sessions fail over to other hosts, but session data can also survive the failure of the entire cluster. The disadvantage of database-based replication is its performance cost: database transactions are expensive. While it can scale well in the Web tier, it may create scalability problems in the data tier; if the cluster grows large enough, scaling the data tier to accommodate session data may be difficult or prohibitively expensive.





File-Based Replication


File-based replication is similar to using a database to store serialized sessions, except that session data is stored on a shared file server instead of in a database. This approach generally costs less than using a database (in hardware, software licenses, and computational overhead), at the expense of reliability (a database offers stronger durability guarantees than a file system).
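
As an illustration of the file-based idea, the sketch below writes a session's attributes to one file per session ID and reads them back. It is a simplified stand-in, not any container's actual store: the class FileSessionStore, the ".ser" file naming, and the use of a plain Map for the attributes are assumptions, and it presumes session IDs are safe to use as file names.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical file-backed session store: one file per session ID under a shared directory. */
final class FileSessionStore {
    private final Path directory;

    FileSessionStore(Path directory) {
        this.directory = directory;
    }

    /** Serializes a copy of the attributes to the session's file, replacing any earlier copy. */
    void save(String sessionId, Map<String, Object> attributes) {
        Path file = directory.resolve(sessionId + ".ser");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(new HashMap<>(attributes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the stored attributes, or null if no file exists for this session. */
    @SuppressWarnings("unchecked")
    Map<String, Object> load(String sessionId) {
        Path file = directory.resolve(sessionId + ".ser");
        if (!Files.exists(file)) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Map<String, Object>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Any server that can reach the shared directory can call load() after a failover; the durability of the data is only as good as that of the underlying file system.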





Memory-Based Replication


Another approach to replication is to share copies of the serialized session data with one or more other servers in the cluster. Replicating all sessions to all hosts provides maximum availability and makes load balancing easiest, but it ultimately limits the size of the cluster, because of the memory each node needs and the network bandwidth the replication messages consume. Some application servers support replication to a "buddy" node, where each session exists on a primary server and on one (or more) backup servers. This scheme scales much better than replicating all sessions to all servers, but it complicates the load balancer's job when a session has to fail over to another server, because the load balancer must find out which other server (or servers) has that session.
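
To make the buddy idea concrete, here is one possible way to pick backup nodes: treat the cluster as a fixed ring and use the nodes that follow the primary. This is an assumption made for illustration; the class BuddyRing is hypothetical, and real application servers choose buddies in their own ways.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical buddy selection: the backups for a node are the next nodes in a fixed ring. */
final class BuddyRing {
    private final List<String> nodes;

    BuddyRing(List<String> nodes) {
        if (nodes.size() < 2) {
            throw new IllegalArgumentException("need at least two nodes");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    /** Returns up to backupCount nodes that follow the primary in ring order. */
    List<String> buddiesOf(String primary, int backupCount) {
        int start = nodes.indexOf(primary);
        if (start < 0) {
            throw new IllegalArgumentException("unknown node: " + primary);
        }
        int count = Math.min(backupCount, nodes.size() - 1);  // never pick the primary itself
        List<String> buddies = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            buddies.add(nodes.get((start + i) % nodes.size()));
        }
        return buddies;
    }
}
```

With one buddy, each session is held on two nodes no matter how large the cluster grows, whereas replicating to all nodes holds it on every node. The price is that whoever routes a failed-over request needs the same ring information to find the backup.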





Timing Considerations


In addition to deciding how to store replicated session data, there is the question of when to replicate it. The most reliable, but also the most expensive, approach is to replicate every time the data changes (such as at the end of each servlet invocation). Less expensive, but at the risk of losing some data in the event of a failure, is to replicate every N seconds.




Another question is whether to replicate the entire session or only the attributes that have changed (which can be far less data). These choices trade reliability against performance. Servlet developers should recognize that session state may be "stale" after a failover (a replica from several requests ago) and should be prepared to handle session contents that are not the most recent. (For example, if step 3 of a multi-step interview produces a session attribute, and while the user is on step 4 the request fails over to a system whose replica of the session state is two requests old, the servlet code for step 4 should be prepared to not find that attribute in the session and act accordingly, for example by redirecting, rather than assuming it will be there and throwing a NullPointerException when it is not.)
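
The defensive check described above might look like the following sketch. To keep it self-contained, a plain Map stands in for the session's attributes; the class QuestionnaireStep4, the attribute name, and the returned view strings are all hypothetical.

```java
import java.util.Map;

/** Hypothetical step-4 handler for a multi-step questionnaire; a Map stands in for the session. */
final class QuestionnaireStep4 {
    static final String STEP3_ANSWER = "step3.answer";  // hypothetical attribute name

    private QuestionnaireStep4() {}

    /** Returns the next view; sends the user back to step 3 if its result is missing. */
    static String handle(Map<String, Object> sessionAttributes) {
        Object answer = sessionAttributes.get(STEP3_ANSWER);
        if (answer == null) {
            // After a failover the replica may predate step 3, so recover instead of failing.
            return "redirect:/questionnaire/step3";
        }
        return "view:/questionnaire/step4?previous=" + answer;
    }
}
```

In servlet code the same shape applies: look the attribute up with getAttribute(), test for null, and redirect rather than dereferencing the result unconditionally.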





Container Support

The HttpSession replication options offered by servlet containers, and the way those options are configured, vary. IBM WebSphere offers the widest variety of replication options: in-memory or database-based replication, end-of-servlet or time-based replication timing, and propagating full session snapshots or only changed attributes. Memory-based replication is built on JMS publish-subscribe and can replicate to all clones, to a single "buddy" replica, or to a dedicated replication server.





WebLogic also provides a range of choices, including in-memory (using a single buddy replica), file-based, or database-based replication. JBoss, when used with either the Tomcat or the Jetty servlet container, does memory-based replication with a choice of end-of-servlet or time-based replication timing, and offers the option (in JBoss 3.2 or later) of snapshotting the whole session or replicating only the changed attributes. Tomcat 5.0 provides memory-based replication to all cluster nodes. In addition, through projects like WADI, session replication can be added to servlet containers such as Tomcat or Jetty using the servlet filter mechanism.





Improving the Performance of Distributed Web Applications


No matter what mechanism you decide to use for session replication, you can improve the performance and scalability of your Web application in several ways. First, remember that to gain the benefits of session replication, you need to mark the Web application as distributable in the deployment descriptor (the <distributable/> element in web.xml) and ensure that everything in the session is serializable.





Keep the Session Minimal


Because the cost of replicating a session grows as the object graph in the session grows, you should place as little data as possible in the session. Doing so reduces the serialization cost, network bandwidth, and disk space that replication requires. In particular, it is generally not a good idea to store shared objects in the session, because they will be replicated along with every session they belong to.





Don't Bypass setAttribute


When you change an attribute of the session, be aware that if the servlet container is trying to do minimal updates (propagating only the changed attributes), it may not notice the change unless you call setAttribute(). (Imagine a Vector in the session that represents the items in a shopping cart. If you call getAttribute() to get the Vector, add something to it, and do not call setAttribute() again, the container may not realize that the Vector has changed.)
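
The effect is easy to reproduce with a toy model. The class below is a hypothetical stand-in for a container session that tracks changes only through setAttribute(); real containers differ in how (and whether) they detect changes, so treat this as a sketch of the failure mode rather than a description of any container.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a session that replicates only attributes passed to setAttribute. */
final class DirtyTrackingSession {
    private final Map<String, Object> attributes = new HashMap<>();
    private final Set<String> dirty = new HashSet<>();

    Object getAttribute(String name) {
        return attributes.get(name);
    }

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty.add(name);  // the only place a change is ever recorded
    }

    /** Returns the attribute names that would be replicated now, then clears the set. */
    Set<String> drainDirty() {
        Set<String> result = new HashSet<>(dirty);
        dirty.clear();
        return result;
    }

    public static void main(String[] args) {
        DirtyTrackingSession session = new DirtyTrackingSession();
        session.setAttribute("cart", new ArrayList<String>());
        session.drainDirty();  // pretend the initial state has been replicated

        @SuppressWarnings("unchecked")
        List<String> cart = (List<String>) session.getAttribute("cart");
        cart.add("book");                          // in-place change, invisible to the session
        System.out.println(session.drainDirty());  // prints []

        session.setAttribute("cart", cart);        // signals that the attribute changed
        System.out.println(session.drainDirty());  // prints [cart]
    }
}
```

The fix in application code is simply to call setAttribute() again after modifying an object obtained from the session, even though the reference stored in the session has not changed.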





Use Fine-Grained Session Attributes


For containers that support minimal updates, you can reduce the cost of session replication by putting multiple fine-grained objects in the session instead of one big one. That way, changes to rapidly changing data do not force the container to serialize and propagate the slow-changing data as well.
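
A rough way to see the difference is to compare how many bytes would be reserialized when a small, fast-changing value is updated. The sketch below assumes a container that reserializes whole attributes; the class AttributeGranularityDemo and the "history" and "pageViews" attributes are hypothetical, and actual byte counts depend on the data and the JVM.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;

/** Hypothetical comparison of the bytes resent when a small value changes, coarse vs. fine-grained. */
final class AttributeGranularityDemo {
    private AttributeGranularityDemo() {}

    /** Size in bytes of the value under standard Java serialization. */
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    /** Bytes resent if the counter lives in one big attribute together with the history. */
    static int coarseUpdateSize(String[] history, int pageViews) {
        HashMap<String, Serializable> userState = new HashMap<>();
        userState.put("history", history);      // slow-changing, large
        userState.put("pageViews", pageViews);  // fast-changing, small
        return serializedSize(userState);
    }

    /** Bytes resent if the counter is stored as its own attribute. */
    static int fineUpdateSize(int pageViews) {
        return serializedSize(Integer.valueOf(pageViews));
    }
}
```

Because the coarse attribute always carries the history along with the counter, coarseUpdateSize() grows with the history while fineUpdateSize() stays constant.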





Invalidate When You're Done


If you know that the user is finished with the session (for example, the user has chosen to log off), make sure to call HttpSession.invalidate(). Otherwise, the session will stay around until it expires, consuming memory, possibly for a long time (depending on the session timeout). Many servlet containers limit the amount of memory that can be used across all sessions, and when this limit is reached, the least recently used session is serialized and written to disk. If you know the user is done with the session, you can save the container that effort by invalidating it.





Keep the Session Clean


If there are large items in the session that are used for only part of the session, remove them when they are no longer needed. Removing them reduces the cost of session replication. (This practice is similar to using explicit nulling to help the garbage collector, which longtime readers know I generally do not recommend. But in this case, because of replication, the cost of keeping garbage in the session is so much higher that it is worth helping the container in this way.)





Concluding remarks


Through HttpSession replication, the servlet container can do much of the heavy lifting for you in building a replicated, highly available Web application. However, replication has a number of configuration options, which differ from container to container, and the choice of replication strategy affects the fault tolerance, performance, and scalability of the application. The choice of replication strategy should not be an afterthought; you should consider it while building your Web application. And be sure not to forget to load test to determine the scalability of your application, before your customers do it for you.







