Java Theory and Practice: Web Layer State Replication

Source: Internet
Author: User
Tags jboss

Whether you are building a J2EE or J2SE server application, you are probably using Java servlets in some way, either directly, through a presentation layer such as JSP, Velocity, or WebMacro, or through a servlet-based web services implementation such as Axis or Glue. One of the most important functions provided by the Servlet API is session management: the HttpSession interface handles the authentication, expiration, and maintenance of per-user state.

Session State

Almost every web application has some session state. It may be as simple as remembering whether you are logged in, or it may be a more detailed history of your session, such as the contents of a shopping cart, cached results of previous queries, or the complete response history for a 20-page dynamic questionnaire. Because HTTP itself is stateless, the session state must be stored somewhere and associated with the browsing session in a way that makes it easy to retrieve the next time a page from the same web application is requested. Fortunately, J2EE provides several ways to manage session state: it can be stored in the data layer, in the web layer using the Servlet API's HttpSession interface, in the Enterprise JavaBeans (EJB) layer using stateful session beans, or even in the client layer using cookies or hidden form fields. Unfortunately, careless session state management can cause serious performance problems.

If your application can store user state in HttpSession, that is generally better than the alternatives. Storing session state on the client in HTTP cookies or hidden form fields is a significant security risk, because it exposes part of the application's internals to the untrusted client layer. (One early e-commerce site stored shopping cart contents, including prices, in hidden form fields, which allowed anyone who understood HTML and HTTP to buy any item for $0.01. Oops.) In addition, using cookies or hidden form fields is messy, error-prone, and fragile (if users disable cookies in their browsers, a cookie-based approach will not work at all).

The other options for storing server-side state in a J2EE application are stateful session beans and the database. Although stateful session beans offer more flexibility in session state management, it is still advantageous to store session state in the web layer when possible. If the business objects are stateless, you can often scale the application simply by adding more web servers, rather than adding both web servers and EJB containers, which is generally cheaper and easier. Another advantage of storing session state in HttpSession is that the Servlet API provides an easy way to be notified when a session expires. Storing session state in the database can be prohibitively expensive.

The servlet specification does not require a servlet container to perform any kind of session replication or persistence. However, it does suggest that state replication is a primary reason for being (raison d'être) of servlets, and it places some requirements on containers that do replicate sessions. Session replication can provide many benefits: load balancing, scalability, fault tolerance, and high availability. Accordingly, most servlet containers support some form of HttpSession replication, but the mechanism, configuration, and timing of replication are implementation-dependent.

HttpSession API

Simply put, the HttpSession interface provides a handful of methods that servlets, JSP pages, and other presentation-layer components can use to maintain session information across multiple HTTP requests. A session is bound to a specific user but is shared among all servlets in the web application, not tied to any one servlet. A useful way to think about a session is as a Map that stores objects for the duration of the session: you store session attributes by name with setAttribute() and retrieve them with getAttribute(). The HttpSession interface also contains session lifecycle methods, such as invalidate(), which tells the container to discard the session. Listing 1 shows the most commonly used elements of the HttpSession interface:

Listing 1. HttpSession API

    public interface HttpSession {
        Object getAttribute(String s);
        Enumeration getAttributeNames();
        void setAttribute(String s, Object o);
        void removeAttribute(String s);
        boolean isNew();
        void invalidate();
        void setMaxInactiveInterval(int i);
        int getMaxInactiveInterval();
        ...
    }

In theory, session state could be replicated fully and consistently across the cluster, so that any node could serve any request and a simple load balancer could distribute requests round-robin, skipping failed hosts. However, such tight replication carries a high performance cost, is hard to implement, and runs into scalability problems once the cluster grows beyond a certain size.

A more common approach is to combine load balancing with session affinity. The load balancer associates each session with a connection and sends all requests in that session to the same server. Many hardware and software load balancers support this feature, which means that the replicated session information is accessed only when the session's primary host fails and the session must fail over to another server.
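The affinity idea can be sketched in a few lines. The example below is only an illustration under stated assumptions: StickyRouter, the server names, and the hash-based choice of primary are all hypothetical, and real load balancers typically track affinity through a cookie or a session table instead. It shows the two properties that matter here: a session keeps going to the same server while that server is up, and it moves to another server only on failure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sticky-session router; not any real load balancer's API. */
class StickyRouter {
    private final List<String> servers;

    StickyRouter(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    /**
     * Returns the server for a session: the same primary every time while it
     * is up, otherwise the next live server in list order.
     */
    String route(String sessionId, Set<String> downServers) {
        int n = servers.size();
        int primary = Math.floorMod(sessionId.hashCode(), n);
        for (int i = 0; i < n; i++) {
            String candidate = servers.get((primary + i) % n);
            if (!downServers.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no live servers");
    }

    public static void main(String[] args) {
        StickyRouter router = new StickyRouter(List.of("web1", "web2", "web3"));
        String home = router.route("session-abc", Set.of());
        System.out.println("primary:  " + home);
        // Only now would the replicated copy of the session be needed.
        System.out.println("failover: " + router.route("session-abc", Set.of(home)));
    }
}
```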

Replication Methods

Replication offers several potential benefits, including availability, fault tolerance, and scalability. There are also many ways to replicate sessions; the choice depends on the size of the application cluster, the goals of replication, and the replication facilities the servlet container supports. Replication has performance costs, including CPU cycles (to serialize the objects stored in the session), network bandwidth (to broadcast updates), and, in disk-based schemes, the cost of writing to disk or to a database.

Almost all servlet containers perform HttpSession replication by serializing the objects stored in the HttpSession, so if you are building a distributed application, make sure that you put only serializable objects into the session. (Some containers have special handling for J2EE object types such as EJB references, transaction contexts, and other non-serializable objects.)
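One way to catch non-serializable attributes early is to try serializing them in a test. The sketch below is an assumption-laden illustration, not a container feature: SessionSerializationCheck is a hypothetical helper, and a plain Map stands in for the session's attributes. It uses standard Java serialization, so it also catches non-serializable objects nested inside an otherwise serializable attribute.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: finds session attributes that could not be replicated. */
class SessionSerializationCheck {

    /** Returns the names of attributes whose values fail Java serialization. */
    static List<String> findUnserializable(Map<String, Object> attributes) {
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, Object> e : attributes.entrySet()) {
            try (ObjectOutputStream out =
                     new ObjectOutputStream(new ByteArrayOutputStream())) {
                out.writeObject(e.getValue());
            } catch (IOException ex) {
                // NotSerializableException is a subclass of IOException.
                bad.add(e.getKey());
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        Map<String, Object> attrs = new LinkedHashMap<>();
        attrs.put("userName", "alice");    // String is Serializable
        attrs.put("worker", new Thread()); // Thread is not Serializable
        System.out.println(findUnserializable(attrs)); // [worker]
    }
}
```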

JDBC-based Replication

One approach to session replication is to serialize the session contents and write them to a database. This approach is fairly straightforward, and it has the advantage that sessions can not only fail over to other hosts but can also survive the failure of the entire cluster. The disadvantage of database-based replication is its performance cost: database transactions are expensive. Although the approach scales well in the web layer, it can cause scaling problems in the data layer. If the cluster grows large enough, scaling the data layer to accommodate session data may be difficult or prohibitively expensive.

File-based Replication

File-based replication is similar to storing serialized sessions in a database, but it uses a shared file server rather than a database to hold the session data. This approach generally costs less than a database (in hardware, software licenses, and computation), but at the expense of reliability, since a database can offer stronger durability guarantees than a file system.
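As a rough picture of what file-based persistence involves, the following sketch writes a session's attributes to a file with Java serialization and reads them back, as a second node might after a failover. FileSessionStore, the .ser file naming, and the use of a temporary directory in place of a shared file server are all assumptions made for illustration; a real container's on-disk format and locking will differ.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

/** Hypothetical file-backed session store; a real container's format will differ. */
class FileSessionStore {
    private final Path directory; // would be a shared file server mount in a cluster

    FileSessionStore(Path directory) {
        this.directory = directory;
    }

    /** Convenience for experiments: a store in a fresh temporary directory. */
    static FileSessionStore inTempDirectory() {
        try {
            return new FileSessionStore(Files.createTempDirectory("sessions"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Serializes the session's attributes to a file named after the session ID. */
    void save(String sessionId, HashMap<String, Serializable> attributes) {
        Path file = directory.resolve(sessionId + ".ser");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(attributes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the attributes back, as another node would after a failover. */
    @SuppressWarnings("unchecked")
    HashMap<String, Serializable> load(String sessionId) {
        Path file = directory.resolve(sessionId + ".ser");
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (HashMap<String, Serializable>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Every save is a full serialization plus a file write, which is the cost the text above weighs against the cost of a database transaction.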

Memory-based Replication

Another approach is to share copies of the serialized session data with one or more other servers in the cluster. Replicating all sessions to all hosts provides the greatest availability and makes load balancing easiest, but it ultimately limits the size of the cluster, because the replication messages consume memory on every node and network bandwidth. Some application servers support memory-based replication to "buddy" nodes, where each session exists on a primary server and one (or more) backup servers. This scheme scales better than replicating all sessions to all servers, but it complicates the load balancer's job when a session must fail over to another server, because the load balancer has to figure out which other server (or servers) holds that session.
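The buddy scheme can be modeled in a single process to make the trade-off concrete. In this sketch, BuddyCluster, the node names, and the rule that the buddy is the next node after the primary are all hypothetical, and objects are shared by reference where a real container would send serialized bytes over the network. It assumes a cluster of at least two nodes.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-process model of "buddy" replication; node names are made up. */
class BuddyCluster {
    private final List<String> nodes;
    // node name -> (session id -> session data) held in that node's memory
    private final Map<String, Map<String, Serializable>> memory = new HashMap<>();

    BuddyCluster(List<String> nodes) {
        this.nodes = List.copyOf(nodes);
        for (String n : nodes) {
            memory.put(n, new HashMap<>());
        }
    }

    String primaryFor(String sessionId) {
        return nodes.get(Math.floorMod(sessionId.hashCode(), nodes.size()));
    }

    /** The buddy is simply the next node in the ring after the primary. */
    String buddyFor(String sessionId) {
        int p = nodes.indexOf(primaryFor(sessionId));
        return nodes.get((p + 1) % nodes.size());
    }

    /** Stores the session on its primary and its buddy only, not on every node. */
    void store(String sessionId, Serializable data) {
        memory.get(primaryFor(sessionId)).put(sessionId, data);
        memory.get(buddyFor(sessionId)).put(sessionId, data);
    }

    /** Simulates a crash: everything in that node's memory is lost. */
    void crash(String node) {
        memory.get(node).clear();
    }

    /** Failover lookup: the router must know which node holds the backup. */
    Serializable readAfterFailover(String sessionId) {
        return memory.get(buddyFor(sessionId)).get(sessionId);
    }

    /** Number of nodes currently holding a copy of the session. */
    long copies(String sessionId) {
        return memory.values().stream().filter(m -> m.containsKey(sessionId)).count();
    }
}
```

With four nodes, a stored session occupies memory on two of them instead of all four, but readAfterFailover works only because the caller can compute buddyFor, which is the extra knowledge the load balancer needs.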

Timing considerations

In addition to deciding how to store replicated session data, there is the question of when to replicate it. The most reliable, but also most expensive, approach is to replicate the data every time it changes (for example, at the end of every servlet invocation). A less expensive approach, which carries some risk of data loss in the event of a failure, is to replicate the data no more often than every N seconds.

Related to the timing question is whether to replicate the entire session or only the attributes that have changed, which may be far less data. All of these choices involve trade-offs between reliability and performance. Servlet developers should recognize that session state may be stale (a replica from several requests ago) after a failover and should be prepared to handle session contents that are not up to date. For example, if step 1 of an interview sets a session attribute, and the user's request is then failed over to a system whose replica of the session state is two requests old, the servlet code for step 2 should be prepared not to find that attribute in the session and should take appropriate action, such as redirecting, rather than assuming the attribute is there and throwing a NullPointerException.
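A defensive check of this kind might look like the sketch below. To keep it runnable without a servlet container, a Map stands in for the HttpSession, and the class name, attribute name, and paths are hypothetical. In a real servlet the same null check would guard the result of getAttribute() before a redirect.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical interview flow; a Map stands in for the HttpSession. */
class InterviewFlow {

    /** Decides where step 2 should send the user, tolerating a stale session. */
    static String handleStepTwo(Map<String, Object> session) {
        Object answers = session.get("step1Answers"); // hypothetical attribute name
        if (answers == null) {
            // The replica may predate step 1: recover instead of dereferencing null.
            return "redirect:/interview/step1";
        }
        return "render:/interview/step2";
    }

    public static void main(String[] args) {
        Map<String, Object> fresh = new HashMap<>();
        fresh.put("step1Answers", "yes,no,yes");
        System.out.println(handleStepTwo(fresh));           // render:/interview/step2
        System.out.println(handleStepTwo(new HashMap<>())); // redirect:/interview/step1
    }
}
```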

Container support

Servlet containers differ in their HttpSession replication options and in how those options are configured. IBM WebSphere offers the widest range of replication options: in-memory or database-based replication, end-of-servlet or time-based replication timing, and propagation of the entire session snapshot or only the changed attributes. Its memory-based replication is built on JMS publish-subscribe and can replicate to all clones, to a single "buddy" replica, or to a dedicated replication server.

WebLogic also offers a range of options, including in-memory (using a single buddy replica), file-based, and database-based replication. JBoss, when used with the Tomcat or Jetty servlet container, performs memory-based replication with a choice of end-of-servlet or time-based replication timing and, in JBoss 3.2 or later, the option of replicating only the changed attributes rather than a full snapshot. Tomcat 5.0 provides memory-based replication to all cluster nodes. In addition, projects such as WADI can add session replication to servlet containers such as Tomcat or Jetty by using the servlet filter mechanism.

Improving the Performance of Distributed Web Applications

Whatever mechanism you use for session replication, there are several ways to improve the performance and scalability of your web applications. First, remember that to get the benefits of session replication, you must mark the web application as distributable in the deployment descriptor and make sure that everything in the session is serializable.

Keep the session minimal

Because the cost of replicating a session grows with the size of the object graph stored in it, you should try to put as little data in the session as is reasonable. Doing so reduces the serialization overhead, network bandwidth requirements, and disk requirements of replication. In particular, it is generally a bad idea to store shared objects in the session, because they will have to be replicated into every session they belong to.

Do not bypass setAttribute

When you change session attributes, be aware that even if the servlet container tries to make only minimal updates (propagating only the changed attributes), it may not notice a changed attribute unless you call setAttribute(). (Imagine that the session holds a Vector representing the items in a shopping cart. If you call getAttribute() to retrieve the Vector and then add something to it without calling setAttribute() again, the container may not realize that the Vector has changed.)
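The reason a container can miss such a change is easy to see in a toy model. The class below is a hypothetical stand-in, not a real container's implementation: it assumes the container records a change only inside setAttribute(), which is one plausible way to implement minimal updates. An ArrayList plays the role of the shopping cart Vector.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical session that replicates only attributes passed to setAttribute. */
class DirtyTrackingSession {
    private final Map<String, Object> attributes = new HashMap<>();
    private final Set<String> dirty = new HashSet<>();

    Object getAttribute(String name) {
        return attributes.get(name);
    }

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty.add(name); // the only place a change is recorded
    }

    /** Returns the names that would be replicated now, then resets the set. */
    Set<String> drainDirty() {
        Set<String> result = new HashSet<>(dirty);
        dirty.clear();
        return result;
    }

    public static void main(String[] args) {
        DirtyTrackingSession session = new DirtyTrackingSession();
        session.setAttribute("cart", new ArrayList<String>());
        session.drainDirty(); // initial replication

        @SuppressWarnings("unchecked")
        List<String> cart = (List<String>) session.getAttribute("cart");
        cart.add("book");                         // mutation the container cannot see
        System.out.println(session.drainDirty()); // []

        session.setAttribute("cart", cart);       // tell the container it changed
        System.out.println(session.drainDirty()); // [cart]
    }
}
```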

Use fine-grained session attributes

For containers that support minimal updates, you can lower the cost of session replication by putting several fine-grained objects in the session rather than one large monolithic object. That way, changes to rapidly changing data do not force the container to serialize and propagate the slowly changing data as well.
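The size difference can be estimated by serializing the two layouts. This is a rough sketch under stated assumptions: AttributeGranularity, the attribute names, and the 50,000-byte profile are invented for illustration, and the byte counts from plain Java serialization only approximate what a given container would send.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;

/** Hypothetical comparison of replication payloads; exact sizes may vary. */
class AttributeGranularity {

    /** Number of bytes Java serialization produces for one attribute value. */
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        byte[] profile = new byte[50_000]; // slowly changing data
        Integer pageViews = 42;            // rapidly changing data

        // Coarse: one attribute holds everything, so any change resends all of it.
        HashMap<String, Serializable> userState = new HashMap<>();
        userState.put("profile", profile);
        userState.put("pageViews", pageViews);
        System.out.println("coarse update: " + serializedSize(userState) + " bytes");

        // Fine: pageViews is its own attribute, so only it is resent.
        System.out.println("fine update:   " + serializedSize(pageViews) + " bytes");
    }
}
```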

Invalidate when you are done

If you know that the user is finished with the session (for example, the user has chosen to log out), be sure to call HttpSession.invalidate(). Otherwise, the session persists until it expires, consuming memory for what may be a long time (depending on the session timeout). Many servlet containers limit the amount of memory that can be used across all sessions; when that limit is reached, the least recently used sessions are serialized and written to disk. If you know the user is finished with the session, save the container that work and invalidate it.

Keep sessions clean

If there are large items in the session that are used only for part of the session, remove them when they are no longer needed. Removing them reduces the cost of session replication. (This practice is similar to using explicit nulling to help the garbage collector. Regular readers know that I generally do not recommend explicit nulling, but in this case, because of replication, the cost of keeping garbage in the session is much higher, so it is worth helping the container in this way.)

Conclusion

With HttpSession replication, servlet containers can take on much of the burden of building replicated, highly available web applications. However, replication has a number of configuration options, and every container is different. The choice of replication strategy affects the fault tolerance, performance, and scalability of the application. That choice should not be an afterthought; you should consider it while building the web application. And do not forget to load test to determine the application's scalability before your customers do it for you.
