E-commerce project VI: distributed session

Source: Internet
Author: User
Tags failover object serialization
@Zheng Yu

Summary: This solution is similar to the one used for distributed caching under high concurrency and high availability.

I. Illustration:

II. Problems to be solved in distributed sessions under high concurrency:
    • Transparent handling of storage-node failover
    • Dynamic addition and removal of nodes with minimal cache thrashing
    • Balanced distribution of data across nodes
    • Session serialization and deserialization
III. A "basically available" distributed session scheme

BASE is the strategy proposed by Eric A. Brewer in the late 1990s: Basically Available, Soft state, and Eventually consistent. Most internet applications emphasize availability; that is, they sacrifice strong consistency to gain availability or reliability.
Definition of basically available: when part of a distributed system fails, some content is allowed to become unavailable while the rest remains available. For example, suppose a data storage system consists of five nodes holding equal shares of the data. If one node fails, only 20% of the data becomes inaccessible and the other 80% is still available, so the system can be called basically available.

A memcache-based distributed session solution that uses the hash modulo algorithm (hash() mod N, where hash() takes the user ID and N is the number of nodes) is only basically available:
    1. If a node fails, all user sessions on that node are lost, and the system cannot recover them by itself.
    2. If load suddenly rises and machine nodes must be added temporarily, the hash modulo algorithm remaps most keys, so a large share of lookups miss even though the data still exists on the previous nodes. The result is widespread cache penetration, and the pressure goes straight to the database.
    3. Because memcache evicts entries with an LRU policy, stored key/value pairs may be evicted and user sessions lost.
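The second problem can be made concrete with a small Python sketch. It counts how many user IDs change node when a cluster grows from five to six nodes, first with hash modulo and then with a minimal consistent-hash ring (the improvement discussed in the next section). The helper names (`stable_hash`, `HashRing`, `moved_fraction`), the node names, and the choice of 100 virtual nodes per machine are assumptions made for illustration; none of them come from memcache or any client library.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


def modulo_node(user_id: str, n: int) -> int:
    """hash(user_id) mod N, as in the scheme described above."""
    return stable_hash(user_id) % n


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        # Each machine is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[idx][1]


def moved_fraction(keys, before, after) -> float:
    """Share of keys whose node assignment differs between two mappings."""
    return sum(1 for k in keys if before(k) != after(k)) / len(keys)


user_ids = [f"user-{i}" for i in range(10_000)]

# Hash modulo: grow from 5 to 6 nodes.
modulo_moved = moved_fraction(
    user_ids, lambda k: modulo_node(k, 5), lambda k: modulo_node(k, 6)
)

# Consistent hashing: the same growth, one extra machine on the ring.
ring5 = HashRing([f"node-{i}" for i in range(5)])
ring6 = HashRing([f"node-{i}" for i in range(6)])
ring_moved = moved_fraction(user_ids, ring5.node_for, ring6.node_for)

print(f"hash modulo:     {modulo_moved:.0%} of keys moved")
print(f"consistent hash: {ring_moved:.0%} of keys moved")
```

With a uniform hash, a key keeps its node under modulo only when hash mod 5 equals hash mod 6, which holds for about one key in six, so most sessions miss after the resize. On the ring, only the keys that now fall to the new node move.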
The improvement over hash modulo is as follows.

IV. Memcache solution based on a consistent hash algorithm

    1) Consistent hashing minimizes the amount of cached data that must be rebuilt when the number of machine nodes changes.
    2) It also helps distribute session data evenly across nodes.
    3) When a machine node goes down, the data on it is still inevitably lost, and because the number of nodes changes, some data that was not lost may have to be rebuilt as well.

However, none of the solutions above answers this question: after a node fails, how can other nodes obtain the sessions it stored, take over for it, and give the cluster fault tolerance (failover)? The following concepts are relevant here:

V. Sticky sessions, non-sticky sessions, and replicated sessions
    • Sticky sessions: requests in the same session must be forwarded to the same node, and are forwarded to a failover node only when that node goes down. In other words, the user is "glued" to one server node. When the node goes down, the sessions stored on it are completely lost.
    • Non-sticky sessions: each request may be forwarded to a different node.
    • Replicated sessions: sessions on one node are copied to other nodes in the cluster to prevent data loss and allow seamless failover. For example, node 0 is copied to node 5, node 1 to node 6, and so on. Most application servers (such as Tomcat) support session replication.
When the number of users and the size of the cluster reach a certain scale, session replication may become a performance bottleneck. This led to the idea of recovering a failed node's data from a third-party cache. The open-source product memcached-session-manager (hereinafter MSM) is based on this idea.

VI. How MSM works

MSM supports Tomcat 6 and 7 and mainly addresses Tomcat high availability. Its features:
    • Supports sticky sessions and non-sticky sessions
    • No single point of failure (SPOF)
    • Able to handle Tomcat failover
    • Able to handle memcache failover
    • Pluggable session serialization
    • Allows asynchronous session storage to improve response time
    • A session is sent to memcache only when it has been modified
6.1. Working principle in sticky session mode

The local Tomcat session is the primary session, and the session in memcache is the backup session.
Step 1: Install MSM on all Tomcat nodes. Each Tomcat node has its own local sessions.
Step 2: After a request has been executed, if the corresponding session did not previously exist locally (that is, this is the user's first request), the session is copied to memcache.
Step 3: When the next request for this session arrives, the local Tomcat session is used. After request processing is complete, the session changes are synchronously written to memcache to keep the data consistent.
Step 4: If the current Tomcat node fails, the next request is routed to another Tomcat node. That node finds that the session for the request does not exist locally, so it queries memcache. If the session is found there, it is restored as a local session.
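The four steps can be sketched as a toy simulation in Python. This is not MSM code: `TomcatNode`, its `handle` method, and the plain dict standing in for memcache are hypothetical names chosen only to show the primary/backup relationship between the local session and the memcache copy.

```python
class TomcatNode:
    """Toy stand-in for a Tomcat node in sticky mode (not the MSM implementation)."""

    def __init__(self, name, memcache):
        self.name = name
        self.local_sessions = {}   # primary copies, held in this node's memory
        self.memcache = memcache   # backup copies, shared by all nodes

    def handle(self, session_id, updates=None):
        updates = updates or {}
        session = self.local_sessions.get(session_id)
        is_new_here = session is None
        if is_new_here:
            # Step 4: unknown locally, so try to restore the backup from memcache.
            # If there is no backup either, this is a first request (step 2).
            backup = self.memcache.get(session_id)
            session = dict(backup) if backup is not None else {}
            self.local_sessions[session_id] = session
        # Step 3: the request works on the local (primary) session.
        session.update(updates)
        # Steps 2 and 3: back up after the request, but only when the session
        # is new to this node or was modified.
        if is_new_here or updates:
            self.memcache[session_id] = dict(session)
        return session


memcache = {}
tomcat_a = TomcatNode("tomcat-a", memcache)
tomcat_b = TomcatNode("tomcat-b", memcache)

tomcat_a.handle("s1", {"user": "alice"})   # first request creates the session
tomcat_a.handle("s1", {"cart_items": 2})   # later request served from the local copy

# Simulate tomcat-a failing: the next request for "s1" is routed to tomcat-b.
del tomcat_a
recovered = tomcat_b.handle("s1")
print(recovered)
```

After the simulated failure, `tomcat_b` has never seen session `s1`, yet it rebuilds it from the backup copy, which is the failover behavior described in step 4.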
With step 4, fault tolerance is complete.

6.2. Working principle in non-sticky session mode

The local Tomcat session is a transit session, memcache 1 holds the primary session, and memcache 2 holds the backup session.
Step 1: When a request is received, load the backup session into the local container; if the backup session fails to load, load it from the primary session.
Step 2: After request processing is complete, synchronously write the session changes to memcache 1 and memcache 2, and clear the local Tomcat session.
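A minimal Python sketch of these two steps follows, using the load order given in the text (backup first, then primary). The function name `handle_request` and the two dicts standing in for memcache 1 and memcache 2 are assumptions for illustration, not part of MSM.

```python
def handle_request(session_id, updates, primary, backup):
    """Serve one request in non-sticky mode using two memcache stand-ins."""
    # Step 1: load the backup copy into a transit session; fall back to the primary.
    stored = backup.get(session_id)
    if stored is None:
        stored = primary.get(session_id)
    transit = dict(stored) if stored is not None else {}

    # The request itself works on the transit copy only.
    transit.update(updates)

    # Step 2: write the result to both memcache nodes. The transit copy is not
    # kept by the node; it is simply returned to the caller here.
    primary[session_id] = dict(transit)
    backup[session_id] = dict(transit)
    return transit


memcache_1 = {}   # primary session store
memcache_2 = {}   # backup session store

handle_request("s1", {"user": "alice"}, memcache_1, memcache_2)

# Simulate losing the backup node's contents, then serve another request.
memcache_2.clear()
after_loss = handle_request("s1", {"cart_items": 1}, memcache_1, memcache_2)
print(after_loss)
```

Because no node keeps the session between requests, any Tomcat node can serve the next request, and losing one of the two memcache copies does not lose the session.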
Session data must be serialized and deserialized to be stored in memcache.

VII. Kryo-based serialization scheme

Every serialization strategy must provide the following features:
    • Serialization: can handle circular references.
    • Serialization/deserialization: supports references to shared objects.
    • Deserialization: supports private classes.
    • Deserialization: supports classes without a default constructor.
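The first two requirements can be illustrated with a short Python sketch. Python's standard `pickle` module is used here purely as a stand-in for a Java serializer such as Kryo; the session contents are made up. The point is what "handles circular references" and "supports shared objects" mean after a round trip: the cycle must still be a cycle, and two references to one object must still point to one object rather than to two copies.

```python
import pickle

# Circular reference: the cart lists its item, and the item points back to the cart.
cart = {"items": []}
item = {"name": "book", "cart": cart}
cart["items"].append(item)

# Shared object: billing and shipping refer to the same address object.
address = {"city": "Example City"}
session = {"cart": cart, "billing": address, "shipping": address}

# Round trip through the serializer, as a session store would do.
restored = pickle.loads(pickle.dumps(session))

# The cycle survives: following item -> cart leads back to the same cart object.
cycle_preserved = restored["cart"]["items"][0]["cart"] is restored["cart"]

# Sharing survives: both keys still refer to a single address object.
sharing_preserved = restored["billing"] is restored["shipping"]

print(cycle_preserved, sharing_preserved)
```

A serializer that lacked the first property would recurse without end on the cart, and one that lacked the second would silently split the shared address into two independent copies.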
The following table is from the MSM wiki, shown here with one entry per serialization strategy:

Java serialization (default, bundled with MSM)
    • Value of the transcoderFactoryClass attribute: de.javakaffee.web.msm.JavaSerializationTranscoderFactory
    • Requires java.io.Serializable: Yes
    • Cyclic dependencies: Yes
    • Shared objects: Yes
    • Private classes: Yes
    • Classes without default constructor: Yes
    • Different class versions: No (though if the serialVersionUID is set to 1L, classes can be deserialized even if the new class version has new fields)
    • Copy collections before serialization: No
    • Custom converter: No

msm-kryo-serializer
    • Value of the transcoderFactoryClass attribute: de.javakaffee.web.msm.serializer.kryo.KryoTranscoderFactory
    • Requires java.io.Serializable: No
    • Cyclic dependencies: Yes
    • Shared objects: Yes
    • Private classes: Yes (for Sun JVMs)
    • Classes without default constructor: Yes (for Sun JVMs)
    • Different class versions: No (not yet)
    • Copy collections before serialization: Yes
    • Custom converter: Yes (converter must extend KryoCustomization, SerializerFactory or UnregisteredClassHandler)
    • Comment: Reflection based; Kryo is used for binary serialization/deserialization

msm-javolution-serializer
    • Value of the transcoderFactoryClass attribute: de.javakaffee.web.msm.serializer.javolution.JavolutionTranscoderFactory
    • Requires java.io.Serializable: No
    • Cyclic dependencies: Yes
    • Shared objects: Yes
    • Private classes: Yes (for Sun JVMs)
    • Classes without default constructor: Yes (for Sun JVMs)
    • Different class versions: Yes (during deserialization, fields that do not exist in a class are ignored)
    • Copy collections before serialization: Yes
    • Custom converter: Yes (converter must extend CustomXMLFormat)
    • Comment: Reflection based; Javolution is used for the actual XML encoding/decoding and also does the object reference handling
The author's point of view is:
    1. Java serialization is a very robust and widely proven technology, but IMHO it handles class versioning poorly: how does a new class version deserialize streams written by an old version (backward compatibility), and how does an old version deserialize streams written by a new version (forward compatibility)? To verify compatibility, the number of tests grows with the square of the number of versions.
    2. Kryo is a very fast binary serialization library. In performance benchmarks alongside Thrift and protobuf, Kryo is also among the fastest serialization libraries. He recommends Kryo for its exceptional performance.
VIII. Distributed session solution based on a ZooKeeper cluster

To solve the data loss problem of the memcache-based solutions, you can introduce ZooKeeper (ZK) as a persistent storage medium. Building on ZK's consistent replication (which ensures strong data consistency among multiple replicas) and its fault tolerance, and combining them with the MSM ideas above, ZK is responsible for storing session data while our own session manager is responsible for session lifecycle management.

IX. Solutions provided by Microsoft

ASP.NET has its own distributed session solution: the session state server, enabled by setting the mode of sessionState in Web.config to "StateServer". A state server can be specified in Web.config. You can also implement the System.Web.IPartitionResolver interface, which determines how the connection string for the session state server is constructed, to support a group of state servers; the resolver is configured through the partitionResolverType attribute of sessionState.

Disadvantage of Microsoft's solution: serializing and deserializing the objects in session state becomes one of the main performance costs, so it is best to store all session state data as basic types.
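The partitioning idea behind IPartitionResolver can be sketched outside ASP.NET. The Python below is not ASP.NET code and does not reproduce the .NET interface; it only shows one plausible way a session ID could be mapped to one connection string out of a group of state servers. The host names and the function name `resolve_partition` are hypothetical. The `tcpip=host:port` form follows the format of the stateConnectionString setting, and 42424 is the default port of the ASP.NET state service.

```python
import zlib

# Hypothetical group of state servers, written as state connection strings.
STATE_SERVERS = [
    "tcpip=state-a.example.internal:42424",
    "tcpip=state-b.example.internal:42424",
    "tcpip=state-c.example.internal:42424",
]


def resolve_partition(session_id: str) -> str:
    """Pick the state server for a session ID with a stable hash."""
    # crc32 is deterministic across processes, so every web server in the
    # farm maps the same session ID to the same state server.
    index = zlib.crc32(session_id.encode("utf-8")) % len(STATE_SERVERS)
    return STATE_SERVERS[index]


connection_string = resolve_partition("session-123")
print(connection_string)
```

This sketch uses plain hash modulo, so it shares the resizing weakness discussed in section III: adding a state server remaps most session IDs.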
Reference resources:
1) Open source code, http://code.google.com/p/memcached-session-manager
2) jacktan, 2011, ZooKeeper-based distributed session implementation
3) Maarten Balliauw, 2008, ASP.NET session state partitioning
4) timyang, 2009, Some issues in practicing consistent hashing in a distributed application
5) Build an efficient distributed session service with no SPOF
7) developerWorks, 2010, Five things you don't know about Java object serialization