Common Storm patterns: TimeCacheMap

Source: Internet
Author: User

Storm uses a data structure called TimeCacheMap to keep recently active objects in memory. It is efficient and automatically removes entries that have expired because they were not updated in time.

TimeCacheMap splits its data across several buckets so that expiring old entries only requires dropping a whole bucket. This keeps the time the lock is held short, which leaves the map available for concurrent reads and writes. Next, let's look at how TimeCacheMap is implemented internally.

1. Implementation Principle

Bucket linked list: each element of the list is a HashMap that stores key/value pairs.

 
    private LinkedList<HashMap<K, V>> _buckets;

Lock object: guards the get/put operations on the TimeCacheMap so that each one is atomic.

 
    private final Object _lock = new Object();

Background cleanup thread: removes entries once they have timed out.

 
    private Thread _cleaner;

Timeout callback interface: called for each entry after it times out, so that the caller can do any further processing.

    public static interface ExpiredCallback<K, V> {
        public void expire(K key, V val);
    }

    private ExpiredCallback _callback;

With these fields in place, let's look at how the constructor works:

1. First, create the specified number of buckets and store them in a linked list. Each bucket is an empty HashMap.

2. Then, set up the cleanup thread, which repeats the following steps:

A) Sleep for expirationMillis / (numBuckets - 1) milliseconds, i.e. expirationSecs / (numBuckets - 1) seconds;

B) Acquire the lock on _lock and remove the last bucket from the _buckets linked list;

C) Add an empty HashMap bucket to the head of the _buckets linked list, then release the lock on _lock;

D) If a callback is set, invoke it for every entry in the removed bucket.

    public TimeCacheMap(int expirationSecs, int numBuckets, ExpiredCallback<K, V> callback) {
        if (numBuckets < 2) {
            throw new IllegalArgumentException("numBuckets must be >= 2");
        }
        _buckets = new LinkedList<HashMap<K, V>>();
        for (int i = 0; i < numBuckets; i++) {
            _buckets.add(new HashMap<K, V>());
        }

        _callback = callback;
        final long expirationMillis = expirationSecs * 1000L;
        final long sleepTime = expirationMillis / (numBuckets - 1);
        _cleaner = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        Map<K, V> dead = null;
                        Time.sleep(sleepTime);
                        // Hold the lock only for the O(1) bucket swap.
                        synchronized (_lock) {
                            dead = _buckets.removeLast();
                            _buckets.addFirst(new HashMap<K, V>());
                        }
                        // Callbacks run outside the lock.
                        if (_callback != null) {
                            for (Entry<K, V> entry : dead.entrySet()) {
                                _callback.expire(entry.getKey(), entry.getValue());
                            }
                        }
                    }
                } catch (InterruptedException ex) {
                    // Interrupted: let the cleanup thread exit.
                }
            }
        });
        _cleaner.setDaemon(true);
        _cleaner.start();
    }

The constructor takes three parameters: expirationSecs, the timeout in seconds; numBuckets, the number of buckets; and callback, the callback invoked when an entry times out.
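The rotation performed by the cleanup thread can be observed without any threading. The following is a minimal single-threaded sketch, not Storm's code: the class name RotatingBuckets and its rotate() method are hypothetical, rotate() stands in for one wake-up of _cleaner, and put is simplified so that it does not remove stale copies from older buckets.

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

// Hypothetical, single-threaded stand-in for TimeCacheMap's bucket rotation.
class RotatingBuckets<K, V> {
    private final LinkedList<HashMap<K, V>> buckets = new LinkedList<HashMap<K, V>>();

    RotatingBuckets(int numBuckets) {
        if (numBuckets < 2) {
            throw new IllegalArgumentException("numBuckets must be >= 2");
        }
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new HashMap<K, V>());
        }
    }

    // Simplified put: new entries always go into the head (newest) bucket.
    void put(K key, V value) {
        buckets.getFirst().put(key, value);
    }

    boolean containsKey(K key) {
        for (HashMap<K, V> bucket : buckets) {
            if (bucket.containsKey(key)) {
                return true;
            }
        }
        return false;
    }

    // One cleaner tick: drop the oldest bucket, add a fresh one at the head,
    // and return the dropped entries (what the callback would receive).
    Map<K, V> rotate() {
        Map<K, V> dead = buckets.removeLast();
        buckets.addFirst(new HashMap<K, V>());
        return dead;
    }

    public static void main(String[] args) {
        RotatingBuckets<String, Integer> cache = new RotatingBuckets<String, Integer>(3);
        cache.put("a", 1);
        System.out.println(cache.rotate().isEmpty()); // true: "a" moved to the middle bucket
        System.out.println(cache.rotate().isEmpty()); // true: "a" moved to the last bucket
        System.out.println(cache.rotate());           // {a=1}: "a" expires on the third tick
    }
}
```

With three buckets, an entry written just after a tick survives two further ticks and is dropped on the third, which is why the number of buckets controls how far past the nominal timeout an entry can live.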

For convenience, three more constructors are provided; choose whichever fits your needs:

    // This default ensures things expire at most 50% past the expiration time
    private static final int DEFAULT_NUM_BUCKETS = 3;

    public TimeCacheMap(int expirationSecs, ExpiredCallback<K, V> callback) {
        this(expirationSecs, DEFAULT_NUM_BUCKETS, callback);
    }

    public TimeCacheMap(int expirationSecs, int numBuckets) {
        this(expirationSecs, numBuckets, null);
    }

    public TimeCacheMap(int expirationSecs) {
        this(expirationSecs, DEFAULT_NUM_BUCKETS);
    }

2. Performance Analysis

Get operation: scans the buckets in order and returns the value from the first bucket that contains the key, or null if no bucket does. The time complexity is O(numBuckets).

    public V get(K key) {
        synchronized (_lock) {
            for (HashMap<K, V> bucket : _buckets) {
                if (bucket.containsKey(key)) {
                    return bucket.get(key);
                }
            }
            return null;
        }
    }

Put operation: puts the key/value pair in the first bucket of _buckets, then scans the other numBuckets - 1 buckets and removes any older record with the same key. The time complexity is O(numBuckets).

    public void put(K key, V value) {
        synchronized (_lock) {
            Iterator<HashMap<K, V>> it = _buckets.iterator();
            HashMap<K, V> bucket = it.next();
            bucket.put(key, value);
            while (it.hasNext()) {
                bucket = it.next();
                bucket.remove(key);
            }
        }
    }
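The effect of removing the key from the older buckets is that a put refreshes the entry: the key ends up only in the newest bucket, so its timeout starts over. The following is a small sketch of that behavior on a plain bucket list without locking; the class PutSketch and its static put helper are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedList;

// Hypothetical helper that mirrors the put logic on a plain bucket list (no locking).
class PutSketch {
    static <K, V> void put(LinkedList<HashMap<K, V>> buckets, K key, V value) {
        Iterator<HashMap<K, V>> it = buckets.iterator();
        it.next().put(key, value);   // write to the newest bucket
        while (it.hasNext()) {
            it.next().remove(key);   // drop stale copies from older buckets
        }
    }

    public static void main(String[] args) {
        LinkedList<HashMap<String, Integer>> buckets = new LinkedList<HashMap<String, Integer>>();
        for (int i = 0; i < 3; i++) {
            buckets.add(new HashMap<String, Integer>());
        }
        // Simulate an entry that has already aged into the last bucket.
        buckets.getLast().put("k", 1);

        put(buckets, "k", 2);
        System.out.println(buckets.getFirst().get("k"));        // 2
        System.out.println(buckets.getLast().containsKey("k")); // false
    }
}
```

Without the removal loop, the stale copy in the last bucket would be handed to the expiry callback on the next cleanup even though the key had just been updated.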

Remove operation: scans the buckets and deletes the record from the first bucket that contains the key, returning its value. The time complexity is O(numBuckets).

    public Object remove(K key) {
        synchronized (_lock) {
            for (HashMap<K, V> bucket : _buckets) {
                if (bucket.containsKey(key)) {
                    return bucket.remove(key);
                }
            }
            return null;
        }
    }

ContainsKey operation: scans the buckets and returns true if any bucket contains the key, otherwise false. The time complexity is O(numBuckets).

    public boolean containsKey(K key) {
        synchronized (_lock) {
            for (HashMap<K, V> bucket : _buckets) {
                if (bucket.containsKey(key)) {
                    return true;
                }
            }
            return false;
        }
    }

Size operation: scans the buckets and adds up the size of each bucket's HashMap. The time complexity is O(numBuckets).

    public int size() {
        synchronized (_lock) {
            int size = 0;
            for (HashMap<K, V> bucket : _buckets) {
                size += bucket.size();
            }
            return size;
        }
    }

3. Timeout

From the put operation and the _cleaner thread, we know that:

A) The put operation places the data in the first bucket of _buckets, then scans the other numBuckets - 1 buckets and removes any record with the same key;

B) Every expirationSecs / (numBuckets - 1) seconds, the _cleaner thread removes the last bucket of _buckets, and all the data in it, from the TimeCacheMap.

Therefore, if put is called right after the _cleaner thread has finished a cleanup, the entry has to wait for numBuckets full sleep intervals before its bucket is removed, so it lives for the longest possible time:

    expirationSecs / (numBuckets - 1) * numBuckets = expirationSecs * (1 + 1 / (numBuckets - 1))

However, if put completes just before the _cleaner thread starts a cleanup, the first sleep interval is already used up, so the entry lives for the shortest possible time:

    expirationSecs / (numBuckets - 1) * numBuckets - expirationSecs / (numBuckets - 1) = expirationSecs
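To make the two bounds concrete, here is a short sketch that evaluates them for sample values. The class TimeoutBounds and its method names are hypothetical; the arithmetic uses the same integer division as the sleepTime computation in the constructor.

```java
// Hypothetical helper: evaluates the expiry window derived above.
class TimeoutBounds {
    // Interval between two wake-ups of the cleanup thread, in milliseconds.
    static long sleepMillis(int expirationSecs, int numBuckets) {
        return expirationSecs * 1000L / (numBuckets - 1);
    }

    // Shortest lifetime: put completes just before a cleanup starts.
    static long minTimeoutMillis(int expirationSecs, int numBuckets) {
        return sleepMillis(expirationSecs, numBuckets) * (numBuckets - 1);
    }

    // Longest lifetime: put happens right after a cleanup has finished.
    static long maxTimeoutMillis(int expirationSecs, int numBuckets) {
        return sleepMillis(expirationSecs, numBuckets) * numBuckets;
    }

    public static void main(String[] args) {
        // Sample values: a 30-second timeout with 3 buckets.
        System.out.println(sleepMillis(30, 3));      // 15000
        System.out.println(minTimeoutMillis(30, 3)); // 30000
        System.out.println(maxTimeoutMillis(30, 3)); // 45000
    }
}
```

With 3 buckets the upper bound is 45 seconds for a 30-second timeout, i.e. 50% past the expiration time, which matches the comment on the default bucket count. Adding buckets narrows the window at the cost of a longer scan per operation.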

4. Summary

1. TimeCacheMap is efficient because expiry needs the lock only briefly: the cleanup thread holds it just for an O(1) bucket swap, so get and put are rarely blocked by cleanup.

2. The get, put, remove, containsKey, and size operations each complete in O(numBuckets) time, where numBuckets is the number of buckets. The default value is 3.

3. Data that is not updated times out after between expirationSecs and expirationSecs * (1 + 1 / (numBuckets - 1)) seconds.
