The LRU Family of Cache Eviction Algorithms

1. LRU
1.1. Principles

LRU (Least Recently Used) is an algorithm that evicts data based on its historical access record. Its core idea is that "if data has been accessed recently, the chance that it will be accessed again in the future is higher".

1.2. Implementation

The most common implementation uses a linked list to hold the cached data. The algorithm works as follows (a minimal sketch follows the steps):

1. Insert new data into the head of the linked list;

2. When the cache hits (that is, the cache data is accessed), the data is moved to the head of the linked list;

3. When the linked list is full, the data at the end of the linked list is discarded.
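
As a minimal illustration of the three steps above, the sketch below keeps only the linked list itself (the class name SimpleLruList is made up for this example). A practical implementation pairs the list with a hash map for O(1) lookup, as the fuller code later in this article does.

import java.util.LinkedList;

// Minimal sketch of the three LRU steps using only a linked list (illustrative).
public class SimpleLruList<T> {
    private final int capacity;
    private final LinkedList<T> list = new LinkedList<>();

    public SimpleLruList(int capacity) {
        this.capacity = capacity;
    }

    public void access(T item) {
        if (list.remove(item)) {        // step 2: cache hit, move the item to the head
            list.addFirst(item);
            return;
        }
        if (list.size() >= capacity) {  // step 3: list full, discard the tail
            list.removeLast();
        }
        list.addFirst(item);            // step 1: insert new data at the head
    }
}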

1.3. Analysis

[Hit rate]

When there is hotspot data, the LRU efficiency is good, but occasional and periodic batch operations will cause a sharp decrease in the LRU hit rate and serious cache pollution.

[Complexity]

Easy to implement.

[Cost]

On a hit, the linked list must be traversed to locate the accessed block, which is then moved to the head of the list.

 

2. LRU-K
2.1. Principle

The K in LRU-K stands for the number of recent accesses, so LRU can be regarded as LRU-1. The main purpose of LRU-K is to solve LRU's "cache pollution" problem. Its core idea is to extend the admission criterion from "accessed once recently" to "accessed K times recently".

2.2. Implementation

Compared to LRU, LRU-K needs an additional queue to record the access history of all cached data; data is moved into the cache only after it has been accessed K times. When eviction is required, LRU-K evicts the entry whose K-th most recent access is the oldest. The detailed implementation is as follows (a minimal sketch is given at the end of this subsection):

1. Data is accessed for the first time and added to the access history list;

2. If data in the access history list has not yet been accessed K times, it is eliminated from the history list according to a chosen policy (FIFO or LRU);

3. When the number of accesses recorded in the history queue reaches K, the data's index is removed from the history queue, the data is moved into the cache queue and cached, and the cache queue is re-sorted by time;

4. When data in the cache queue is accessed again, the queue is re-ordered;

5. When data needs to be eliminated, the entry at the tail of the cache queue is eliminated, i.e. the entry whose K-th most recent access is furthest in the past.

LRU-K keeps the advantages of LRU while avoiding its shortcomings. In practice, LRU-2 is usually the best compromise across various factors; LRU-3 or larger values of K yield a higher hit rate but adapt poorly, because a large number of accesses is needed before stale history records are cleared.
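
To make the steps concrete, here is a minimal LRU-K sketch (illustrative only; the class name LruKCache and its structure are assumptions, not a standard implementation). It uses an insertion-ordered map as a FIFO history list and an access-ordered LinkedHashMap as the cache queue, which approximates "evict the entry whose K-th most recent access is oldest".

import java.util.LinkedHashMap;

public class LruKCache<K, V> {
    private final int k;                         // accesses required before an entry is cached
    private final int capacity;                  // capacity of the cache queue
    private final int historyCapacity;           // capacity of the access-history list
    private final LinkedHashMap<K, Integer> history = new LinkedHashMap<>();          // insertion order = FIFO
    private final LinkedHashMap<K, V> cache = new LinkedHashMap<>(16, 0.75f, true);   // access order = LRU

    public LruKCache(int k, int capacity, int historyCapacity) {
        this.k = k;
        this.capacity = capacity;
        this.historyCapacity = historyCapacity;
    }

    /** Record one access to (key, value); returns the cached value if the key is cached. */
    public V access(K key, V value) {
        V cached = cache.get(key);               // a hit also refreshes recency (access order)
        if (cached != null) {
            return cached;
        }
        int count = history.merge(key, 1, Integer::sum);
        if (count < k) {                         // steps 1-2: stays in the history list only
            if (history.size() > historyCapacity) {
                K eldest = history.keySet().iterator().next();
                history.remove(eldest);          // FIFO eviction from the history list
            }
            return null;
        }
        history.remove(key);                     // step 3: promote into the cache queue
        if (cache.size() >= capacity) {
            K eldest = cache.keySet().iterator().next();
            cache.remove(eldest);                // step 5: evict the tail of the cache queue
        }
        cache.put(key, value);
        return value;
    }
}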

2.3. Analysis

[Hit rate]

LRU-K reduces the problem of "cache pollution", with a higher hit rate than LRU.

[Complexity]

The LRU-K cache queue behaves like a priority queue, so the algorithm complexity and cost are relatively high.

[Cost]

Since the LRU-K also needs to record the objects that have been accessed but not put into the cache, the memory consumption will be more than LRU; when the amount of data is large, the memory consumption will be considerable.

LRU-K also needs to sort entries by time (either lazily at eviction time or continuously), so its CPU consumption is higher than LRU's.

3. Two Queues (2Q)
3.1. Principle

The Two Queues algorithm (abbreviated 2Q below) is similar to LRU-2. The difference is that 2Q replaces the access history queue of LRU-2 (note: that queue does not hold the cached data itself) with a FIFO cache queue. In other words, 2Q maintains two cache queues: a FIFO queue and an LRU queue.

3.2. Implementation

When data is accessed for the first time, 2Q caches it in the FIFO queue. When the data is accessed a second time, it is moved from the FIFO queue to the LRU queue. The two queues evict data independently, each by its own policy. The detailed implementation is as follows (a minimal sketch follows the note below):

1. Newly accessed data is inserted into the FIFO queue;

2. If the data has not been accessed again in the FIFO queue, the data will be eliminated according to the FIFO rules;

3. If the data is accessed again in the FIFO queue, move the data to the LRU queue header;

4. If the data is accessed again in the LRU queue, move the data to the LRU queue header;

5. Data at the end of the LRU queue is eliminated.

 

Note: The FIFO queue is usually depicted as shorter than the LRU queue, but this is not an algorithm requirement; in practice there is no hard limit on the ratio between the two.
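
Below is a minimal 2Q sketch along the lines of the steps above (illustrative; the class name TwoQueueCache and the two capacity parameters are assumptions). The FIFO queue is an insertion-ordered LinkedHashMap and the LRU queue is an access-ordered LinkedHashMap.

import java.util.LinkedHashMap;

public class TwoQueueCache<K, V> {
    private final int fifoCapacity;
    private final int lruCapacity;
    private final LinkedHashMap<K, V> fifo = new LinkedHashMap<>();                 // insertion order = FIFO
    private final LinkedHashMap<K, V> lru = new LinkedHashMap<>(16, 0.75f, true);   // access order = LRU

    public TwoQueueCache(int fifoCapacity, int lruCapacity) {
        this.fifoCapacity = fifoCapacity;
        this.lruCapacity = lruCapacity;
    }

    public V get(K key) {
        V value = lru.get(key);          // step 4: a hit in the LRU queue refreshes recency
        if (value != null) {
            return value;
        }
        value = fifo.remove(key);        // step 3: second access moves the entry to the LRU queue
        if (value != null) {
            putLru(key, value);
        }
        return value;
    }

    public void put(K key, V value) {
        if (lru.containsKey(key)) {      // already promoted: update in place
            lru.put(key, value);
            return;
        }
        if (fifo.containsKey(key)) {     // second touch while still in the FIFO queue: promote
            fifo.remove(key);
            putLru(key, value);
            return;
        }
        if (fifo.size() >= fifoCapacity) {               // step 2: FIFO eviction
            K eldest = fifo.keySet().iterator().next();
            fifo.remove(eldest);
        }
        fifo.put(key, value);            // step 1: first access goes into the FIFO queue
    }

    private void putLru(K key, V value) {
        if (lru.size() >= lruCapacity) {                 // step 5: evict the LRU tail
            K eldest = lru.keySet().iterator().next();
            lru.remove(eldest);
        }
        lru.put(key, value);
    }
}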

3.3. Analysis

[Hit rate]

The 2q algorithm has a higher hit rate than LRU.

[Complexity]

Two queues are required, but both queues are relatively simple.

[Cost]

The sum of the cost of FIFO and LRU.

The hit rates of 2Q and LRU-2 are similar, and their memory consumption is also close; however, because 2Q keeps once-accessed data itself in its FIFO queue, it avoids some of the work of re-reading that data from the backing store or recomputing it.

4. Multi Queue (MQ)
4.1. Principle

The MQ algorithm divides data into multiple queues based on access frequency, and different queues have different access priorities. The core idea is to preferentially cache data that has been accessed many times.

4.2. Implementation

The MQ algorithm divides the cache into multiple LRU queues, each with a different access priority. The priority is calculated from the number of accesses.

The detailed structure is as follows: Q0, Q1, ..., Qk denote queues of different priorities, and Q-history denotes a queue that holds the index and access count of data that has been evicted from the cache:

 

The algorithm proceeds as follows (a simplified sketch is given after the steps):

1. Newly inserted data is added to Q0;

2. Each queue manages its data according to LRU;

3. When the access count of a piece of data reaches the threshold for a higher priority, the data is removed from its current queue and added to the head of the higher-level queue;

4. To prevent high-priority data from never being evicted, when data has not been accessed within a specified time its priority is lowered: it is removed from its current queue and added to the head of the next lower-level queue;

5. When data needs to be evicted, it is evicted from the lowest-level queue according to LRU. Whenever a queue evicts data, the data is deleted from the cache and its index is added to the head of Q-history;

6. If data recorded in Q-history is accessed again, its priority is recalculated and it is moved to the head of the queue matching that priority;

7. Q-history evicts data indexes according to LRU.
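
A simplified MQ sketch follows (illustrative; the class name MultiQueueCache and the log2-based priority function are assumptions, and step 4, time-based demotion, is omitted for brevity).

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;

public class MultiQueueCache<K, V> {
    private static class Entry<V> {
        V value;
        int accessCount;
        Entry(V value) { this.value = value; this.accessCount = 1; }
    }

    private final int queueCount;
    private final int capacity;                                    // total capacity across all queues
    private final List<LinkedHashMap<K, Entry<V>>> queues = new ArrayList<>();                // Q0..Qk
    private final LinkedHashMap<K, Integer> qHistory = new LinkedHashMap<>(16, 0.75f, true);  // evicted indexes
    private int size;

    public MultiQueueCache(int queueCount, int capacity) {
        this.queueCount = queueCount;
        this.capacity = capacity;
        for (int i = 0; i < queueCount; i++) {
            queues.add(new LinkedHashMap<>(16, 0.75f, true));      // access order = LRU within each queue
        }
    }

    private int priority(int accessCount) {                        // example priority: log2(access count)
        return Math.min(queueCount - 1, 31 - Integer.numberOfLeadingZeros(accessCount));
    }

    public V get(K key) {
        for (int i = 0; i < queueCount; i++) {
            Entry<V> e = queues.get(i).get(key);
            if (e != null) {
                e.accessCount++;
                int p = priority(e.accessCount);
                if (p != i) {                                      // step 3: promote to a higher-level queue
                    queues.get(i).remove(key);
                    queues.get(p).put(key, e);
                }
                return e.value;
            }
        }
        return null;
    }

    public void put(K key, V value) {
        for (LinkedHashMap<K, Entry<V>> q : queues) {              // already cached: just update the value
            Entry<V> existing = q.get(key);
            if (existing != null) {
                existing.value = value;
                return;
            }
        }
        Entry<V> e = new Entry<>(value);
        Integer remembered = qHistory.remove(key);                 // step 6: restore the recorded access count
        if (remembered != null) {
            e.accessCount += remembered;
        }
        if (size >= capacity) {
            evictOne();                                            // step 5: evict from the lowest-level queue
        } else {
            size++;
        }
        queues.get(priority(e.accessCount)).put(key, e);           // steps 1 and 6: insert by priority
    }

    private void evictOne() {
        for (LinkedHashMap<K, Entry<V>> q : queues) {
            if (!q.isEmpty()) {
                K eldest = q.keySet().iterator().next();
                Entry<V> evicted = q.remove(eldest);
                qHistory.put(eldest, evicted.accessCount);         // step 5: record the index in Q-history
                if (qHistory.size() > capacity) {                  // step 7: Q-history is itself bounded (LRU)
                    K h = qHistory.keySet().iterator().next();
                    qHistory.remove(h);
                }
                return;
            }
        }
    }
}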

4.3. Analysis

[Hit rate]

MQ reduces the problems caused by "cache pollution", and the hit rate is higher than that of LRU.

[Complexity]

MQ needs to maintain multiple queues and maintain the access time of each data, which is more complex than LRU.

[Cost]

MQ needs to record the access time of each data and regularly scan all queues at a higher cost than LRU.

Note: Although MQ appears to maintain many queues, the total size of all queues is bounded by the cache capacity, so the combined length of the queues equals that of a single LRU queue and the scanning cost is therefore similar.

 

5. Comparison of LRU Algorithms

Hit rates vary greatly across access patterns, so the comparison here is only a qualitative, theoretical one rather than a quantitative analysis.

Comparison      Ranking
Hit rate        LRU-2 > MQ(2) > 2Q > LRU
Complexity      LRU-2 > MQ(2) > 2Q > LRU
Cost            LRU-2 > MQ(2) > 2Q > LRU

In actual applications, the algorithm should be chosen according to the business's data-access pattern; a higher hit rate is not automatically better. For example, although LRU has a lower hit rate and suffers from the "cache pollution" problem, it may be used more often in practice because of its simplicity and low cost.

 

The simplest LRU implementation in Java is to use the JDK's LinkedHashMap and override its removeEldestEntry(Map.Entry) method.

Looking at the LinkedHashMap source code, we can see that the LRU order is maintained with a doubly linked list. When an entry is hit, the list links are adjusted so that the entry moves to the head position, and newly added entries are placed directly at the head of the list. As a result, recently hit entries gather at the head of the list, and when content must be replaced, the tail of the list is the least recently used position.
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.Map;

/**
 * Class description: a simple cache built on LinkedHashMap; the removeEldestEntry
 * method must be overridden. See the JDK documentation for details.
 *
 * @author Dennis
 *
 * @param <K>
 * @param <V>
 */
public class LRULinkedHashMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxCapacity;
    private static final float DEFAULT_LOAD_FACTOR = 0.75f;
    private final Lock lock = new ReentrantLock();

    public LRULinkedHashMap(int maxCapacity) {
        super(maxCapacity, DEFAULT_LOAD_FACTOR, true);
        this.maxCapacity = maxCapacity;
    }

    @Override
    protected boolean removeEldestEntry(java.util.Map.Entry<K, V> eldest) {
        return size() > maxCapacity;
    }

    @Override
    public boolean containsKey(Object key) {
        try {
            lock.lock();
            return super.containsKey(key);
        } finally {
            lock.unlock();
        }
    }

    @Override
    public V get(Object key) {
        try {
            lock.lock();
            return super.get(key);
        } finally {
            lock.unlock();
        }
    }

    @Override
    public V put(K key, V value) {
        try {
            lock.lock();
            return super.put(key, value);
        } finally {
            lock.unlock();
        }
    }

    public int size() {
        try {
            lock.lock();
            return super.size();
        } finally {
            lock.unlock();
        }
    }

    public void clear() {
        try {
            lock.lock();
            super.clear();
        } finally {
            lock.unlock();
        }
    }

    public Collection<Map.Entry<K, V>> getAll() {
        try {
            lock.lock();
            return new ArrayList<Map.Entry<K, V>>(super.entrySet());
        } finally {
            lock.unlock();
        }
    }
}
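
A short usage sketch of the class above (the keys and values are arbitrary examples): after the get, "a" is the most recently used entry, so inserting a fourth entry evicts "b".

LRULinkedHashMap<String, String> cache = new LRULinkedHashMap<>(3);
cache.put("a", "1");
cache.put("b", "2");
cache.put("c", "3");
cache.get("a");        // "a" becomes the most recently used entry
cache.put("d", "4");   // size now exceeds maxCapacity, so the eldest entry "b" is evicted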

 

LRU implementation based on a doubly linked list:

A traditional LRU implementation attaches a counter to each cached object and updates the counter on every cache hit. When the cache is full and old content must be evicted to make room for new content, all counters are scanned and the least recently used item is replaced.

The drawback is obvious: with a small cache the overhead is negligible, but when the cache is large, every eviction requires traversing all the counters, so the performance and resource consumption become significant and the approach is slow.

The doubly-linked-list approach works as follows: all cache slots are linked into a doubly linked list. When a slot is hit, the links are adjusted so that the slot moves to the head of the list, and newly added cache entries are placed directly at the head.

In this way, after many cache operations, recently hit entries migrate toward the head of the list while entries that are not hit drift toward the tail, so the tail of the list holds the least recently used entry.

When content must be replaced, the tail of the list is the least recently used position, so we only need to remove the tail of the list.

With the theory covered, the code below implements a cache with an LRU policy.

We use an object to represent the cache and implement the doubly linked list ourselves:

public class LRUCache {
    /**
     * Linked list node
     * @author administrator
     */
    class CacheNode {
        ......
    }

    private int cacheSize;       // cache size
    private Hashtable nodes;     // cache container
    private int currentSize;     // current number of cached objects
    private CacheNode first;     // head of the (doubly linked) list
    private CacheNode last;      // tail of the (doubly linked) list
}

The complete implementation is provided below. This class was also used by Tomcat (org.apache.tomcat.util.collections.LRUCache), but it has been deprecated since Tomcat 6.x and replaced by other cache classes.

import java.util.Hashtable;

public class LRUCache {

    /**
     * Linked list node
     * @author administrator
     */
    class CacheNode {
        CacheNode prev;    // previous node
        CacheNode next;    // next node
        Object value;      // value
        Object key;        // key
        CacheNode() {
        }
    }

    public LRUCache(int i) {
        currentSize = 0;
        cacheSize = i;
        nodes = new Hashtable(i);    // cache container
    }

    /**
     * Get a cached object.
     * @param key
     * @return
     */
    public Object get(Object key) {
        CacheNode node = (CacheNode) nodes.get(key);
        if (node != null) {
            moveToHead(node);
            return node.value;
        } else {
            return null;
        }
    }

    /**
     * Add an entry to the cache.
     * @param key
     * @param value
     */
    public void put(Object key, Object value) {
        CacheNode node = (CacheNode) nodes.get(key);
        if (node == null) {
            // check whether the cache container has exceeded its size
            if (currentSize >= cacheSize) {
                if (last != null)    // remove the least recently used entry
                    nodes.remove(last.key);
                removeLast();
            } else {
                currentSize++;
            }
            node = new CacheNode();
        }
        node.value = value;
        node.key = key;
        // move the node to the head of the list to mark it as most recently used
        moveToHead(node);
        nodes.put(key, node);
    }

    /**
     * Remove an entry from the cache.
     * @param key
     * @return
     */
    public Object remove(Object key) {
        CacheNode node = (CacheNode) nodes.get(key);
        if (node != null) {
            if (node.prev != null) {
                node.prev.next = node.next;
            }
            if (node.next != null) {
                node.next.prev = node.prev;
            }
            if (last == node)
                last = node.prev;
            if (first == node)
                first = node.next;
        }
        return node;
    }

    public void clear() {
        first = null;
        last = null;
    }

    /**
     * Remove the tail node of the linked list,
     * i.e. evict the least recently used cache object.
     */
    private void removeLast() {
        // if the tail of the list is not empty, detach it
        // (this deletes the least recently used cached object)
        if (last != null) {
            if (last.prev != null)
                last.prev.next = null;
            else
                first = null;
            last = last.prev;
        }
    }

    /**
     * Move a node to the head of the list, marking it as the most recently used.
     * @param node
     */
    private void moveToHead(CacheNode node) {
        if (node == first)
            return;
        if (node.prev != null)
            node.prev.next = node.next;
        if (node.next != null)
            node.next.prev = node.prev;
        if (last == node)
            last = node.prev;
        if (first != null) {
            node.next = first;
            first.prev = node;
        }
        first = node;
        node.prev = null;
        if (last == null)
            last = first;
    }

    private int cacheSize;
    private Hashtable nodes;     // cache container
    private int currentSize;
    private CacheNode first;     // head of the list
    private CacheNode last;      // tail of the list
}



Reference: http://blog.csdn.net/yah99_wolf/article/details/7599671
