Summary of the working principle of Memcache
1. Sharding principle
Without further ado, let's look straight at how Memcache works.
First of all, the biggest problem Memcache solves is memory fragmentation caused by repeatedly allocating and freeing memory. Fragmentation comes in two kinds: internal and external. External fragmentation refers to small, discontinuous pieces of free memory that no process can use; because they are not contiguous, they cannot be combined into a large, continuous space, so this memory is regrettably wasted. Internal fragmentation occurs when an allocated block is larger than what was actually requested: the unused part of the block sits idle, yet that free memory cannot be returned to the larger memory pool.
So when the memcached process starts, it allocates one contiguous region of memory of a predetermined size (64MB by default), and then divides that region into pieces of different sizes.
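For example (assuming a standard memcached build), the region size and listening port are set with command-line flags when the process starts, and -vv additionally prints the slab classes the memory is carved into:

memcached -m 64 -p 11211 -vv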
I believe we've seen the following figure.
[Figure: slab/page/chunk memory layout. http://dl2.iteye.com/upload/attachment/0094/5679/542176f4-1274-35cc-8f53-63d353a70ef2.png]
A single memcached process can open at most 2GB of memory. If you want to cache more data, the recommendation is to start more memcached processes (on different ports) or to use distributed Memcache, spreading the cached data across different physical or virtual machines.
When the memcached process starts, it opens up this contiguous region of memory. Using the figure above as an example, the contiguous region is the whole of Slab1 + Slab2 + Slab3 + ... + Slab(n); each area partitioned into units of the same size makes up one slab (shard group).
Below a slab, the storage area is not sliced directly into chunks (the small cells in the picture) but into pages first. When a new piece of data is to be cached, memcached first selects a suitable slab and then checks whether that slab has a free chunk; if it does, the data is stored there directly. A slab requests memory one page at a time, so when the first piece of data arrives, a 1MB page is assigned to the slab regardless of the data's size. After obtaining the page, the slab slices the page's memory by its chunk size, turning the page into an array of chunks, and picks one chunk from the array to store the data.
Inside a page are the small storage units, the chunks. A page is 1MB by default, so how many 88-byte chunks can it hold? 1024*1024/88 is approximately 11,915. Now, if the record to be stored is 100 bytes, how is it placed between 88-byte chunks and 112-byte chunks? The answer, of course, is to round up to the larger chunk: the requested data is never decomposed, split apart, stored in pieces, or merged. This brings a small problem, namely some wasted space: 112 - 100 = 12 bytes, and those 12 bytes are wasted.
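As a minimal sketch, this chunk-size arithmetic can be reproduced in a few lines of Java. The 88-byte base size, the 1.25 growth factor, and the 8-byte alignment are memcached defaults; the class and method names here are made up for illustration:

public class SlabMath {
    static final int PAGE_SIZE = 1024 * 1024; // one page is 1MB by default

    // memcached aligns chunk sizes to 8-byte boundaries
    static int align8(int n) {
        return (n % 8 == 0) ? n : n + (8 - n % 8);
    }

    // Pick the smallest chunk size that fits the item, starting from an
    // 88-byte chunk and growing by a factor of 1.25 per slab class.
    static int chunkSizeFor(int itemSize) {
        int size = 88;
        while (size < itemSize) {
            size = align8((int) (size * 1.25));
        }
        return size;
    }

    public static void main(String[] args) {
        int item = 100;                     // a 100-byte record
        int chunk = chunkSizeFor(item);     // 88 -> 110 -> aligned up to 112
        System.out.println("chunk size: " + chunk + " bytes");            // 112
        System.out.println("wasted per item: " + (chunk - item));         // 12
        System.out.println("88-byte chunks per page: " + PAGE_SIZE / 88); // 11915
    }
}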
In terms of cache cleanup, Memcache never releases memory it has already allocated. Once an allocated chunk becomes vacant, Memcache simply reuses that old memory space for new data. As for expiration, it is handled lazily: only when a get touches a record does memcached check the stored timestamp to see whether the data has timed out. Essentially no other thread interferes with the data's life cycle. The eviction strategy is the same as Ehcache's default policy: LRU, Least Recently Used, which evicts the data that has gone unused for the longest time.
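A rough sketch of that lazy expiration behavior, under the assumption that each item simply carries its own expiry timestamp (this models the idea, not memcached's actual internals):

import java.util.HashMap;
import java.util.Map;

// Illustrative only: a cache that, like memcached, checks expiry lazily on get.
class LazyExpiryCache {
    private static class Item {
        final Object value;
        final long expiresAtMillis; // 0 means "never expires"
        Item(Object value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Item> store = new HashMap<>();

    void set(String key, Object value, long ttlMillis) {
        long expiresAt = ttlMillis > 0 ? System.currentTimeMillis() + ttlMillis : 0;
        store.put(key, new Item(value, expiresAt));
    }

    Object get(String key) {
        Item item = store.get(key);
        if (item == null) return null;
        // The expiry check happens only here, at read time; no background
        // thread sweeps the cache.
        if (item.expiresAtMillis != 0 && System.currentTimeMillis() > item.expiresAtMillis) {
            store.remove(key); // the slot can now be reused for new data
            return null;
        }
        return item.value;
    }
}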
The condition under which Memcache decides memory is insufficient is simple: a new storage unit can no longer be obtained from any slab. It is a very memory-hungry piece of software; a typical server has to give it 2~4GB to consume (for non-time-critical, or less time-critical, data).
Memcache uses the operating system's libevent library for efficient reading and writing. libevent wraps the event-handling facilities of different systems, such as epoll on Linux and kqueue on BSD-type operating systems, behind a unified interface, so performance stays high even as the number of connections to the server grows. Because memcached is built on libevent, it can deliver this high performance on Linux, BSD, Solaris, and other operating systems. Memcache claims it can accept any number of connection requests. Is that really the case?
2. Storage process analysis
Suppose we now want to store a record in Memcache. First, when using a Memcache client library, we need to configure an initial routing table of server machines, as in this Java client program:
MemCachedClient cachedClient = new MemCachedClient();
// Get the connection pool instance
SockIOPool pool = SockIOPool.getInstance();
// Set the cache server addresses; listing several servers enables distributed caching
pool.setServers(new String[]{"127.0.0.1:11211", "192.176.17.90:11211"});
// Start the pool
pool.initialize();
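With the pool initialized, storing and reading a record is one call each; set and get are real methods of this client, while the key and value below are made up for illustration:

cachedClient.set("user:42", "Alice");       // the client hashes the key and picks a server
Object name = cachedClient.get("user:42");  // the same hash routes the read to the same server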
Then, when a value is stored, the Memcache client program hashes the key into a code and, according to the routing table, sends the request to the corresponding Memcache server. In other words, the client program performs something similar to a load-balancing function. The figure below is one we have seen before.
[Figure: the client hashes a key and routes the request to one of the Memcache servers. http://dl2.iteye.com/upload/attachment/0094/5683/Eb3b85b6-b62f-3576-950e-a4850baf3c64.png]
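As a minimal sketch of that client-side routing, the following Java uses simple modulo hashing over the server list from the earlier example (production clients typically use more stable schemes such as consistent hashing; the class name here is made up):

import java.util.zip.CRC32;

// Illustrative client-side routing: hash the key, then map it onto the server list.
public class KeyRouter {
    private final String[] servers;

    KeyRouter(String[] servers) { this.servers = servers; }

    String serverFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes());
        // Simple modulo routing; every client must use the same scheme
        // so that reads land on the server that holds the key.
        return servers[(int) (crc.getValue() % servers.length)];
    }

    public static void main(String[] args) {
        KeyRouter router = new KeyRouter(
                new String[]{"127.0.0.1:11211", "192.176.17.90:11211"});
        System.out.println(router.serverFor("user:42"));
    }
}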
The memcached process on the server is only responsible for listening for connections, accepting requests, and storing data; distribution is not its job. Seen this way, the hard part lies in the client code: hashing keys across the Memcache servers so that data is spread evenly over every machine. This is where the hashing algorithm becomes particularly important.
3. Reading process analysis
Once you understand how Memcache stores data, the read path is easy to follow. When reading, the client likewise computes a hash from the key, determines the physical machine from the routing table, and then dispatches the request to that server.
[Figure: a read request is hashed and dispatched to the server holding the key. http://dl2.iteye.com/upload/attachment/0094/5686/2e29d67c-01a7-3b5f-9ad9-428d37cc55d5.png]
Memcache's distributed read/write scheme has pros and cons. If Node2 goes down, the cached data on Node2 is gone: the data must be loaded from the database again and, according to the routing table (which now contains only Node1 and Node3), written to one of the remaining cache machines. So disaster recovery is handled reasonably well. The downside is that maintaining such a highly available cache is somewhat expensive. Whether storing more data this way brings more benefit than harm depends on the specific application scenario.
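A quick worked example shows why a node failure is costly with simple modulo routing: with three servers a key lands on servers[hash % 3], but after Node2 fails the client computes servers[hash % 2] over the two survivors, so roughly two-thirds of all keys now map to a different server than before and must be reloaded from the database on first access. Softening this remapping is precisely what more elaborate schemes such as consistent hashing aim for.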
4. The debate with Ehcache
The argument with Ehcache has been made before, and it always turns on the performance of the two. There are plenty of comparison lists on the web. In the author's view, the biggest difference is that the difference in principle determines the difference in application scenario. For single-node application caching, you can use Ehcache directly to read and write a local in-memory cache. Cluster caching is generally handled by a centralized cache server, and since it is a centralized server, network transmission is unavoidable; in that case Memcache is the better fit. This is not to say Ehcache cannot do cluster caching, but the cost of its clustering options (RMI, JMS, JGroups) and their network-resource consumption are indeed higher than Memcache's. As for the efficiency of raw in-memory reads and writes, there is little to compare: Ehcache uses Java's random-access read/write classes to manipulate binary buffers, while Memcache at bottom is a C service built on the libevent library. The two should be close to equally efficient; the critical cost is still network I/O.
This article is from Li Shilong's blog; reprinting is declined.