Memcached Study (Reposted)


Origin: In data-driven web development, the same data often has to be fetched from the database over and over, which greatly increases the database load. Caching is a good way to solve this problem.
What is memcached?
Memcached is a high-performance, distributed memory object caching system developed by Danga Interactive for reducing database load and increasing access speed in dynamic applications.

What is Memcache?

Memcache is a danga.com project that was first built to serve LiveJournal. Many people around the world now use this caching project to build their own heavily loaded websites and take pressure off their databases.

It can handle an arbitrary number of connections and uses non-blocking network I/O. It works by allocating a region of memory and building a hash table in it, which memcached manages itself.

Memcache is a high-performance distributed memory cache server. Its general purpose is to reduce the number of database accesses by caching database query results, improving the speed and scalability of dynamic web applications.

Why are there two names, Memcache and memcached?

In fact, Memcache is the name of the project, and memcached is the file name of its server-side main program.

Memcache Official website: http://www.danga.com/memcached

Memcache Working principle

First, memcached runs as a daemon on one or more servers and accepts client connections at any time. Clients can be written in many languages; currently known client APIs include Perl, PHP, Python, Ruby, Java, C#, and C. Once a client has established a connection to the memcached service, the next step is to access objects. Each object has a unique identifier, its key, and all access goes through this key. Objects saved to memcached are kept in memory, not in cache files, which is why memcached can be so efficient and fast. Note that these objects are not persistent: the data is lost once the service stops.

Like many cache tools, the principle of memcached is not complicated. It uses a client/server model: a service process is started on the server side, and at startup you can specify the IP address to listen on, the port, the amount of memory to use, and several other key parameters. Once started, the service is always available. The current version of memcached is written in C and uses a single-process, single-threaded, asynchronous, event-based I/O model, with libevent as the event-notification mechanism. Multiple servers can work together, but these servers do not communicate with each other; each one simply manages its own data. The client specifies the server's IP address (a domain name should also work). Objects or data to be cached are saved on the server side as key->value pairs. The key is hashed, and the resulting value determines which specific server the value is sent to. When object data needs to be retrieved, the same thing happens: the key is hashed first, the result determines which server the data is stored on, and the request is sent to that server. The client only needs to know which server hash(key) maps to.

In fact, Memcache's job is to maintain a huge hash table in the memory of dedicated machines to store frequently read and written data such as arrays and files, which greatly improves the efficiency of the site.
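The server-selection step described above happens entirely in the client. The following is a rough, minimal sketch of the idea only; the class and method names are made up for illustration and this is not the code of any real client library.

import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Minimal sketch of client-side server selection: hash the key and map it
// onto one of the configured servers. Illustrative only.
public class ServerSelector {
    private final String[] servers;

    public ServerSelector(String[] servers) {
        this.servers = servers;
    }

    // The same key always hashes to the same server, so a later get()
    // finds what set() stored without the servers ever talking to each other.
    public String serverFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return servers[(int) (crc.getValue() % servers.length)];
    }

    public static void main(String[] args) {
        ServerSelector sel = new ServerSelector(
                new String[] {"10.0.0.1:11211", "10.0.0.2:11211"});
        System.out.println(sel.serverFor("user:42"));
    }
}

Real clients typically use more sophisticated hashing (for example, consistent hashing) so that adding or removing a server remaps as few keys as possible, but the principle is the same.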

Characteristics of memcached

Memcached, as a distributed cache server running at high speed, has the following characteristics.

    • Simple protocol
    • Libevent-based event handling
    • Built-in memory storage mode
    • Memcached is distributed, but servers do not communicate with each other
Simple protocol

Communication between the memcached server and its clients does not use a complex format such as XML; it uses a simple, line-based text protocol. As a result, you can also store and retrieve data on memcached with telnet. Here is an example.

$ telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
set foo 0 0 3     (save command)
bar               (data)
STORED            (result)
get foo           (get command)
VALUE foo 0 3     (data)
bar               (data)

The protocol documentation is included in the memcached source code, or you can refer to the following URL.

    • http://code.sixapart.com/svn/memcached/trunk/server/doc/protocol.txt
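Because the protocol is just lines of text, a client in any language only needs a TCP socket. The following is a rough sketch in Java of the same set/get exchange shown in the telnet example above (host and port assumed to be localhost:11211; error handling omitted):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Speak the memcached text protocol directly over a socket.
// Assumes a memcached instance is listening on localhost:11211.
public class TextProtocolDemo {
    public static void main(String[] args) throws IOException {
        try (Socket s = new Socket("localhost", 11211);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.US_ASCII));
             Writer out = new OutputStreamWriter(s.getOutputStream(), StandardCharsets.US_ASCII)) {

            // set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
            out.write("set foo 0 0 3\r\nbar\r\n");
            out.flush();
            System.out.println(in.readLine());      // expect: STORED

            // get <key>\r\n -> a VALUE line, the data, then END
            out.write("get foo\r\n");
            out.flush();
            String line;
            while ((line = in.readLine()) != null && !line.equals("END")) {
                System.out.println(line);           // VALUE foo 0 3, then bar
            }
        }
    }
}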
Libevent-based event handling

Libevent is a library that wraps event-handling mechanisms such as Linux's epoll and the kqueue of BSD-family operating systems behind a unified interface. It delivers O(1) performance even as the number of connections to the server grows. memcached uses this libevent library, so it achieves its high performance on Linux, BSD, Solaris, and other operating systems. Event handling is not covered in detail here; you can refer to Dan Kegel's The C10K problem.

    • libevent : http://www.monkey.org/~provos/libevent/
    • The c10k problem : http://www.kegel.com/c10k.html
Built-in memory storage mode

To improve performance, the data saved in memcached is kept in memcached's built-in in-memory storage space. Because the data exists only in memory, restarting memcached or the operating system causes all data to disappear. In addition, when the cache reaches the specified capacity, unused entries are automatically evicted using the LRU (least recently used) algorithm. memcached itself is a server designed for caching, so not much consideration is given to data permanence. For more information on memory storage, see the second part of this series.
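The LRU idea itself is easy to demonstrate. The toy sketch below shows least-recently-used eviction with a LinkedHashMap; it only illustrates the policy, not how memcached implements its LRU internally.

import java.util.LinkedHashMap;
import java.util.Map;

// Toy LRU cache: when the capacity is exceeded, the entry that was
// used least recently is evicted. Not memcached's actual implementation.
public class LruCacheDemo {
    public static void main(String[] args) {
        final int capacity = 2;
        Map<String, String> cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity;
            }
        };
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");         // touch "a", so "b" is now the least recently used
        cache.put("c", "3");    // exceeds capacity, so "b" is evicted
        System.out.println(cache.keySet());   // prints [a, c]
    }
}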

Memcached is distributed, but servers do not communicate with each other

Although memcached is a "distributed" cache server, there is no distributed functionality on the server side. memcached instances do not communicate with each other to share information. So how is the data distributed? That depends entirely on the client-side implementation. This series will also cover how memcached distribution works.


What can memcached cache?
By maintaining a unified, huge hash table in memory, memcached can be used to store data in a variety of formats, including images, videos, files, and the results of database retrieval.

Is memcached fast?
Very fast. memcached uses libevent (taking advantage of epoll on Linux where available) to scale to any number of open connections, uses non-blocking network I/O, and reference-counts its internal objects (so an object can be in different states for multiple clients). It uses its own page-block allocator and hash table, so virtual memory does not become fragmented and allocation time is guaranteed to be O(1).
Danga Interactive developed memcached to improve the speed of LiveJournal.com, which serves up to 20 million page views per day to 1 million users, handled by a cluster of web servers and database servers. With memcached in place, reading data from the database is almost completely avoided, pages reach users faster, resources are allocated better, and the database is only hit when memcache misses.

Features of memcached
The memcached cache is distributed, so it can be accessed simultaneously by multiple users on different hosts. This removes the single-machine limitation of shared memory, and it incurs less disk overhead and blocking than using a database for the same purpose.


Use of memcached
One, memcached server-side installation (here it is installed as a system service)
Download file: memcached 1.2.1 for Win32 binaries (Dec 23, 2006)
1 Extract the files to c:\memcached
2 On the command line, run 'c:\memcached\memcached.exe -d install'
3 On the command line, run 'c:\memcached\memcached.exe -d start'; this starts memcached, which listens on port 11211 by default
You can view its help via memcached.exe -h

PS: If the installation reports an error and you are on Windows 7, open cmd as administrator: find cmd.exe, right-click it, and choose "Run as administrator".

memcached.exe -p 11211 -m 64m -vv

Common options:

-p    TCP port to listen on; default is 11211
-m    Maximum amount of memory to use; default is 64 MB
-vv   Start in very verbose mode, printing debug information and errors to the console
-d    Run in the background as a daemon
-d restart          Restart the memcached service
-d stop|shutdown    Stop the running memcached service
-M    Return an error when memory runs out instead of evicting items
-c    Maximum number of concurrent connections; default is 1024
-f    Chunk size growth factor; default is 1.25 (see the sketch after this list)
-n    Minimum space allocated for key+value+flags; default is 48
-h    Display help
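The -n and -f options work together: memory is handed out in a series of fixed chunk-size classes, starting from the minimum size and growing by the factor, and an item goes into the smallest class it fits in. The sketch below only illustrates how such size classes are generated; the real allocator also rounds for alignment and accounts for per-item overhead.

// Illustrates how a minimum allocation (-n 48) and a growth factor (-f 1.25)
// produce a series of chunk-size classes. Simplified: real memcached also
// aligns sizes and adds per-item overhead.
public class SlabClassesDemo {
    public static void main(String[] args) {
        double size = 48;       // -n: minimum allocation
        double factor = 1.25;   // -f: growth factor
        for (int i = 1; i <= 10; i++) {
            System.out.printf("class %d: %d bytes%n", i, (int) size);
            size *= factor;     // each class is about 1.25x the previous one
        }
        // An item is stored in the smallest class whose chunk size can hold it.
    }
}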


Two, using the client
Download the memcached Java client: http://www.whalin.com/memcached/#download
1 After unpacking, add the java_memcached-release_2.0.1.jar package to the project's classpath
2 A simple application using the memcached Java client

Java code
import java.util.Date;

import com.danga.MemCached.MemCachedClient;
import com.danga.MemCached.SockIOPool;

public class TestMemcached {
    // Create a single, globally shared client instance
    protected static MemCachedClient mcc = new MemCachedClient();

    static {
        // Server list and corresponding weights
        String[] servers = {"127.0.0.1:11211"};
        Integer[] weights = {3};

        // Get the instance of the socket connection pool
        SockIOPool pool = SockIOPool.getInstance();

        // Set the server information
        pool.setServers(servers);
        pool.setWeights(weights);

        // Set the initial, minimum and maximum number of connections, and the maximum idle time
        pool.setInitConn(5);
        pool.setMinConn(5);
        pool.setMaxConn(250);
        pool.setMaxIdle(1000 * 60 * 60 * 6);

        // Set the sleep interval of the maintenance thread
        pool.setMaintSleep(30);

        // By default TCP (Nagle's algorithm) makes the sender wait for the
        // acknowledgement of the previous packet before sending the next small one;
        // disabling Nagle turns off that buffering so packets are sent immediately.
        // Set TCP parameters, connection timeouts, etc.
        pool.setNagle(false);
        pool.setSocketTO(3000);
        pool.setSocketConnectTO(0);

        // Initialize the connection pool
        pool.initialize();

        // Compression settings: data larger than the threshold is compressed (64 KB here)
        mcc.setCompressEnable(true);
        mcc.setCompressThreshold(64 * 1024);
    }

    public static void buildCache() {
        // set(key, value, date): date is the expiration time. For expiration to take
        // effect, the value passed to new Date(long) must be >= 1000, because the Java
        // client implementation calls expiry.getTime() / 1000 -- anything below 1000
        // becomes 0 after the division, which means "never expires".
        mcc.set("test", "This is a test String", new Date(10000));
        mcc.set("test", "This is a test String111", new Date(10000)); // expires after 10 seconds

        User u = new User();
        u.setUsername("AAAA");
        mcc.set("user", u);
        User u1 = (User) mcc.get("user");
        System.out.println(u1.getUsername());

        // add: stores the value only if the key does not exist
        // replace: stores the value only if the key already exists
        // set: always writes the value, replacing it if the key exists
    }

    public static void output() {
        // Read a value back from the cache
        String value = (String) mcc.get("test");
        System.out.println(value);
    }

    public static void main(String[] args) {
        buildCache();
        output();
    }
}
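The listing above assumes a User class. Because objects stored in memcached through the Java client must be serializable (see point 4 in the notes further down), a minimal version matching the calls used here could look like this (hypothetical; not part of the original post):

import java.io.Serializable;

// Minimal POJO assumed by the example above; it must implement
// Serializable so the client can send it over the network.
public class User implements Serializable {
    private static final long serialVersionUID = 1L;

    private String username;

    public String getUsername() { return username; }

    public void setUsername(String username) { this.username = username; }
}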

The output is:

AAAA
This is a test String111
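In a real application the client is usually wrapped in a cache-aside pattern, which is what the opening paragraph of this article is driving at: check memcached first, and only query the database on a miss. A rough sketch follows; loadUserFromDatabase is a hypothetical placeholder for your own data-access code.

import java.util.Date;

import com.danga.MemCached.MemCachedClient;

// Cache-aside sketch: read from memcached first, fall back to the database
// on a miss, then populate the cache for subsequent readers.
public class UserCache {
    private static final MemCachedClient mcc = new MemCachedClient();

    public static User getUser(String id) {
        String key = "user:" + id;
        User user = (User) mcc.get(key);
        if (user == null) {                            // cache miss
            user = loadUserFromDatabase(id);           // the expensive query
            mcc.set(key, user, new Date(60 * 1000L));  // cache for about 60 seconds
        }
        return user;
    }

    // Hypothetical stand-in for the application's real DAO/database call.
    private static User loadUserFromDatabase(String id) {
        User u = new User();
        u.setUsername("user-" + id);
        return u;
    }
}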

Reprinted from http://zy116494718.iteye.com/blog/1664190

Memcached and other cache-related knowledge:

1) JBoss Cache, Ehcache, and other caches written in Java generally run on the same machine as the main program and access memory directly, so they are faster than memcached. However, if you have more than one server, each server keeps its own cache, which has to be kept in sync through some notification or replication mechanism, and the cache size is limited by the heap size setting. memcached runs on machines separate from the main program and passes data over the network; because it uses dedicated servers, its size is not restricted in the same way. If the cached data is not large (around 300 MB or less), there is no need to use memcached.

2) In a memcached deployment of N machines, one machine going down does not bring down the whole cache; only lookups for some of the cached content will miss. Even if all of them go down, the cache exists only to reduce database access, so the worst case is simply more pressure on the database. If you want the cache to be persistent, or to have failover, you can use MemcacheDB.

3) A cache server has higher reliability than a DB server.

4) memcached requires stored objects to be serializable, whereas Java object caches such as JBoss Cache have no such notion. This is an essential difference: memcached is accessed over the network, so objects must be serialized to be transferred and understood on the other side.

5) memcached can handle up to around 10,000 concurrent connections; my application usually stays at 3,000 to 5,000. It is fast not only because it uses libevent, but also because of its "trade memory for access speed" memory-management strategy; articles online analyze its memory-allocation source code in detail and explain it very clearly. In this respect JBoss Cache, Ehcache, and OSCache cannot compare with it. memcached also clusters very well: there are reportedly clusters of 200+ memcached nodes abroad, and our own attempts in this direction have worked well. One node going down does not bring down the others, although its data is lost and has to be rebuilt gradually. It also supports many clients, so Java, PHP, Python, and Ruby can all share the same data, effectively using it like a database.
My advice: if your application has high traffic, needs very fast responses, and has only moderate data-consistency requirements, use it as a shield in front of the database; it works beautifully (memcached was developed by an internet company precisely for these three conditions). If the application is not busy, use Ehcache.

6) Ehcache and OSCache keep data on the local server, so access only crosses the system bus, while memcached goes over the network. Regarding transfer speed: our application reads about 2 MB per second from memcached, roughly five times the network traffic of reading from the database. But note that an ordinary PC now has a gigabit network card, so the latency of a memcached get/set is very small, not much slower than an Ehcache get/set. In a complete web application, my stress tests showed a performance difference of <= 5%.

7) Forums, SNS, and other applications use a variety of caching techniques.
Take Sohu BBS as an example: PV is 50 million per day, peaking at 80 million.
Posts and comments are read and written frequently; the other parts are mostly read.
The post list and comment list are implemented in C (where the sorting algorithm is quite ingenious) and called over sockets.
Post and comment content is cached with Squid.

Other frequently read sections use a mix of memcached, Squid, and periodically generated static pages.
The database is MySQL, with tables sharded.
Personally, I think a JVM-level cache is enough for small-scale applications, where memcached is unnecessary.
A large-scale web application is certainly a horizontally partitioned system with a combination of caches.

8) memcached is very fast, but I did not say it is faster than a local Ehcache. A local cache, however, has two drawbacks:
1. Is the cache kept in memory or on disk? Restarting the application loses an in-memory cache, and disk is not fast enough. And how much memory is appropriate? In a high-traffic application, churn in the cached objects can make the server load fluctuate sharply (we were burned by this; we still enable Ehcache, but with a very low cap on the number of cached objects).
2. How are cached objects shared across a cluster? JBoss Cache uses broadcast, which under heavy traffic can drag your service down; we were burned by this too.
Of course, as the Sohu example above shows, once traffic is large enough all kinds of caches get used; currently we run Squid + memcached + Ehcache. Squid is also a good tool, but deleting cached content is not very flexible, so it is better suited to Web 1.0 sites such as Sina or Sohu news.
Also, for the Java memcached client, version 1.6 is good enough; there is no need to upgrade to 2.1, since the 2.x line does not seem stable.

