Memcached is a general-purpose distributed memory caching system developed by Danga Interactive and released under a BSD license.
Danga Interactive developed memcached as a memory cache to handle the huge traffic on its website, livejournal.com. More than 20 million page views a day put a lot of pressure on LiveJournal's database, so Danga's Brad Fitzpatrick started designing memcached. Memcached not only reduced the load on the site's database but has also become the caching solution used by many of the world's high-traffic websites.
This article begins with a comprehensive overview of memcached and then guides you through building and installing it in your development environment. I'll also introduce the nine memcached client commands and show how to use them in standard and advanced memcached operations. Finally, I'll provide some tips for using memcached commands to measure the performance and efficiency of your cache.
How to integrate memcached into your environment
Before you start installing and using memcached, you need to understand how it fits into your environment. Although memcached can be used anywhere, I find that it often works best when I need to perform several recurring queries in the database tier. I often set up a series of memcached instances between the database and the application server and use a simple pattern to read from and write to these servers. Figure 1 can help you understand how to set up your application architecture:
Figure 1. Sample application architecture using memcached
The architecture is fairly easy to understand. I built a web tier that includes some Apache instances. The next tier is the application itself, which typically runs on Apache Tomcat or another open source application server. The next tier, between the application server and the database server, is where the memcached instances are configured. With this configuration, read and write operations against the database need to be performed slightly differently.
Read
For a read, I take the request from the web tier (which requires one database query) and first check the cache for previously stored results of that query. If I find the value I want, I return it. If not, I execute the query and store the results in the cache before returning them to the web tier.
Write
When you write data to a database, you first need to perform a database write operation, and then set any previously cached results that are affected by this write operation to be invalid. This process helps prevent data inconsistencies between the cache and the database.
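The read and write orders described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two plain dictionaries stand in for the database and for the memcached instance, and the names read, write, and db_reads are hypothetical helpers, not part of memcached or any client library.

```python
# Stand-ins for the real tiers: plain dicts play the roles of the
# database and the memcached instance (both are assumptions).
database = {"user:1": "Alice"}
cache = {}
db_reads = 0  # counts how often the database is actually queried


def read(key):
    """Read path: check the cache first, fall back to the database."""
    global db_reads
    if key in cache:
        return cache[key]          # cache hit
    db_reads += 1                  # cache miss: run the query
    value = database.get(key)
    if value is not None:
        cache[key] = value         # store the result for next time
    return value


def write(key, value):
    """Write path: update the database, then invalidate the cached copy."""
    database[key] = value
    cache.pop(key, None)


first = read("user:1")    # miss, goes to the database
second = read("user:1")   # hit, served from the cache
write("user:1", "Bob")    # invalidates the cached entry
third = read("user:1")    # miss again, sees the new value
print(first, second, third, db_reads)
```

Invalidating on write, rather than updating the cached copy, keeps the sketch simple: the next read repopulates the cache from the database.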
Install memcached
Memcached supports several operating systems, including Linux®, Windows®, Mac OS, and Solaris. In this article, I'll explain in detail how to build and install memcached from source. The main reason for this is that I can look at the source code when I run into a problem.
Libevent
Libevent is the only prerequisite for installing memcached. It is the asynchronous event notification library that memcached relies on. You can find the libevent source files on monkey.org. Locate the source archive for the latest version; for this article, we use the stable version 1.4.11. After you get the archive, extract it to a convenient location, and then execute the commands in Listing 1:
Listing 1. Build and install Libevent
cd libevent-1.4.11-stable/
./configure
make
make install
Memcached
Obtain the memcached source files from Danga Interactive, again choosing the latest release. At the time of writing, the latest version is 1.4.0. Extract the tar.gz archive to a convenient location and execute the commands in Listing 2:
Listing 2. Build and install memcached
cd memcached-1.4.0/
./configure
make
make install
After you complete these steps, you should have a working copy of memcached installed and ready to use. Let's take a brief look at it and then put it to use.
Using memcached
To start using memcached, you first need to start the memcached server and then connect to it using the Telnet client.
To start memcached, execute the command in Listing 3:
Listing 3. Start memcached
./memcached -d -m 2048 -l 10.0.0.40 -p 11211
This starts memcached as a daemon (-d), allocates 2GB of memory to it (-m 2048), and specifies the listening address (-l; 10.0.0.40 in this example, so substitute your own machine's address or localhost) and port 11211 (-p). You can modify these values as needed, but the above settings are sufficient to complete the exercises in this article. Next, you need to connect to memcached. You will connect to the memcached server using a simple Telnet client.
Most operating systems provide a built-in Telnet client, but if you are using a Windows-based operating system, you need to download a third-party client. I recommend PuTTY.
After you install the Telnet client, execute the command in Listing 4:
Listing 4. Connecting to memcached
telnet localhost 11211
If everything works, you should get a telnet response that says Connected to localhost. If you do not get this response, return to the previous steps and make sure that both libevent and memcached were built and installed successfully.
You are now logged on to the memcached server. From here on, you can communicate with memcached through a series of simple commands. The nine memcached client commands fall into three categories: basic, advanced, and management.
Basic memcached client commands
You will use the five basic memcached commands to perform the simplest operations. These commands are set, add, replace, get, and delete.
The first three commands are standard modification commands for manipulating key-value pairs stored in memcached. They are all very easy to use, and each uses the syntax shown in Listing 5:
Listing 5. Modification command syntax
command <key> <flags> <expiration time> <bytes>
<value>
Table 1 defines the parameters of the memcached modification commands and how they are used.
Table 1. memcached modification command parameters
Parameter | Usage
key | The key used to look up the cached value
flags | An integer parameter, stored with the key-value pair, that the client can use to hold additional information about the pair
expiration time | The length of time the key-value pair is kept in the cache (in seconds; 0 means forever)
bytes | The number of bytes in the stored value
value | The value to store (always on the second line)
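To make the syntax concrete, here is a small Python sketch that assembles the bytes a client would send for a modification command. The function name build_storage_command is hypothetical and exists only for this illustration. The sketch relies on the fact that each line of the text protocol ends with a carriage return and line feed, which Telnet sends for you when you press Enter.

```python
def build_storage_command(command, key, value, flags=0, exptime=0):
    """Return the bytes a client would send for set, add, or replace."""
    if command not in ("set", "add", "replace"):
        raise ValueError("unsupported command: " + command)
    data = value.encode("utf-8")
    # <bytes> must be the length of the encoded value, not the character count.
    header = f"{command} {key} {flags} {exptime} {len(data)}\r\n"
    return header.encode("ascii") + data + b"\r\n"


request = build_storage_command("set", "userId", "12345")
print(request)
```

Computing the length from the encoded bytes matters for non-ASCII values, where the byte count and the character count differ.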
Now, let's take a look at the actual use of these commands.
Set
The set command is used to add a new key-value pair to the cache. If the key already exists, the previous value will be replaced.
Note the following interaction, which uses the set command:
set userId 0 0 5
12345
STORED
If a key-value pair is set correctly using the set command, the server responds with the word STORED. This example adds a key-value pair to the cache whose key is userId and whose value is 12345. The expiration time is set to 0, which tells memcached that you want to keep this value in the cache until you delete it.
Add
The add command adds a key-value pair to the cache only if the key does not already exist in the cache. If the key already exists, the previous value remains unchanged and you get the response NOT_STORED.
The following is a standard interaction using the add command:
set userId 0 0 5
12345
STORED
add userId 0 0 5
55555
NOT_STORED
add companyid 0 0 3
564
STORED
Replace
The replace command replaces the value for a key only if the key already exists in the cache. If the key does not exist, you receive a NOT_STORED response from the memcached server.
The following is a standard interaction using the replace command:
replace accountId 0 0 5
67890
NOT_STORED
set accountId 0 0 5
67890
STORED
replace accountId 0 0 5
55555
STORED
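The difference between the three modification commands comes down to what each does when the key is or is not already present. The following Python sketch models only those rules with a dictionary. TinyCache is a hypothetical class written for this illustration, not a memcached client, and it ignores flags and expiration.

```python
class TinyCache:
    """In-memory model of the set, add, and replace rules (not a real client)."""

    def __init__(self):
        self.items = {}

    def set(self, key, value):
        self.items[key] = value          # always stores
        return "STORED"

    def add(self, key, value):
        if key in self.items:            # refuses to overwrite
            return "NOT_STORED"
        self.items[key] = value
        return "STORED"

    def replace(self, key, value):
        if key not in self.items:        # refuses to create
            return "NOT_STORED"
        self.items[key] = value
        return "STORED"


tiny = TinyCache()
replies = [
    tiny.replace("accountId", "67890"),  # key missing
    tiny.set("accountId", "67890"),
    tiny.add("accountId", "11111"),      # key present
    tiny.replace("accountId", "55555"),
]
print(replies, tiny.items["accountId"])
```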
The last two basic commands are get and delete. These commands are fairly easy to understand and share a similar syntax: the command name followed by a key.
Next, let's look at these commands in use.
Get
The get command retrieves the value associated with a previously added key-value pair. You will use get to perform most retrieval operations.
The following is a typical interaction using the get command:
set userId 0 0 5
12345
STORED
get userId
VALUE userId 0 5
12345
END
get bob
END
As you can see, the get command is fairly straightforward. You use a key to call get, and if the key exists in the cache, the corresponding value is returned. If it does not exist, no content is returned.
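A client program has to pick the reply apart itself. The Python sketch below parses the raw bytes of a get reply as shown in the interaction above. The function name parse_get_response is hypothetical, and the sketch assumes the whole reply has already been read from the socket.

```python
def parse_get_response(raw):
    """Parse a text-protocol get reply into a {key: value} dict."""
    results = {}
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        line = raw[pos:line_end]
        pos = line_end + 2
        if line == b"END":
            return results
        # Header looks like: VALUE <key> <flags> <bytes>
        parts = line.split()
        if len(parts) < 4 or parts[0] != b"VALUE":
            raise ValueError("unexpected line: %r" % line)
        key, size = parts[1].decode("ascii"), int(parts[3])
        results[key] = raw[pos:pos + size]   # read exactly <bytes> bytes
        pos += size + 2                      # skip the data and its trailing \r\n


hit = parse_get_response(b"VALUE userId 0 5\r\n12345\r\nEND\r\n")
miss = parse_get_response(b"END\r\n")
print(hit, miss)
```

Reading exactly the advertised number of bytes, instead of searching for the next line break, is what lets a value contain line breaks of its own.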
Delete
The last basic command is delete. The delete command removes an existing value from memcached. You call delete with a key, and if the key exists in the cache, the value is deleted. If it does not exist, a NOT_FOUND message is returned.
The following is a client-server interaction that uses the delete command:
set userId 0 0 5
98765
STORED
delete bob
NOT_FOUND
delete userId
DELETED
get userId
END
Advanced memcached client commands
The two advanced commands available in memcached are gets and cas. The gets and cas commands need to be used together. You use these two commands to make sure that an existing name/value pair is not set to a new value if it has already been updated. Let's look at each command in turn.
gets
The gets command works like the basic get command. The difference between the two is that gets returns slightly more information: a 64-bit integer that acts much like a version identifier for the name/value pair.
The following is a client-server interaction using the gets command:
set userId 0 0 5
12345
STORED
get userId
VALUE userId 0 5
12345
END
gets userId
VALUE userId 0 5 4
12345
END
Consider the difference between the get and gets commands. The gets command returns an extra value (in this case, the integer 4) that identifies the name/value pair. If another set command is executed on this name/value pair, the extra value returned by gets changes to indicate that the pair has been updated. Listing 6 shows an example:
Listing 6. set updates the version indicator
set userId 0 0 5
33333
STORED
gets userId
VALUE userId 0 5 5
33333
END
Did you see the value returned by gets? It has been updated to 5. Each time you modify a name/value pair, the value changes.
cas
cas (check and set) is a handy memcached command that sets the value of a name/value pair only if the pair has not been updated since you last ran gets. It uses a syntax similar to the set command but includes one extra value: the extra value returned by gets.
Note the following interaction using the cas command:
set userId 0 0 5
55555
STORED
gets userId
VALUE userId 0 5 6
55555
END
cas userId 0 0 5 6
33333
STORED
As you can see, I invoke the cas command with the extra integer value 6, and the operation succeeds. Now, let's take a look at the series of commands in Listing 7:
Listing 7. cas command with an old version indicator
set userId 0 0 5
55555
STORED
gets userId
VALUE userId 0 5 8
55555
END
cas userId 0 0 5 6
33333
EXISTS
Note that I did not use the integer value that gets most recently returned, so the cas command failed and returned EXISTS. In essence, using the gets and cas commands together prevents you from overwriting a name/value pair that has been updated since you last read it.
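The version check behind gets and cas can be modeled in a few lines of Python. This is a simplified in-memory sketch, not memcached's actual implementation: VersionedCache is a hypothetical class, and using a single counter shared by all keys is an assumption made for illustration.

```python
class VersionedCache:
    """In-memory model of gets/cas semantics (not memcached's real code)."""

    def __init__(self):
        self.items = {}      # key -> (value, version)
        self.counter = 0     # assumed: one counter shared by all keys

    def set(self, key, value):
        self.counter += 1
        self.items[key] = (value, self.counter)
        return "STORED"

    def gets(self, key):
        return self.items.get(key)   # (value, version) or None

    def cas(self, key, value, version):
        if key not in self.items:
            return "NOT_FOUND"
        if self.items[key][1] != version:
            return "EXISTS"          # someone updated the key since gets
        return self.set(key, value)


store = VersionedCache()
store.set("userId", "55555")
_, version = store.gets("userId")
fresh = store.cas("userId", "33333", version)   # version still current
stale = store.cas("userId", "44444", version)   # version is now outdated
print(fresh, stale, store.gets("userId")[0])
```

The second cas fails because the first one already changed the version, which is the same situation Listing 7 shows.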
Cache Management Commands
The last two memcached commands are used to monitor and clean up a memcached instance. They are the stats and flush_all commands.
stats
The stats command does what its name suggests: it dumps the current statistics for the memcached instance you are connected to. In the following example, running the stats command displays information about the current memcached instance:
stats
STAT pid
STAT uptime 101758
STAT time 1248643186
STAT version 1.4.11
STAT pointer_size 32
STAT rusage_user 1.177192
STAT rusage_system 2.365370
STAT curr_items 2
STAT total_items 8
STAT bytes 119
STAT curr_connections 6
STAT total_connections 7
STAT connection_structures 7
STAT cmd_get
STAT cmd_set
STAT get_hits
STAT get_misses 0
STAT evictions 0
STAT bytes_read 471
STAT threads 4
END
Most of the output here is easy to understand. When I discuss cache performance later, I'll explain in detail what these values mean. For now, look over the output, then run a few set commands with new keys and run the stats command again to see what has changed.
flush_all
flush_all is the last command to introduce. This simplest of commands does just one thing: it clears all name/value pairs from the cache. If you need to reset the cache to a clean state, flush_all is very useful. Here is an example of using flush_all:
set userId 0 0 5
55555
STORED
get userId
VALUE userId 0 5
55555
END
flush_all
OK
get userId
END
Caching performance
To close this article, I'll discuss how to use memcached commands to gauge the performance of the cache. The stats command is the tool for tuning cache usage. The two most important statistics to watch are get_hits and get_misses. These values indicate the number of times a name/value pair was found (get_hits) and the number of times one was not found (get_misses).
Combining these values, we can determine how well the cache is being used. When you first start the cache, get_misses naturally increases, but after a certain amount of use, the get_misses value should level off, which means the cache is mostly serving common read operations. If you see get_misses continue to grow quickly while get_hits levels off, you need to examine what you are caching. You may be caching the wrong content.
Another way to judge cache efficiency is to look at the cache hit ratio. The cache hit ratio is the percentage of get requests that found a value in the cache. To determine this percentage, run the stats command again, as shown in Listing 8:
Listing 8. Calculating the cache hit ratio
stats
STAT pid 6825
STAT uptime 540692
STAT time 1249252262
STAT version 1.2.6
STAT pointer_size 32
STAT rusage_user 0.056003
STAT rusage_system 0.180011
STAT curr_items 595
STAT total_items 961
STAT bytes 4587415
STAT curr_connections 3
STAT total_connections
STAT connection_structures 4
STAT cmd_get 2688
STAT cmd_set 961
STAT get_hits 1908
STAT get_misses 780
STAT evictions 0
STAT bytes_read 5770762
STAT bytes_written 7421373
STAT limit_maxbytes 536870912
STAT threads 1
END
Now, divide the value of get_hits by the value of cmd_get. In this case, 1908 / 2688 gives a hit ratio of about 71%. Ideally, you want a higher percentage: the higher the ratio, the better. Checking these statistics and measuring them over time is a good way to judge the efficiency of your caching strategy.
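The same arithmetic is easy to automate. The Python sketch below parses stats output that has already been captured as text and computes the hit ratio. The function names parse_stats and hit_ratio are hypothetical, and the sample contains only the four counters from Listing 8 that the calculation needs.

```python
def parse_stats(text):
    """Turn 'STAT name value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def hit_ratio(stats):
    """Return get_hits / cmd_get, or 0.0 before any get has run."""
    gets = int(stats["cmd_get"])
    return int(stats["get_hits"]) / gets if gets else 0.0


sample = """STAT cmd_get 2688
STAT cmd_set 961
STAT get_hits 1908
STAT get_misses 780
END"""
ratio = hit_ratio(parse_stats(sample))
print(f"{ratio:.1%}")
```

Guarding against a zero cmd_get avoids a division error on a freshly started instance that has not served any get yet.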
Conclusion
Caching is an integral part of any high-volume web application. I have used it successfully several times myself. If you choose memcached as your caching solution, I'm sure you will see how efficient it is.
In Part 2 of this series, you will learn how to integrate memcached into a Grails application. We will take that opportunity to discuss an exciting stack for scalable web application development and to apply some best practices. In the meantime, the material presented in this article is enough to get you started with memcached. I encourage you to install your own memcached instance and start experimenting with it.