Use memcached to improve site performance: reduce reads from databases and data sources

Martin Brown, freelance writer. Martin Brown has been a professional writer for more than seven years. He is the author of a wide range of books and articles. He specializes in a variety of development languages and platforms: Perl, Python, Java, JavaScript, Basic, Pascal, Modula-2, C, C++, REBOL, Gawk, shell script, Windows, Solaris, Linux, BeOS, Mac OS X, and more, as well as web programming, systems management, and integration. Martin is a Microsoft Subject Matter Expert (SME) and a regular contributor to ServerWatch.com, LinuxToday.com, and IBM developerWorks. He is also a regular blogger for Computerworld, The Apple Blog, and other sites. You can contact him through his Web site: http://www.mcslp.com.

Summary: The open source memcached tool is a cache for storing frequently used information, so that you do not have to load (and process) it from slow sources such as disks or databases. It can be deployed in a dedicated environment or used to put spare memory in an existing environment to work. Despite its simplicity, memcached is sometimes used improperly or in the wrong type of environment. This article describes when memcached is best used.

Introduction

Memcached is often used to speed up application processing. Here, we focus on best practices for deploying it within your applications and environments. This includes what should and should not be stored, how to handle the flexible distribution of data, and how to control the way memcached and the data it holds are updated. We also look at support for high-availability solutions, such as IBM WebSphere eXtreme Scale.

Every application, and web applications in particular, needs to optimize the speed with which it accesses information and returns it to the client. Often, however, the same information is returned again and again. Loading that data from a data source (a database or file system) is inefficient, and it is especially inefficient to run the same query every time the information is requested.

Although many web servers can be configured to use a cache when sending back information, this does not work well with the dynamic nature of most applications. This is where memcached comes in. It provides a general-purpose memory store that can hold anything, including native language objects, so you can store a variety of information and access it from many applications and environments.

Basic knowledge

Memcached is an open source project designed to use the spare RAM in multiple servers as a memory cache for frequently accessed information. The key word here is cache: memcached provides temporary, in-memory storage for information that can be loaded from elsewhere.

Consider, for example, a typical web-based application. Even a dynamic website is likely to have components or pieces of information that stay constant throughout the life of a page. On a blog site, the list of categories for an individual blog post is unlikely to change often between page views. Loading this information with a database query each time is relatively expensive, especially when the data has not changed. Figure 1 shows the parts of a blog page that can be cached.

Figure 1. Cacheable elements on a typical blog page

Extrapolate this structure to the other elements of the blog site (poster information, comments, even the blog posts themselves) and you can see that 10-20 database queries and formatting operations may be needed just to display the home page. Repeat this across hundreds or even thousands of page views each day, and your servers and applications are executing far more queries than are needed to display the page content.

By using memcached, you can store the formatted information loaded from the database in a form that can be used directly on the web page. And because the information is then loaded from RAM rather than from disk through a database and other processing, access to it is almost instantaneous.


The memcached interface is provided over a network connection. This means that you can share a single memcached server (or multiple servers, as shown later in this article) among multiple clients. The network interface is fast and, to improve performance, the server by default uses neither authentication nor secure communication. This should not limit your deployment options, however: the memcached server should live on the inside of your network. The practicality of the network interface, and the ease with which you can deploy multiple memcached instances, let you use spare RAM on multiple machines to increase the overall size of your cache.

Memcached stores information as simple key/value pairs, similar to the hashes or associative arrays available in many languages. You store information in memcached by supplying a key and a value, and you retrieve it by requesting the information under that key.

Information is kept in the cache indefinitely, unless one of the following occurs:

  1. The memory allocated to the cache is exhausted. In this case, memcached uses the LRU (least recently used) method to remove entries from the cache. Entries that have not been used recently are removed first, starting with the one accessed longest ago.
  2. The entry is explicitly deleted. You can always delete an entry from the cache.
  3. The entry expires. Each entry can be given an expiry time, so that information stored against a key is cleared from the cache once it is too old.

These conditions can be combined with the logic of your application to ensure that the information in the cache is up to date. With these basics in mind, let's look at how best to use memcached in an application.
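The three removal rules above can be illustrated with a toy in-process model. This is a minimal sketch in Python, not memcached's actual implementation: the class name TinyCache, a capacity counted in entries (memcached limits memory, not the number of entries), and the injectable clock are all assumptions made for illustration.

```python
import time
from collections import OrderedDict


class TinyCache:
    """Toy model of the three removal rules (illustrative only)."""

    def __init__(self, capacity, clock=time.time):
        self.capacity = capacity      # max entries; real memcached limits bytes of RAM
        self.clock = clock            # injectable so expiry can be shown without waiting
        self.items = OrderedDict()    # key -> (value, expires_at or None), oldest first

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self.clock() + ttl
        self.items[key] = (value, expires_at)
        self.items.move_to_end(key)               # a new entry is the most recently used
        while len(self.items) > self.capacity:    # rule 1: memory is exhausted
            self.items.popitem(last=False)        # drop the least recently used entry

    def get(self, key):
        if key not in self.items:
            return None
        value, expires_at = self.items[key]
        if expires_at is not None and self.clock() >= expires_at:
            del self.items[key]                   # rule 3: the entry has expired
            return None
        self.items.move_to_end(key)               # a read counts as a use
        return value

    def delete(self, key):
        self.items.pop(key, None)                 # rule 2: explicit deletion


now = [1000.0]                                    # fake clock, advanced by hand
cache = TinyCache(capacity=2, clock=lambda: now[0])
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")                                    # "a" becomes more recent than "b"
cache.set("c", 3)                                 # cache is full, so "b" is evicted
evicted = cache.get("b")                          # None: "b" is gone
```

Passing a fake clock keeps the sketch deterministic; an application would rely on the server's own clock and simply pass an expiry time when storing a value.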

When to use memcached

When you use memcached to improve application performance, you modify some of the key processes and steps in the application.

When loading information, the typical sequence is shown in Figure 2.

Figure 2. Typical sequence for loading information to be displayed

Generally, the steps are:

  1. Execute one or more queries to load the information from the database
  2. Format the information for display (or further processing)
  3. Use or display the formatted data

When using memcached, you modify the application logic slightly to take the cache into account:

  • Try to load the information from the cache
  • If it exists, use the cached version of the information
  • If it does not exist:
    1. Execute one or more queries to load the information from the database
    2. Format the information for display or further processing
    3. Store the information in the cache
    4. Use the formatted data

Figure 3 summarizes these steps.

Figure 3. Loading information for display when using memcached

Loading data thus becomes, at most, a three-step process: load the data from the cache or, failing that, from the database, and store it in the cache if it was not already there.

The first time this happens, the data is loaded from the database or other data source as normal and then stored in memcached. Subsequent requests for the information pull it from memcached instead of loading it from the database, saving time and CPU cycles.
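The load sequence described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a definitive implementation: a plain dictionary stands in for the memcached client, and the functions load_categories_from_db and format_categories are hypothetical placeholders for the application's own query and formatting code.

```python
cache = {}          # stand-in for a memcached client; a real client talks to a server
db_queries = []     # records each simulated database hit


def load_categories_from_db():
    """Hypothetical slow query; stands in for real database access."""
    db_queries.append("SELECT name FROM categories")
    return ["linux", "perl", "python"]


def format_categories(names):
    """Format the raw rows into the HTML fragment the page needs."""
    items = "".join(f"<li>{name}</li>" for name in names)
    return f"<ul>{items}</ul>"


def get_category_list():
    html = cache.get("category-list")        # try the cache first
    if html is None:                         # cache miss
        rows = load_categories_from_db()     # 1. query the database
        html = format_categories(rows)       # 2. format for display
        cache["category-list"] = html        # 3. store in the cache
    return html                              # 4. use the formatted data


first = get_category_list()     # miss: goes to the database
second = get_category_list()    # hit: served from the cache
```

After both calls, db_queries holds a single entry, because the second call never reaches the database.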

The other side of the equation is to make sure that when you change information that may be stored in memcached, you update the memcached version at the same time as you update the back-end information. This modifies the typical sequence shown in Figure 4 slightly, giving the sequence shown in Figure 5.

Figure 4. Update or store data in a typical application

Figure 5 shows the process after memcached is used.

Figure 5. Update or store data when using memcached

Staying with the blog site as an example, when the blog system updates the list of categories in the database, the update should follow this sequence:

  1. Update the category list in the database
  2. Format the information
  3. Store the information in memcached
  4. Return the information to the client

Storage operations in memcached are atomic, so an update never leaves clients with only part of the data; they get either the old version or the new one.
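The update sequence can be sketched as follows. As before, this is only an illustration: the dictionaries named database and cache stand in for the real database and a memcached client, and add_category and format_categories are hypothetical names, not part of any library.

```python
database = {"categories": ["linux", "perl"]}   # stand-in for the real database
cache = {}                                      # stand-in for a memcached client


def format_categories(names):
    """Format the category names as an HTML list."""
    return "<ul>" + "".join(f"<li>{n}</li>" for n in names) + "</ul>"


def add_category(name):
    database["categories"].append(name)                 # 1. update the database
    html = format_categories(database["categories"])    # 2. format the information
    cache["category-list"] = html                       # 3. store it in memcached
    return html                                         # 4. return it to the client


page_fragment = add_category("python")
```

Because the cache entry is rewritten in the same code path as the database update, readers of category-list see the new list on their next request without running a query.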

For most applications, these two operations are the only things you need to worry about. When data that people use is accessed, it is automatically added to the cache, and when that data changes, it is automatically updated in the cache.

Key, namespace, and value

Another important consideration with memcached is how you organize and name the data stored in the cache. The blog site example shows that you need a consistent naming structure, so that you can find the blog categories, history, and other information by name, whether you are loading the information (and populating the cache) or updating the data (and again updating the cache).

The exact naming scheme is specific to your application, but you can usually use a structure similar to one your application already has, typically based on some kind of unique identifier. Such identifiers are already at hand when you pull information from a database or work with a collection of information.

Taking the blog as an example, the list of categories could be stored under the key category-list. The values associated with a single post can be stored against the post ID, for example blogpost-29, and the comments for that entry can be stored in blogcomments-29, where 29 is the ID of the blog post. In this way, you can store a wide variety of information in the cache, using different prefixes to identify it.

The simplicity of the memcached key/value store (and its lack of security) means that if you want to support multiple applications on the same memcached servers, you should consider some additional form of qualifier to identify data as belonging to a particular application. For example, you could add an application prefix, as in blogapp:blogpost-29. Keys are free-form, so almost any string can serve as a key name; the memcached protocol only requires that a key be no longer than 250 bytes and contain no whitespace or control characters.
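A small helper can keep key names consistent across an application. The sketch below is one possible convention, not a memcached requirement: the function name cache_key and its parameters are hypothetical, and the validity check reflects the memcached text protocol's limits on keys (at most 250 bytes, no whitespace or control characters).

```python
def cache_key(kind, ident=None, app=None):
    """Build a key such as 'blogpost-29' or 'blogapp:blogpost-29'."""
    key = kind if ident is None else f"{kind}-{ident}"
    if app is not None:
        key = f"{app}:{key}"             # application prefix for shared servers
    encoded = key.encode("utf-8")
    # Reject keys that are too long or contain spaces or control characters
    if len(encoded) > 250 or any(b <= 32 or b == 127 for b in encoded):
        raise ValueError(f"invalid memcached key: {key!r}")
    return key


categories_key = cache_key("category-list")
post_key = cache_key("blogpost", 29)
comments_key = cache_key("blogcomments", 29, app="blogapp")
```

Here post_key is blogpost-29 and comments_key is blogapp:blogcomments-29, matching the naming examples in the text.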

As for the values themselves, make sure that what you store in the cache suits your application. For the blog system, for example, you might prefer to store the objects that the blog application uses to format the blog information rather than the raw HTML. This is more practical if the same underlying structure is used in several places within the application.

The client interfaces for most languages, including Java, Perl, and PHP, can serialize language objects for storage in memcached. This lets you store whole objects and later retrieve them from the memory store, rather than rebuilding them by hand in your application. Many objects, or the structures they use, are based on some form of hash or array. For cross-language environments, such as sharing the same information between a JSP environment and a JavaScript environment, you can use an architecture-neutral format such as JavaScript Object Notation (JSON) or XML.
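As a rough illustration of the cross-language approach, the following Python sketch serializes a post to JSON before storing it. The dictionary named cache is a stand-in for a memcached client, and the structure of the post record is invented for the example.

```python
import json

cache = {}   # stand-in for a memcached client

post = {"id": 29, "title": "Using memcached", "categories": ["linux", "perl"]}

# Serialize to a JSON string before storing; any language can parse it back
cache["blogpost-29"] = json.dumps(post)

# Later, possibly from a different application or language
restored = json.loads(cache["blogpost-29"])
```

The trade-off is that JSON carries only plain data (strings, numbers, lists, and maps), whereas a language-specific serializer can round-trip full objects but ties the cached value to that language.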

Populating and using memcached

As an open source product, and one originally developed to work within an existing open source environment, memcached is supported by a wide range of environments and platforms. There are many interfaces for communicating with a memcached server, often with several implementations for each language. See References for commonly used libraries and toolkits.

It is impractical to list every supported interface and environment, but all of them support the basic API provided by the memcached protocol. The descriptions below are simplified, and should be read in the context of each language, where errors may be indicated by different values. The main functions are:

  • get(key) - Retrieves the information stored in memcached against the given key. Returns an error if the key does not exist.
  • set(key, value [, expiry]) - Stores the given value in the cache under the identifier key. If the key already exists, its value is updated. The expiry time is in seconds. A value of up to 30 days (30*24*60*60) is treated as relative to the current time; a larger value is treated as an absolute time (seconds since the epoch).
  • add(key, value [, expiry]) - Adds the key to the cache if it does not exist, and returns an error if it does. This is useful when you want to add a new key explicitly without overwriting one that already exists.
  • replace(key, value [, expiry]) - Updates the value of the given key, returning an error if the key does not exist.
  • delete(key [, time]) - Deletes the key/value pair from the cache. If you supply a time, attempts to add a new value under this key are blocked for that period. The timeout lets you ensure that the value is always reread from your data source.
  • incr(key [, value]) - Increments the given key by 1 or by the supplied value. Works only on numeric values.
  • decr(key [, value]) - Decrements the given key by 1 or by the supplied value. Works only on numeric values.
  • flush_all - Invalidates (or expires) all current entries in the cache.
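The 30-day rule for expiry values is easy to get wrong, so here is a small Python sketch of how an expiry argument is interpreted. The function name expiry_to_timestamp is hypothetical, and two details go beyond the description above and reflect memcached's behavior as commonly documented: an expiry of 0 means the entry never expires, and a value of exactly 30 days is still treated as relative.

```python
THIRTY_DAYS = 30 * 24 * 60 * 60   # 2,592,000 seconds


def expiry_to_timestamp(expiry, now):
    """Return the absolute Unix time at which an entry expires, or None."""
    if expiry == 0:
        return None               # 0 means the entry has no expiry time
    if expiry > THIRTY_DAYS:
        return expiry             # treated as an absolute Unix timestamp
    return now + expiry           # treated as a number of seconds from now


now = 1_700_000_000                                    # an arbitrary "current" time
one_minute = expiry_to_timestamp(60, now)              # relative: now + 60
absolute = expiry_to_timestamp(1_700_003_600, now)     # absolute: used as given
```

A practical consequence is that you cannot ask for a relative lifetime longer than 30 days; for that, compute the absolute timestamp yourself and pass it as the expiry.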

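For example, in Perl, a basic set operation is handled as shown in Listing 1.

Listing 1. Basic set operation in Perl

use strict;
use warnings;
use Cache::Memcached;

# Connect to a memcached server on the local machine (default port 11211)
my $cache = Cache::Memcached->new({
    servers => ['localhost:11211'],
});

# Store the value 'myvalue' under the key 'mykey'
$cache->set('mykey', 'myvalue');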

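The same basic operation in Ruby is shown in Listing 2.

Listing 2. Basic set operation in Ruby

require 'memcache'

# Connect to the memcached server at 192.168.0.100, port 11211
memc = MemCache.new('192.168.0.100:11211')

# Store the value "myvalue" under the key "mykey"
memc["mykey"] = "myvalue"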

Both examples show the same basic structure: configure the memcached server, then assign or set a value. Other interfaces are available, including ones for Java technology, which let you use memcached from WebSphere applications. The memcached interface classes allow Java objects to be serialized directly into memcached, which makes it easy to store and load complex structures. When deploying in an environment such as WebSphere, two things matter most: the resilience of the service (what to do when memcached is unavailable), and how to increase the size of the cache to improve performance when you use multiple application servers or an environment such as WebSphere eXtreme Scale. Let's look at each of these issues in turn.

Resilience and availability

One of the most common questions about memcached is: "What happens if the cache is unavailable?" As the previous sections made clear, the information in the cache should never be the only copy of that information. You must be able to load the data stored in the cache from somewhere else.

Although being unable to reach the cache slows the application down, it should not stop the application from working. Two scenarios can occur:

  1. If the memcached service goes down, the application should fall back to loading the information from the original data source and formatting it for display. The application should also keep trying to load and store information in memcached.
  2. Once the memcached server becomes available again, the application should automatically try to store data in it. There is no need to force a reload of the cached data; you can use standard accesses to load the information and populate the cache. Eventually, the cache fills up again with the most commonly used data.

To reiterate, memcached is a cache of information, not the only source of it. Losing the memcached server should not bring down your application, although it does mean that performance suffers until the server is back. In practice, the memcached server is relatively simple and, although not absolutely fault-free, its simplicity means that it rarely fails.
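The fallback behavior can be sketched in Python as follows. Everything here is illustrative: FlakyCache is a made-up stand-in for a memcached client whose server can go down, CacheUnavailable is a hypothetical exception (real client libraries signal failure in different ways, and some simply return an empty result), and build_from_source stands in for the application's slow path.

```python
class CacheUnavailable(Exception):
    """Hypothetical error raised when the cache server cannot be reached."""


class FlakyCache:
    """Stand-in for a memcached client whose server can go down."""

    def __init__(self):
        self.up = True
        self.data = {}

    def get(self, key):
        if not self.up:
            raise CacheUnavailable(key)
        return self.data.get(key)

    def set(self, key, value):
        if not self.up:
            raise CacheUnavailable(key)
        self.data[key] = value


def build_from_source(key):
    """Hypothetical slow path: load and format from the original data source."""
    return f"<div>content for {key}</div>"


def fetch(cache, key):
    try:
        value = cache.get(key)
    except CacheUnavailable:
        value = None                     # treat an outage like a cache miss
    if value is None:
        value = build_from_source(key)   # fall back to the original source
        try:
            cache.set(key, value)        # keep trying to repopulate the cache
        except CacheUnavailable:
            pass                         # still down; the page is served anyway
    return value


cache = FlakyCache()
cache.up = False
during_outage = fetch(cache, "blogpost-29")   # served from the source, nothing cached
cache.up = True
after_recovery = fetch(cache, "blogpost-29")  # served from the source, then cached
```

No special recovery step is needed: once the server is back, ordinary requests refill the cache with whatever data is actually being used.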

Distributing your cache

The memcached server is simply a cache that stores values against keys over the network. If you have several machines, it is natural to want to set up a memcached instance on every machine with memory to spare, providing one large networked RAM cache.

Faced with this idea, it is tempting to assume that you need some kind of distribution or replication mechanism to copy the key/value pairs between machines. The problem with that approach is that it reduces the available RAM cache rather than increasing it. Figure 6 shows three application servers, each with access to one memcached instance.

Figure 6. Incorrect use of multiple memcached instances

Although each memcached instance is 1 GB in size (giving 3 GB of RAM cache in total), if each application server uses only its own cache (or if data is replicated between the memcached instances), the installation as a whole still has only 1 GB of cache, duplicated on every instance.

Because memcached serves information over a network interface, a single client can access data from any memcached instance it can reach. If the data is not copied or replicated across the instances, every application server has a 3 GB RAM cache available, as shown in Figure 7.

Figure 7. Correct use of multiple memcached instances

The problem with this approach is choosing which server stores each key/value pair, and working out which memcached server to talk to when you want the value back. The solution is to ignore complexities such as lookup tables, and not to expect the memcached server to handle the process for you. Instead, the memcached client keeps things simple.

Rather than looking this information up, the memcached client applies a simple hashing algorithm to the key you specify when storing the information. When you store or retrieve information with a list of memcached servers, the client uses a hashing algorithm that always gives the same result for the same key to derive a numeric value from the key. For example, the key mykey is converted to the value 23875. It does not matter whether you are saving or retrieving the information: the key is always used as the unique identifier when loading from the memcached servers, so in this example "mykey" always hashes to 23875.

If you have two servers, the memcached client performs a simple calculation on this number (for example, a modulus) to decide whether the value should be stored on the first or the second configured memcached instance.

When you store a value, the client computes the hash value from the key and picks the server to store it on. When you retrieve a value, the client computes the same hash value from the key and therefore picks the same server to fetch the information from.

If you use the same list of servers (in the same order) on every application server, each application server selects the same server whenever it needs to save or retrieve a given key. In this example, you now have 3 GB of shared memcached space rather than three copies of the same 1 GB of space, which gives you more cache to use and is likely to improve the performance of an application with many users.
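The server selection described above can be sketched in Python. This is a simplified illustration, not the algorithm of any particular client library: CRC32 is used here only because it is a stable hash available in the standard library, the function name pick_server is hypothetical, and the second and third server addresses are invented for the example.

```python
import zlib

SERVERS = ["192.168.0.100:11211", "192.168.0.101:11211", "192.168.0.102:11211"]


def pick_server(key, servers=SERVERS):
    """Map a key to one server using hash(key) modulo the number of servers."""
    hashed = zlib.crc32(key.encode("utf-8"))   # the same key always gives the same number
    return servers[hashed % len(servers)]      # the modulus selects a position in the list


store_target = pick_server("mykey")    # server chosen when storing
fetch_target = pick_server("mykey")    # the same server is chosen when retrieving
```

Because the result is a position in the list, every application server must configure the same servers in the same order. It also hints at one of the complexities of this scheme: with a plain modulus, adding or removing a server changes the mapping for most keys.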

This process has its complexities (for example, what happens when a server is unavailable). For more information, see the relevant documentation (see References).

How not to use memcached

Despite the simplicity of memcached, memcached instances are still sometimes used incorrectly.

Memcached is not a database

The most common misuse of memcached is to treat it as a data store rather than a cache. The primary purpose of memcached is to speed up responses for data that would otherwise take a long time to build or retrieve from another data source. A typical example is retrieving information from a database, especially when that information has to be formatted or processed before it is shown to the user. Memcached is designed to hold that information in memory so that the same work is not repeated every time the data is requested.

Do not use memcached as the only source of the information your application needs to run; the data should always be obtainable from another source. Also remember that memcached is only a key/value store. You cannot run queries over the data or iterate over the contents to extract information. Use it to store blocks of data or objects that you use as a whole.

Do not cache database rows or files

Although you can use memcached to store rows of data as loaded from a database, that amounts to query caching, and most databases provide their own query caching mechanisms. The same goes for other objects, such as images or files from the file system. Many applications and web servers already have well-optimized solutions for this kind of work.

You get more utility and performance out of memcached if you use it to store whole blocks of information after loading and formatting. Using the blog site as an example, the best point at which to store the information is after the blog categories have been formatted as an object, or even after they have been formatted into HTML. A blog page can then be constructed by loading the individual components (such as the blog post, category list, and post history) from memcached and writing the completed HTML back to the client.

Memcached is not secure

To ensure maximum performance, a default memcached setup provides no security of any kind: neither authentication nor encryption. This means that access to your memcached servers should be controlled in two ways: first, by placing them on the same private side of your application deployment environment; and second, if security is a must, by using a UNIX socket and allowing only applications on the current host to access the memcached server.

This sacrifices some flexibility and resilience, along with the ability to share a RAM cache across multiple machines on the network, but it is the only solution for securing your memcached data in this situation.

Do not limit yourself

Despite the things you should not use memcached instances for, the flexibility of memcached should not be overlooked. Because memcached sits at the same architectural level as your application, it is easy to integrate and connect to, and changing an application to take advantage of memcached is not complicated. And because memcached is only a cache, a problem with it need not stop your application from running. Used correctly, it reduces the load on the rest of your server infrastructure (by reducing reads from databases and data sources), which means supporting more clients without more hardware.

But remember, it is just a cache!

Conclusion

In this article, we looked at memcached and how best to use it. We saw how information is stored, how to choose sensible keys, and how to choose what information to store. We also covered some key deployment issues that every memcached user faces, including the use of multiple servers, what to do when a memcached instance dies, and (perhaps most important) how not to use memcached.

As an open source application with a simple and straightforward goal, memcached draws its power and utility from that simplicity. By providing a huge RAM store for information, making it available over the network, and making it accessible through a wide range of interfaces and languages, memcached can be integrated into a huge variety of installations and environments.

References

Learning

  • The MySQL memcached documentation provides a lot of information about using memcached in a typical database deployment environment.
  • Learn about solidDB, IBM's commercial caching solution, through the IBM solidDB product family.
  • Stay current with developerWorks technical events and webcasts.
  • Check out upcoming conferences, trade shows, webcasts, and other events around the world that are of interest to IBM open source developers.
  • Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates, as well as the most popular articles and tutorials, to help you develop with open source technologies and use them with IBM products.
  • Watch the free developerWorks On demand demos to learn about IBM and open source technologies and product functions.

Get products and technologies

  • Memcached.org provides information about memcached and how to download and install it.
  • The Cache::Memcached interface for Perl provides a comprehensive interface.
  • For Java technology, you can use the com.danga.MemCached class, which provides some additional failover and multi-instance extensions.
  • Innovate your next development project with IBM product evaluation trial software, available for download.
  • Download IBM product evaluation versions or explore the IBM SOA Sandbox for People, and get your hands on application development tools and middleware products from DB2, Lotus, Rational, Tivoli, and WebSphere.

Discussion

  • Participate in developerWorks blogs and get involved in the developerWorks community.
  • Join the My developerWorks Chinese-language community.

From: http://www.ibm.com/developerworks/cn/opensource/os-memcached

This article was reposted using B3log Solo from the blog "The Art of Simple Design". Original article: http://88250.b3log.org/articles/2010/12/21/1292901251292.html
