Using memcached to improve site performance: reducing reads from databases and data sources

Martin Brown, freelance writer and developer. Martin Brown has been a professional writer for more than seven years. He is the author of numerous books and articles on a wide range of subjects. His expertise spans a variety of development languages and platforms (Perl, Python, Java™, JavaScript, Basic, Pascal, Modula-2, C, C++, Rebol, Gawk, shell script, Windows, Solaris, Linux®, BeOS, Mac OS X, and more) as well as Web programming, systems management, and integration. Martin is a subject matter expert (SME) for Microsoft® and a regular contributor to ServerWatch.com, LinuxToday.com, and IBM developerWorks; he also blogs regularly at Computerworld, The Apple Blog, and other sites. You can contact him through his Web site: http://www.mcslp.com.


Introduction: The open source memcached tool is a cache for storing commonly used information, so that you do not have to load (and process) that information from slow sources such as disks or databases. It can be deployed on dedicated machines or as a way of using up spare memory within an existing environment. Although memcached is simple, it is sometimes misused or applied in the wrong kind of environment. In this article, learn when memcached is best used.

Brief introduction



Memcached is often used to speed up application processing, and in this article we focus on best practices for deploying it within your applications and environments. This includes what you should and should not store, how to handle the flexible distribution of the data, and how to regulate the methods used to update memcached and the stored versions of the data. We also cover support for high-availability solutions, such as IBM WebSphere® eXtreme Scale.



All applications, and Web applications in particular, need to optimize the speed with which they access information and return it to the client. Often, however, the same information is returned again and again. Loading the data from the data source (a database or file system) is inefficient, especially if you run the same queries every time you want to access the information.



Although many Web servers can be configured to use a cache to send back information, that does not fit the dynamic nature of most applications. This is where memcached can help. It provides a general-purpose memory store that can hold anything, including native language objects, which lets you store a wide variety of information and access it from many applications and environments.

Memcached basics



Memcached is an open source project designed to use the spare RAM on multiple servers as a memory cache for frequently accessed information. The key word here is cache: memcached provides temporary storage, in memory, for information that can be loaded from elsewhere.



Consider, for example, a typical Web-based application. Even a dynamic Web site probably has some components or information that stay constant throughout the life of a page. Within a blog, the list of categories for individual blog posts is unlikely to change often between page views. Loading this information through a database query each time is comparatively expensive, especially when the data has not changed. Figure 1 shows some of the page fragments within a blog that can be cached.




Figure 1. Typical cacheable elements within a blog page



Extend this structure to other elements of the blog (poster information, comments, even the blog posts themselves) and you can identify 10 to 20 database queries and formatting operations that take place just to display the contents of the home page. Repeat this for hundreds or even thousands of page views each day, and your servers and applications execute far more queries than are needed to display the page contents.



By using memcached, you can store the formatted information loaded from the database in a form that is ready to be used directly on the Web page. And because the information is loaded from RAM, rather than from disk through a database and other processing, access to it is almost instantaneous.



Again, memcached is a cache for storing commonly used information, and with it you do not have to load and process information from slow resources, such as disks or databases.



The interface to memcached is provided over a network connection. This means that you can share a single memcached server (or multiple servers, as shown later in this article) among multiple clients. This network interface is very fast, and in order to improve performance, the server intentionally does not support authentication or secure communication. However, this should not limit deployment options. The memcached server should exist within your network. The practicality of the network interface and the ease with which multiple memcached instances can be deployed allow you to use excess RAM on multiple machines to increase the overall size of your cache.



The memcached storage model is a simple key/value pair, similar to the hash or associative array available in many languages. You store information in memcached by supplying a key and a value, and you recover it by requesting the information stored under a specific key.



Information is kept in the cache indefinitely unless one of the following occurs:

  • The memory allocated for the cache is exhausted. In this case, memcached uses the LRU (least recently used) method to remove entries from the cache: entries that have not been used recently are deleted first, starting with the one that was accessed longest ago.
  • The entry is explicitly deleted. You can always remove an entry from the cache.
  • The entry expires. Each entry has an expiry time, so that the information stored under a key can be purged from the cache when it is too old.
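The least-recently-used behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions: TinyLruCache is a hypothetical class written for this example, it counts entries rather than bytes, and it is not how memcached is implemented internally.

```ruby
# Toy model of a size-limited cache with least-recently-used eviction.
# TinyLruCache is a hypothetical name; memcached's real implementation differs.
class TinyLruCache
  def initialize(max_entries)
    @max_entries = max_entries
    @entries = {} # Ruby hashes keep insertion order: first key = least recently used
  end

  def set(key, value)
    @entries.delete(key)   # re-inserting moves the key to the "recent" end
    @entries[key] = value
    @entries.shift if @entries.size > @max_entries # evict the least recently used entry
    value
  end

  def get(key)
    return nil unless @entries.key?(key)

    value = @entries.delete(key) # a read also counts as a use
    @entries[key] = value
  end

  def delete(key)
    @entries.delete(key)         # explicit removal is always possible
  end

  def keys
    @entries.keys
  end
end

cache = TinyLruCache.new(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" is now the most recently used entry
cache.set("c", 3)  # the cache is full, so "b" is evicted
puts cache.keys.inspect
```

Reading "a" before adding "c" is what protects it from eviction; without that read, "a" would have been the entry removed.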



These situations can be used in combination with the logic of your application to ensure that the information in the cache is up to date. With these basics in place, let's look at how best to use memcached within an application.

When to use memcached



There are a number of key processes and steps in your application that you can modify when you use memcached to improve performance.



When loading information, the typical scenario is shown in Figure 2.




Figure 2. Typical sequence for loading information to display



In general, the steps are:

  1. Execute one or more queries to load the information from the database.
  2. Format the information so that it is suitable for display (or further processing).
  3. Use or display the formatted data.



When using memcached, the application logic changes slightly to accommodate the cache:

  1. Try to load the information from the cache.
  2. If it exists, use the cached version of the information.
  3. If it does not exist:
       a. Execute one or more queries to load the information from the database.
       b. Format the information so that it is suitable for display or further processing.
       c. Store the information in the cache.
       d. Use the formatted data.



Figure 3 is a summary of these steps.




Figure 3. Loading information for display when using memcached



Loading data becomes a process of at most three steps: load the data from the cache or, if it is not there, load it from the database and then store it in the cache.



The first time this process runs, the data is loaded as normal from the database or other data source and then stored in memcached. The next time you access the information, it is pulled from memcached instead of being loaded from the database, saving time and CPU cycles.
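The cache-first loading sequence described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: FakeCache is a hypothetical in-memory stand-in for a memcached client (so the example runs without a server), and load_categories_from_db and the category data are made up for illustration.

```ruby
# Stand-in for a memcached client so the sketch runs without a server.
# FakeCache and load_categories_from_db are hypothetical names.
class FakeCache
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value
  end
end

DB_QUERIES = [] # records each simulated database hit

def load_categories_from_db
  DB_QUERIES << :categories # pretend this is an expensive query
  ["news", "perl", "ruby"]
end

def category_list(cache)
  cached = cache.get("category-list")   # 1. try the cache first
  return cached unless cached.nil?      # 2. hit: use the cached version

  rows = load_categories_from_db        # 3a. miss: query the data source
  formatted = rows.map(&:capitalize)    # 3b. format for display
  cache.set("category-list", formatted) # 3c. store for next time
  formatted                             # 3d. use the formatted data
end

cache = FakeCache.new
first = category_list(cache)  # loads from the "database"
second = category_list(cache) # served from the cache
puts first.inspect
puts DB_QUERIES.length
```

The second call never touches the simulated database, which is the saving the article describes.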



The other side of the process is making sure that when you change information that may be stored in memcached, you update the memcached version at the same time as the backend information. This requires a slight change to the typical sequence shown in Figure 4; the modified sequence is shown in Figure 5.




Figure 4. Updating or storing data in a typical application



Figure 5 shows a process that has changed after using memcached.




Figure 5. Updating or storing data when using memcached



Using the blog site as the example again, when the blog system updates the list of categories in the database, the update should follow this sequence:

  1. Update the list of categories in the database.
  2. Store the information in memcached.
  3. Return the information to the client.



Storage operations within memcached are atomic, so a client never receives partially updated data; it gets either the old version or the new version.
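The update sequence can be sketched in the same spirit. This is an illustration only: FakeCache and the DATABASE hash are hypothetical stand-ins for a memcached client and a real database, and update_categories is a name invented for this example.

```ruby
# FakeCache and DATABASE are hypothetical stand-ins for memcached and a database.
class FakeCache
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value
  end
end

DATABASE = { "categories" => ["news"] }

def update_categories(cache, new_list)
  DATABASE["categories"] = new_list    # 1. update the backend first
  cache.set("category-list", new_list) # 2. then refresh the cached copy
  new_list                             # 3. return the information to the client
end

cache = FakeCache.new
cache.set("category-list", DATABASE["categories"])
update_categories(cache, ["news", "ruby"])
puts cache.get("category-list").inspect
```

Writing to the database before the cache means a failed database write never leaves a value in the cache that the backend does not have.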



For most applications, these two operations are the only ones you need to worry about. As you access the data that people use, it is added to the cache automatically, and changes to that data are updated in the cache automatically.

Keys, namespaces, and values



Another important consideration with memcached is how you organize and name the data that you store in the cache. From the blog example, it should be clear that you need a consistent naming structure so that you can load the blog categories, history, and other information, and then use the same names when loading the information (and updating the cache) and when updating the data (and, again, updating the cache).



The exact naming scheme depends on your application, but you can usually use a structure similar to the one your application already has, which is probably based on some kind of unique identifier. This is the case when you pull information from a database or when you organize a collection of information.



Using the blog example, you can store the list of categories in an entry with the key category-list. The values associated with a single post can be stored under a key based on the post ID, such as blogpost-29, and the comments for that post can be stored in blogcomments-29, where 29 is the ID of the blog post. In this way you can store a wide variety of information in the cache, using different prefixes to identify each kind of information.



The simplicity of the memcached key/value store (and its lack of security) means that if you want to support multiple applications on the same memcached server, you should consider using some other form of qualifier to identify data as belonging to a particular application. For example, you can add an application prefix, as in blogapp:blogpost-29. Keys are free-form, so you can use almost any string as the key name (memcached limits keys to 250 bytes and does not allow whitespace or control characters in them).
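A small helper can keep key names consistent across an application. The following is a sketch under stated assumptions: cache_key is a hypothetical function written for this example (it is not part of any memcached client library), and the prefix layout simply follows the blogapp:blogpost-29 convention used above. The check reflects memcached's rule that keys are at most 250 bytes and contain no whitespace or control characters.

```ruby
# cache_key is a hypothetical helper, not part of any memcached client library.
def cache_key(app, type, id = nil)
  key = id.nil? ? "#{app}:#{type}" : "#{app}:#{type}-#{id}"
  # memcached limits keys to 250 bytes and forbids whitespace and control characters
  if key.bytesize > 250 || key.match?(/[\s[:cntrl:]]/)
    raise ArgumentError, "invalid memcached key: #{key.inspect}"
  end
  key
end

puts cache_key("blogapp", "blogpost", 29)
puts cache_key("blogapp", "blogcomments", 29)
puts cache_key("blogapp", "category-list")
```

Building every key through one function makes it much harder for the code that loads data and the code that updates it to disagree about a name.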



When it comes to values, make sure that the information you store in the cache is appropriate for your application. For the blog system, for example, you might want to store the object that your blog application uses to format the post information, rather than the raw HTML. This is more practical if the same underlying structure is used in several places within the application.
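One way to store a structure rather than finished HTML is to serialize it to a language-neutral string before handing it to the cache. The sketch below uses JSON from Ruby's standard library; the post hash and its field names are invented for this example, and the resulting string would be the value passed to a cache set operation.

```ruby
require "json"

# A blog post as plain data; the field names are made up for this example.
post = {
  "id" => 29,
  "title" => "Using memcached",
  "categories" => ["caching", "performance"]
}

serialized = JSON.generate(post)  # a string that any JSON-capable client can read
restored = JSON.parse(serialized) # back to a Ruby hash with string keys

puts serialized
puts restored == post
```

A JSON string costs a parse on every read, but it can be consumed by clients written in other languages, which native object serialization generally cannot.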



Most language interfaces, including those for Java™, Perl, PHP, and others, can serialize language objects for storage in memcached. This lets you store and later recover whole objects from the memory store, instead of reconstructing them by hand within your application. Many objects, or the structures they use, are based on some kind of hash or array. For cross-language environments, for example when the same information is shared between a JSP environment and a JavaScript environment, you can use an architecture-neutral format such as JavaScript Object Notation (JSON) or even XML.

Populating and using memcached



As an open source product, and one that was originally developed to work within existing open source environments, memcached is supported on a wide range of environments and platforms. There are many interfaces for communicating with a memcached server, and often several implementations for a given language. See Resources for common libraries and toolkits.



It is not possible to list every supported interface and environment, but they all support the basic API provided by the memcached protocol. The descriptions below are simplified and should be read in the context of each language, where different values may be used to indicate an error. The main functions are:

  • get(key): gets the information stored in memcached under the specified key. Returns an error if the key does not exist.
  • set(key, value [, expiry]): stores the specified value in the cache under the given key. If the key already exists, it is updated. The expiry time is in seconds; a value of up to 30 days (30*24*60*60) is treated as a relative time, and a larger value is treated as an absolute time (seconds since the epoch).
  • add(key, value [, expiry]): adds the key to the cache if it does not exist, and returns an error if the key already exists. This is useful when you want to add a new key explicitly without updating it if it already exists.
  • replace(key, value [, expiry]): updates the value of the specified key, and returns an error if the key does not exist.
  • delete(key [, time]): deletes the key/value pair from the cache. If you supply a time, adding a new value with this key is blocked for the specified period. The timeout lets you ensure that the value is always re-read from your data source.
  • incr(key [, value]): increments the specified key by 1 or by the given value. Works only on numeric values.
  • decr(key [, value]): decrements the specified key by 1 or by the given value. Works only on numeric values.
  • flush_all: invalidates (or expires) all current entries in the cache.
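The rule for interpreting the expiry argument is easy to get wrong, so here it is as a small calculation. This is a sketch of the rule as described above, not code from memcached itself: expiry_to_epoch and THIRTY_DAYS are hypothetical names, the sample timestamps are arbitrary, and the treatment of 0 as "never expires" is memcached's documented behavior rather than something stated in this article.

```ruby
THIRTY_DAYS = 30 * 24 * 60 * 60 # 2_592_000 seconds

# Returns the absolute epoch time at which an entry would expire,
# or nil when expiry is 0 (memcached treats 0 as "never expires").
def expiry_to_epoch(expiry, now)
  return nil if expiry.zero?

  # Larger than 30 days: already an absolute timestamp. Otherwise: relative to now.
  expiry > THIRTY_DAYS ? expiry : now + expiry
end

now = 1_292_901_251 # an arbitrary "current time" for the example
puts expiry_to_epoch(3600, now)          # one hour from now
puts expiry_to_epoch(1_300_000_000, now) # taken as an absolute time
```

A common mistake is passing a relative value such as 60 days in seconds: because it exceeds the 30-day threshold it is read as an epoch time in 1970, so the entry expires immediately.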



In Perl, for example, a basic set operation can be handled as shown in Listing 1.




Listing 1. Basic set operations within Perl


				
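use Cache::Memcached;

# Connect to a memcached server running on the local machine
my $cache = new Cache::Memcached {
    'servers' => [
                   'localhost:11211',
                   ],
    };

# Store the value 'myvalue' under the key 'MyKey'
$cache->set('MyKey', 'myvalue');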





The same basic operations within Ruby are shown in Listing 2.




Listing 2. Basic set operations within Ruby


				
require 'memcache'

# Connect to the memcached server at 192.168.0.100
memc = MemCache::new '192.168.0.100:11211'

# Store the value "myvalue" under the key "MyKey"
memc["MyKey"] = "myvalue"





In both examples, you can see the same basic structure: set up the memcached server, and then assign or set the value. Other interfaces are available, including ones suitable for Java technology, which let you use memcached within your WebSphere applications. The memcached interface class allows Java objects to be serialized directly into memcached, making it easy to store and load complex structures. When deploying in an environment like WebSphere, two things are important: the resiliency of the service (what to do when memcached is unavailable) and how to increase the amount of cache storage to improve performance when you use multiple application servers or an environment such as WebSphere eXtreme Scale. Let's look at both issues next.

Resiliency and availability



One of the most common questions about memcached is: "What happens if the cache is not available?" As the previous sections make clear, the information in the cache should never be the only source of that information. You must be able to load the data stored in the cache from another location.



Although being unable to access information from the cache slows the application down, it should not stop the application from working. Several scenarios may occur:

  • If the memcached service goes down, the application should fall back to loading the information from the original data source and formatting it as required for display. The application should also keep trying to load and store information in memcached.
  • Once the memcached server is available again, the application should automatically try to store data. There is no need to force a reload of the cached data; you can use the standard access path to load the information and populate the cache. Eventually, the cache fills up again with the most commonly used data.
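The fallback behavior described above can be sketched as follows. Everything here is illustrative: CacheDown, BrokenCache, load_from_source, and fetch are hypothetical names, and BrokenCache simply simulates a memcached server that cannot be reached. A real client library reports connection failures in its own way, so the exception class would differ.

```ruby
# CacheDown, BrokenCache, and load_from_source are hypothetical names.
class CacheDown < StandardError; end

# Simulates a memcached server that cannot be reached.
class BrokenCache
  def get(_key)
    raise CacheDown, "memcached is unreachable"
  end

  def set(_key, _value)
    raise CacheDown, "memcached is unreachable"
  end
end

def load_from_source(key)
  "fresh value for #{key}" # stands in for the database query plus formatting
end

def fetch(cache, key)
  begin
    cached = cache.get(key)
    return cached unless cached.nil?
  rescue CacheDown
    # cache unavailable: fall through to the original data source
  end

  value = load_from_source(key)
  begin
    cache.set(key, value) # keep trying to populate the cache
  rescue CacheDown
    # still down: serve the value anyway
  end
  value
end

puts fetch(BrokenCache.new, "category-list")
```

Because every request still attempts the set, the cache starts filling again as soon as the server comes back, with no special recovery step.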



Again, memcached is a cache of the information, not its only source. An unavailable memcached server should not be the end of your application, although it does mean that performance will suffer until the memcached server is back to normal. In practice, the memcached server is relatively simple and, although not absolutely fault free, that simplicity means it rarely goes wrong.

Distributing the cache



The memcached server is just a cache that stores values against keys over a network. If you have more than one machine, it is tempting to set up a memcached instance on every spare machine to provide one very large networked RAM cache.



With this idea comes the temptation to assume that you need some kind of distribution or replication mechanism to copy the key/value pairs between machines. The problem with that approach is that it reduces the amount of RAM cache available rather than increasing it. Figure 6 shows three application servers, each with access to a memcached instance.




Figure 6. Incorrect use of multiple memcached instances



Although each memcached instance is 1 GB in size (for a total of 3 GB of RAM cache), if each application server uses only its own cache (or if the data is replicated between the memcached instances), the whole installation still has only 1 GB of cache, duplicated on each instance.



Because memcached provides information through a network interface, a single client can access data from any memcached instance that it can access. If the data is not replicated across each instance, then a 3 GB RAM cache can eventually be available on each application server, as shown in Figure 7.




Figure 7. The correct use of multiple memcached instances



The difficulty with this approach is choosing which server stores a given key/value pair, and deciding which memcached server to talk to when you want to get a value back. The solution is to avoid complexities such as lookup tables, and not to expect the memcached server to handle the process for you. Instead, the memcached client keeps things simple.



Rather than having to look this information up, the memcached client simply applies a hashing algorithm to the key that you specify when storing the information. When you store or retrieve information using a list of memcached servers, the client derives a numeric value from the key using a consistent hashing algorithm, that is, one that always produces the same number for the same key. For instance, the key MyKey might be converted to the value 23875. It does not matter whether you are saving or loading information: the key is always used as the unique identifier, so in this case "MyKey" always hashes to 23875.



If there are two servers, the memcached client performs a simple calculation on this number (for example, a modulus) to determine whether the value should be stored on the first or the second configured memcached instance.



When you store a value, the client derives the hash value from the key and uses it to choose the server on which the value is stored. When you get a value, the client derives the same hash value from the key and selects the same server to obtain the information.



If you use the same list of servers (in the same order) on every application server, each application server selects the same memcached server when it needs to save or retrieve the same key. In this example, the 3 GB of memcached space is now shared, rather than the same 1 GB being duplicated, which makes more cache available and is likely to improve the performance of an application with many users.
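The hash-then-modulus selection can be sketched in a few lines. This is an illustration under stated assumptions, not the algorithm of any particular client: it uses CRC32 from Ruby's standard zlib library as the hash function, server_for is a hypothetical helper, and only the first address in SERVERS comes from this article; the other two are made up.

```ruby
require "zlib"

# The first address is from the article's Ruby listing; the others are invented.
SERVERS = [
  "192.168.0.100:11211",
  "192.168.0.101:11211",
  "192.168.0.102:11211"
].freeze

# Pick a server by hashing the key and taking the remainder by the server count.
def server_for(key, servers = SERVERS)
  servers[Zlib.crc32(key) % servers.length]
end

puts server_for("MyKey")
puts server_for("MyKey") == server_for("MyKey") # same key, same server, every time
```

With plain modulus selection, adding or removing a server changes the divisor, so most keys map to a different server afterwards and are effectively lost from the cache until they are reloaded.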



This process has complexities of its own (such as what happens when a server is unavailable); see the documentation listed in Resources for more information.

How not to use memcached



Although memcached is simple, memcached instances are sometimes used incorrectly.

Memcached is not a database



Probably the most common misuse of memcached is to treat it as a data store rather than as a cache. The primary purpose of memcached is to improve the response time for data that would otherwise take a long time to build or recover from another data source. A typical example is recovering information from a database, particularly when the information is formatted or processed before it is shown to the user. Memcached is designed to keep that information in memory so that the same work is not repeated every time the data is needed.



You must not use memcached as the only source of the information you need to run your application; the data should always be obtainable from another source. Also remember that memcached is just a key/value store. You cannot run queries over the data, or iterate over the contents to extract information. Use it to store blocks of data or objects that you want to use wholesale.

Do not cache database rows or files



Although you can use memcached to store rows of data loaded from a database, this amounts to query caching, and most databases provide their own query caching mechanisms. The same is true of other objects, such as images or files from the file system. Many applications and Web servers already have well-optimized solutions for this kind of work.



You get more benefit and a greater performance improvement from memcached if you use it to store whole blocks of information after they have been loaded and formatted. Taking the blog example again, the best point to store information is after the blog categories have been formatted as an object, or even after they have been formatted into HTML. The blog page can then be built by loading the various components (blog post, category list, post history, and so on) from memcached and writing the completed HTML back to the client.

Memcached is not secure



To ensure maximum performance, memcached does not provide any form of security, authentication, or encryption. This means that access to your memcached servers should be controlled by placing them in the same private zone as your application deployment environment and, if security is essential, by using a UNIX® socket and allowing only applications on the current host to access the memcached server.



This sacrifices some flexibility and resiliency, as well as the ability to share the RAM cache across multiple machines on the network, but it is one of the only ways to secure your memcached data in the current situation.

Don't limit yourself



Despite the cases in which you should not use memcached, do not overlook its flexibility. Because memcached sits at the same architectural level as your application, it is easy to integrate and connect to. Changing your application to take advantage of memcached is not complicated either. In addition, because memcached is only a cache, it does not stop your application from running if a problem occurs. Used correctly, it reduces the load on the rest of your server infrastructure (by reducing reads from databases and data sources), which means you can support more clients without needing more hardware.



But keep in mind that it is just a cache.

Conclusion



In this article, we looked at memcached and how best to use it. We saw how information is stored, how to choose sensible keys, and how to choose which information to store. We also discussed some of the key deployment issues that all memcached users encounter, including the use of multiple servers, what to do when a memcached instance dies, and (perhaps most importantly) the situations in which memcached should not be used.



Memcached is an open source application with a very simple and straightforward aim, and its power and usefulness come from that simplicity. By providing a large RAM store for information, making it available over the network, and making it accessible through a wide variety of interfaces and languages, memcached can be integrated into a huge range of installations and environments.



Resources

Learn

  • The MySQL memcached documentation provides a lot of information about how to use memcached within a typical database deployment environment.
  • Learn about IBM's commercial caching solution, solidDB®, by exploring the IBM solidDB product family.
  • Stay current with developerWorks technical events and webcasts.
  • Check out upcoming conferences, trade shows, webcasts, and other events around the world that are of interest to IBM open source developers.
  • Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates, as well as the most popular articles and tutorials, to help you develop with open source technologies and use them with IBM products.
  • Watch the free developerWorks On demand demos to learn about IBM and open source technologies and product functions.

Get products and technologies

  • memcached.org provides information about memcached and how to download and install it.
  • The Cache::Memcached interface for Perl provides an extensive interface.
  • For Java technology, you can use the com.danga.MemCached class, which provides some additional failover and multiple-instance extensions.
  • Improve your next development project with IBM product evaluation software, available for download.
  • Download IBM product evaluation software or the IBM SOA Sandbox for People, and start using application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

Discuss

  • Participate in developerWorks blogs and get involved in the developerWorks community.
  • You are welcome to join the developerWorks Chinese community.




Reposted from: http://www.ibm.com/developerworks/cn/opensource/os-memcached

This article was published through B3log Solo synchronization; original address: http://88250.b3log.org/articles/2010/12/21/1292901251292.html

