Make memcached and MySQL work better.

Source: Internet
Author: User
Tags memcached mysql query cpu usage server memory

Source: Http://chaoqun.17348.com/2008/08/memcached_work_with_mysql

This is drawn from the experience of Fotolog, reputedly a larger site than Flickr. Fotolog deployed 51 memcached instances on 21 servers, with 254 GB of cache space available in total and up to 175 GB of content cached, which is far larger than the entire database of many websites. The original article is "A Bunch of Great Strategies for Using Memcached and MySQL Better Together". As before, this is a selective translation mixed with my own understanding. Thanks to Todd Hoff, who keeps giving us case studies to learn from. It also shows the open attitude toward technology abroad, unlike here, where people keep even their small tricks tucked away. Right, on to the subject.

I. About memcached

You don't know it yet? Then you will fail the interview, so hurry to the official website at http://www.danga.com/memcached/, or search for it on Google. Hard drives are always too slow, so keep the data in memory. If you have only one server, APC (used by Facebook), eAccelerator, or XCache (developed by a Chinese developer) are recommended, since these products perform better on a single machine. If you need a distributed caching scheme, use memcached.

II. How does memcached work alongside MySQL?

    • Database sharding: to solve the problem of scaling database writes, shard the database and deploy the shards on different servers. Otherwise there is only one primary server, and write operations become both the bottleneck and a possible "single point of failure". Shards are generally divided along business lines: split the business as far as possible and turn unrelated parts into independent services.
    • MySQL fronted by a pool of memcached servers, to cope with reads: the application first tries to get the data from memcached; on a miss it fetches the data from the database and saves it in memcached. I once read an article about an application that got 95% of its data from memcached, 3% from the MySQL query cache, and only the remaining 2% from table lookups. Compare that with your own application: how big is the gap?
    • Solve read problems with MySQL replication (master-slave)
      Split reads from writes with master-slave replication, and let multiple slaves answer the application's read operations.
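The "ask memcached first, fall back to the database" read path can be sketched roughly as follows. This is only a minimal illustration: `DictCache`, `fetch_user`, and `fake_query_db` are made-up names, and a plain dictionary stands in for a real memcached client and a real MySQL query.

```python
class DictCache:
    """In-memory stand-in for a memcached client (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None signals a cache miss

    def set(self, key, value):
        self._data[key] = value


def fetch_user(cache, query_db, user_id):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:
        user = query_db(user_id)   # slow path: MySQL (or a read slave)
        if user is not None:
            cache.set(key, user)   # populate the cache for later readers
    return user


# Usage with a fake database that records how often it is queried.
db_calls = []


def fake_query_db(user_id):
    db_calls.append(user_id)
    return {"id": user_id, "name": "alice"}


cache = DictCache()
first = fetch_user(cache, fake_query_db, 7)   # miss: goes to the database
second = fetch_user(cache, fake_query_db, 7)  # hit: served from the cache
```

In this sketch the second call never reaches the fake database, which is the whole point of putting memcached in front of MySQL.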

III. Why not use the MySQL query cache?

We all know that MySQL has a query cache that can cache the results of previous queries, but in practice it does not help much. The shortcomings of the MySQL query cache are:

    • There can only be one instance
      This means the upper limit of what you can store is the available memory of that one server. How much memory can a single server have? How much can you store?
    • The MySQL query cache is invalidated by any write operation
      As soon as the table's content changes even slightly, even if only other rows were changed, the cached query results expire
    • The MySQL query cache can only cache database rows
      Nothing else, such as an array or an object, can be stored, whereas memcached can in theory cache anything, even files ^_^
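To make the second and third points concrete, here is a toy comparison. It is a deliberately simplified model written for this article, not MySQL's or memcached's real implementation; both classes and their method names are hypothetical.

```python
class TableScopedQueryCache:
    """Toy model of a query cache that is invalidated per table
    (an assumption for illustration, not MySQL's actual code)."""

    def __init__(self):
        self._results = {}  # (table, sql) -> cached result rows

    def get(self, table, sql):
        return self._results.get((table, sql))

    def put(self, table, sql, result):
        self._results[(table, sql)] = result

    def on_write(self, table):
        # Any write to the table drops every cached result for that table.
        for key in [k for k in self._results if k[0] == table]:
            del self._results[key]


class KeyedObjectCache:
    """Toy key-value cache: a write only touches the key it concerns."""

    def __init__(self):
        self._objects = {}

    def get(self, key):
        return self._objects.get(key)

    def set(self, key, value):
        self._objects[key] = value

    def delete(self, key):
        self._objects.pop(key, None)


qc = TableScopedQueryCache()
qc.put("users", "SELECT * FROM users WHERE id = 1", [(1, "alice")])
qc.put("users", "SELECT * FROM users WHERE id = 2", [(2, "bob")])
qc.on_write("users")  # e.g. an UPDATE that only changed user 2

oc = KeyedObjectCache()
oc.set("user:1", {"name": "alice", "friends": [2, 3]})  # an arbitrary object
oc.set("user:2", {"name": "bob"})
oc.delete("user:2")  # invalidate only the entry that changed
```

After the write, the toy query cache has lost the result for user 1 as well, while the keyed cache still holds the untouched `user:1` object.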

IV. Fotolog's caching techniques

  • Nondeterministic cache: you are not sure whether the data you want is in the cache, nor whether it has expired, so you tentatively ask memcached: do you have this data? I don't want expired data. If memcached has it, it hands it over and you are happy; if not, you have to get it from the database or somewhere else. This is the typical use of memcached. The main applications are:

    1. Complex data that is read many times. If your database is sharded, fetching data from multiple databases and combining it is very expensive, so you can assemble the data once and then save it to memcached.

    2. A good alternative to the MySQL query cache: when other parts of the database change, the cached entry stays valid as long as its own data did not change (mind the database update problem, which is mentioned later)

    3. Cache a relationship or list, such as a list of multiple articles under a column

    4. Data that is requested from many pages and is slow to fetch, or data that is updated slowly, such as an article view leaderboard

    5. If the cost of caching exceeds the cost of fetching the data again, do not cache it.

    6. Tag clouds and auto-suggestion (similar to Google Suggest)

    For example: when a user uploads an image, the image is listed on the pages of that user's friends, so cache it.

    Potential problems:

    Memcached consumes a lot of server memory but very little CPU, so Fotolog deploys memcached on its application servers (as we do, it seems). They ran into CPU utilization reaching 90% (how can it be so high? What went wrong?), memory reclamation (this is a big problem), and so on.
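A rough sketch of the nondeterministic cache described above, using the friend-photo example: data is assembled from several shards once, cached with an expiry time, and rebuilt only after it expires. `TTLCache` and `friend_photos` are hypothetical names, dictionaries stand in for the shards, and the 60-second lifetime is an arbitrary assumption.

```python
import time


class TTLCache:
    """In-memory stand-in for memcached with per-key expiry (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl):
        self._data[key] = (self._clock() + ttl, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat it as a miss
            return None
        return value


def friend_photos(cache, shards, user_id, ttl=60):
    """Combine rows from several shards once, then serve them from the cache."""
    key = f"friend_photos:{user_id}"
    photos = cache.get(key)
    if photos is None:
        photos = []
        for shard in shards:  # the expensive multi-database fetch
            photos.extend(shard.get(user_id, []))
        cache.set(key, photos, ttl)
    return photos


# Usage with a fake clock so that expiry can be shown without sleeping.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
shards = [{1: ["a.jpg"]}, {1: ["b.jpg"]}]

fresh = friend_photos(cache, shards, 1)
shards[1][1].append("c.jpg")               # the database changes
cached = friend_photos(cache, shards, 1)   # still the cached list
now[0] = 61.0                              # the entry has expired
reloaded = friend_photos(cache, shards, 1)
```

Note the trade-off this makes visible: until the entry expires, readers see the older list, which is acceptable for this kind of data but is exactly the "database update problem" to keep in mind.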

  • State cache: keep the current state of the application service in memcached. The main applications are:

    1. "Expensive" operations, that is, operations with high overhead

    2. Sessions. Flickr stores sessions in the database; personally I feel that keeping them in memcached is "cheaper". If the memcached server goes down, users simply log in again.

    3. Recording users' online status (we do the same)
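The session case might look something like the sketch below, which assumes sessions live only in the cache and that losing them merely forces a new login. `SessionStore` is a hypothetical class and a plain dictionary stands in for the memcached client.

```python
import secrets


class SessionStore:
    """Sessions kept only in a cache; `cache` is any dict-like stand-in
    for a memcached client (hypothetical)."""

    def __init__(self, cache):
        self._cache = cache

    def login(self, user_id):
        token = secrets.token_hex(16)
        self._cache[f"session:{token}"] = {"user_id": user_id}
        return token

    def current_user(self, token):
        session = self._cache.get(f"session:{token}")
        # A missing session (expired, evicted, or cache restarted)
        # simply means the user has to log in again.
        return None if session is None else session["user_id"]


cache = {}
store = SessionStore(cache)
token = store.login(42)
before = store.current_user(token)

cache.clear()  # simulate the memcached server restarting with an empty cache
after = store.current_user(token)
```

Here `after` is `None`, so the application would send the user back to the login page instead of failing.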

  • Deterministic cache: for certain databases, the entire content is cached in memcached. A dedicated application service ensures that the data you want is always in memcached, and other application services fetch data directly from memcached without going to the database, because everything in the database is also stored in memcached and kept in sync. The main applications are:

    1. Read scaling: all reads are served from memcached, so the database carries no read load

    2. "Never expire" (relatively speaking) data, such as administrative planning data, which is very small.

    3. Frequently called content

    4. User's authentication information

    5. User's profile

    6. User's preferences

    7. The user's current list of media files, such as the user's picture

    8. User login: do not go to the database, only go through memcached (personally I do not think this is a good idea; login information still needs to be persisted, and something like BDB works well for that too)

    Usage:

    1. Multiple dedicated cache pools instead of one large cache server. Multiple cache pools guarantee high availability: if one cache instance dies, requests go to the other cache instances, and only if all cache instances die do requests go to the database (which the database probably cannot withstand ^_^)

    2. All cache pools are maintained by programs. For example, when the database is updated, the program automatically synchronizes the updated content to multiple cache instances

    3. After the servers restart, the caches are started before the website, which means that all caches are available by the time the site comes up

    4. Read requests can be load balanced across multiple cache instances, giving high performance and reliability

    Potential problems:

    1. You need enough memory to store so much data

    2. The database stores data as rows, while memcached stores data as objects, so your logic has to convert rows and columns into cached objects

    3. Multiple cache instances have to be maintained. Fotolog uses Java/Hibernate and wrote its own client to poll them

    4. Managing multiple cache instances adds cost to the application, but that cost is nothing compared with the benefits of multiple caches
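The deterministic cache can be pictured with the following sketch: every write goes to the database and to all cache instances, the caches are filled before the site starts, and reads are spread across the instances without touching the database. This is an assumption-laden toy, not Fotolog's Java/Hibernate code; `DeterministicCache` is a made-up class and dictionaries stand in for MySQL and memcached.

```python
import itertools


class DeterministicCache:
    """Write-through to the database and every cache instance; reads never
    touch the database. Dicts stand in for MySQL and memcached (hypothetical)."""

    def __init__(self, database, cache_instances):
        self._db = database
        self._instances = cache_instances
        self._next = itertools.cycle(range(len(cache_instances)))

    def warm_up(self):
        # Run before the website starts so that every cache is fully populated.
        for instance in self._instances:
            instance.update(self._db)

    def write(self, key, value):
        self._db[key] = value
        for instance in self._instances:  # keep all instances in sync
            instance[key] = value

    def read(self, key):
        # Round-robin across instances to spread the read load.
        return self._instances[next(self._next)].get(key)


db = {"profile:1": {"name": "alice"}}
pools = [{}, {}]
dc = DeterministicCache(db, pools)
dc.warm_up()
dc.write("profile:2", {"name": "bob"})
reads = [dc.read("profile:2") for _ in range(2)]  # one read per instance
```

Both reads return the same profile because each instance received the write, which is what lets reads be load balanced freely.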

  • Active cache: data magically appears in the cache. When the database is updated, the cache is populated immediately, since freshly updated data is more likely to be requested (a new article, for example, naturally gets more readers). It is a variant of the nondeterministic cache (the original says "it's non-deterministic caching with a twist", which I find awkward to translate). The main applications are:

    1. Pre-populating the cache: requests fall back to MySQL as rarely as possible because content is missing from memcached.

    2. "Warming up" the cache: when you need to replicate across data centers

    Steps to use:

    1. Parse the database's binary log, and whenever the database is updated apply the same update to memcached

    2. Use user-defined functions: set up triggers that call a UDF to update memcached; for details see http://tangent.org/586/Memcached_Functions_for_MySQL.html

    3. Use the BLACKHOLE strategy. Facebook reportedly also uses the MySQL BLACKHOLE storage engine to populate the cache: data written to a BLACKHOLE table is written to the cache. Facebook uses this for data invalidation and cross-country replication. The advantage is that replication does not go through MySQL, which means no binary log and less CPU usage (eh? Do they store the binary log through memcached and then copy it to the other databases? Readers with experience are welcome to add to this topic.)
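Whichever of the three mechanisms delivers the changes, the cache-side logic is roughly "apply each database change to memcached as soon as it is seen". The sketch below shows only that part. The event format is invented for illustration (real binary-log parsing or trigger/UDF wiring is not shown), and a dictionary stands in for memcached.

```python
def apply_change_events(cache, events):
    """Push each database change into the cache as soon as it is observed.

    `events` is a hypothetical, already-decoded stream of
    (operation, table, primary_key, row) tuples.
    """
    for operation, table, primary_key, row in events:
        key = f"{table}:{primary_key}"
        if operation == "delete":
            cache.pop(key, None)   # drop entries for deleted rows
        else:                      # "insert" or "update"
            cache[key] = row       # the cache is filled before anyone asks


cache = {}  # dict standing in for memcached
events = [
    ("insert", "articles", 1, {"title": "Hello"}),
    ("update", "articles", 1, {"title": "Hello, world"}),
    ("insert", "articles", 2, {"title": "Draft"}),
    ("delete", "articles", 2, None),
]
apply_change_events(cache, events)
```

After the events are applied, the cache already holds the latest version of article 1 even though no reader has requested it yet.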

  • File system cache: cache files directly in memcached. Wow, that is extreme enough. It lightens the load on NFS, and presumably only the most popular images are cached.
  • Partial page content caching: if some parts of a page are expensive to produce, caching the page's raw data is not as good as caching the rendered content of those parts directly.
  • Application-level replication: update the cache through an API. The details are as follows:

    1. An application writes data to a cache instance, and the API copies the content to the other cache instances (memcached synchronization)

    2. Automatically get the cache pool address and the number of instances

    3. Update multiple cache instances at the same time

    4. If a cache instance is down, skip to the next instance until the update succeeds

    The process is very efficient and has low overhead
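A minimal sketch of this kind of application-level replication, assuming a fixed list of instances rather than automatic discovery of the pool. `Instance`, `CacheDown`, and `replicated_set` are hypothetical names, not part of any memcached client.

```python
class CacheDown(Exception):
    """Raised by an instance that cannot be reached (hypothetical)."""


class Instance:
    """Stand-in for one memcached instance."""

    def __init__(self, name, up=True):
        self.name, self.up, self.data = name, up, {}

    def set(self, key, value):
        if not self.up:
            raise CacheDown(self.name)
        self.data[key] = value


def replicated_set(instances, key, value):
    """Write to every reachable instance; return the names that were updated."""
    updated = []
    for instance in instances:
        try:
            instance.set(key, value)
            updated.append(instance.name)
        except CacheDown:
            continue  # skip the dead instance and move on to the next one
    if not updated:
        raise CacheDown("no cache instance accepted the write")
    return updated


pool = [Instance("a"), Instance("b", up=False), Instance("c")]
updated = replicated_set(pool, "photo:9", "meta")
```

In this run instance `b` is down, so the write lands on `a` and `c` only. A real deployment would also need a way to refill `b` when it comes back.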

  • Other tricks

    1. Use multiple nodes to deal with the "single point of failure"

    2. Use hot standby: when a node goes down, another server automatically takes over its IP, so clients do not have to update their memcached IP addresses

    3. memcached can be accessed via TCP or UDP; persistent connections reduce load, and the system is designed to withstand 1000 connections at a time

    4. Different application services, different cache server farms

    5. Check that your data size matches the cache you allocated, and see http://download.tangent.org/talks/Memcached%20Study.pdf for more information.

    6. Do not think in terms of caching data rows; cache complex objects

    7. Do not run memcached on your database server; both are memory-hungry monsters

    8. Do not worry about TCP latency; local TCP/IP is optimized down to a memory copy

    9. Process data in parallel as much as possible

    10. Not all memcached clients are the same; look closely at the ones for the language you use (PHP and memcached seem to fit well)

    11. Whenever possible, let data expire rather than invalidating it; memcached lets you set an expiration time

    12. Choose good cache keys, for example keys that include a version number which changes on update

    13. Store the version number in memcached
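Tips 12 and 13 can be combined as in the sketch below: the version number lives in the cache, every key embeds it, and bumping it makes all old entries unreachable so they can simply expire. The function names and key layout are assumptions made for illustration, and a dictionary stands in for memcached. With a real client an atomic increment would be preferable to the read-then-write used here.

```python
def versioned_key(cache, namespace, item_id):
    """Build a key that embeds the namespace's current version number,
    which is itself stored in the cache."""
    version = cache.get(f"version:{namespace}")
    if version is None:
        version = 1
        cache[f"version:{namespace}"] = version
    return f"{namespace}:v{version}:{item_id}"


def bump_version(cache, namespace):
    """Invalidate every key in the namespace at once by moving to a new version.
    Old entries are never read again and can simply expire."""
    cache[f"version:{namespace}"] = cache.get(f"version:{namespace}", 1) + 1


cache = {}  # dict standing in for memcached
old_key = versioned_key(cache, "articles", 5)
cache[old_key] = "old body"

bump_version(cache, "articles")  # e.g. after the articles were updated
new_key = versioned_key(cache, "articles", 5)
```

After the bump, lookups use `new_key`, which is not in the cache yet, so the next read falls through to the database and stores a fresh copy.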

I will not translate the author's closing remarks. It seems that MySQL Proxy has a project under way to synchronize MySQL and memcached automatically. For further reference:

Http://www.cnblogs.com/cy163/archive/2009/08/12/1544127.html
