You don't know memcached yet? Then you will lose out at the interview, so hurry over to the official website at http://www.danga.com/memcached/ and also Google how it is used. Hard drives are always too slow, so keep the data in memory. If you have only one server, I recommend APC (used by Facebook), eAccelerator, or XCache (developed by Chinese developers); these products work better on a single machine. If you need a distributed caching scheme, use memcached.
II. How does memcached work alongside MySQL?
We all know that MySQL has a query cache that can cache the results of earlier queries, but in practice it does not help much. The shortcomings of the MySQL query cache are as follows:
IV. Caching Technology of Fotolog
- Nondeterministic cache: you are not sure whether the data you want is cached, and you don't know whether it has expired, so you tentatively ask memcached: do you have this data? I don't want expired data. If memcached has it, it hands it over and you are happy; if not, you have to fetch it from the database or somewhere else. This is the typical use of memcached. The main applications are:
1. Complex data that needs to be read many times. If your database is sharded, fetching data from multiple databases and combining it is very expensive; you can fetch this data once and then store it in memcached.
2. A good alternative to the MySQL query cache: even when other parts of the database change, a cached entry stays valid as long as its own data has not changed (mind the database update problem, which is mentioned later).
3. Caching a relationship or list, such as the list of articles under a column.
4. Data that is called from many pages and is slow to fetch, or data that is slow to update, such as a most-viewed articles leaderboard.
5. If the cost of caching exceeds the cost of fetching the data again, do not cache it.
6. Tag clouds and auto-suggestion (similar to Google Suggest).
For example: when a user uploads an image, the pages of that user's friends will list the image, so cache it.
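The "ask memcached first, fall back to the database" flow described above can be sketched roughly as follows. This is only a minimal illustration: `InMemoryCache` is a hypothetical stand-in for a real memcached client, and `get_or_load`, the key name, and the loader function are made up for the example.

```python
import time


class InMemoryCache:
    """Hypothetical stand-in for a memcached client (get/set with a TTL)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at or None)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[key]  # an expired entry counts as a miss
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._items[key] = (value, expires_at)


def get_or_load(cache, key, loader, ttl=300):
    """Return the cached value, or load it from the slow source and cache it."""
    value = cache.get(key)
    if value is None:  # miss or expired: go to the database
        value = loader()
        cache.set(key, value, ttl)
    return value


def load_friend_photos():
    # Pretend this is an expensive query that combines several databases.
    return ["photo-1.jpg", "photo-2.jpg"]


cache = InMemoryCache()
photos = get_or_load(cache, "friend_photos:42", load_friend_photos)
```

Note that a loader returning `None` would never be cached in this sketch, since `None` is also what a miss looks like.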
Potential problems:
Memcached consumes a lot of server memory but very little CPU, so Fotolog deploys memcached on its application servers (it seems we do the same). They ran into CPU utilization reaching 90% (how can it be that high? what went wrong?), memory reclamation (this is a big problem), and so on.
- State cache: the current state of the application service is stored in memcached. The main applications are:
1. "Expensive" operations, that is, operations with high overhead.
2. Sessions. Flickr stores sessions in the database; personally I feel that keeping them in memcached is "cheaper", and if the memcached server goes down, users simply log in again.
3. Recording users' online status (we do the same).
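To make the session idea a little more concrete, here is a rough sketch of sessions that live only in a cache. Everything in it is an assumption for illustration: the `SessionStore` class, the 30-minute lifetime, the sliding renewal, and the plain dict standing in for memcached.

```python
import secrets
import time


class SessionStore:
    """Sessions kept only in a cache; a lost entry just means 'log in again'."""

    def __init__(self, ttl_seconds=1800, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # stand-in for memcached: session id -> (data, expires_at)

    def create(self, user_id):
        session_id = secrets.token_hex(16)
        self._cache[session_id] = ({"user_id": user_id}, self._clock() + self._ttl)
        return session_id

    def load(self, session_id):
        entry = self._cache.get(session_id)
        if entry is None:
            return None  # unknown or lost session: the user must log in again
        data, expires_at = entry
        if self._clock() >= expires_at:
            del self._cache[session_id]
            return None
        # Sliding expiration: every access renews the session.
        self._cache[session_id] = (data, self._clock() + self._ttl)
        return data

    def flush(self):
        """Simulate the cache server going down and coming back empty."""
        self._cache.clear()
```

The trade-off is the one mentioned above: nothing is persisted, so a cache restart logs everyone out, which is acceptable only if logging in again is cheap for your users.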
- Deterministic cache: the entire content of certain databases is cached in memcached. A dedicated application service ensures that the data you want is in memcached, and other application services fetch data directly from memcached without going to the database, because the database contents are all stored in memcached and kept in sync. The main applications are:
1. Read scaling: all reads are served from memcached, so the database carries no read load.
2. Data that "never expires" (relatively speaking), such as administrative planning data, which changes very rarely.
3. Frequently called content
4. User's authentication information
5. User's profile
6. User's preferences
7. The user's current list of media files, such as the user's picture
8. User login, which does not touch the database and goes only through memcached (personally I don't think this is a good idea; login information still needs to be persisted, and something like BDB would also work well here).
Usage:
1. Multiple dedicated cache pools instead of one large cache server. Multiple cache pools guarantee high availability: if one cache instance goes down, requests go to the other cache instances; if all of the cache instances go down, requests go to the database (and the database will probably go down too ^_^).
2. All cache pools are maintained by programs. For example, when the database is updated, the program automatically synchronizes the updated content to multiple cache instances
3. After the server restarts, the cache is started before the website, which means that when the site is started, all caches are available
4. Read requests can be load balanced across multiple cache instances, which gives high performance and reliability.
Potential problems:
1. You need enough memory to store so much data
2. The database records data in rows, while memcached stores data as objects, so your logic has to convert rows and columns into cached objects.
3. Multiple cache instances have to be maintained. Fotolog uses Java/Hibernate and wrote its own client to cycle through them (round-robin).
4. Managing multiple cache instances increases the cost of the application, but that cost is nothing compared to the benefits of having multiple caches.
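The deterministic setup above (convert rows into objects, sync every cache instance on each database update, spread reads over the instances) might look roughly like this. It is a sketch under assumptions, not Fotolog's actual client: `ReplicatedCache`, `row_to_profile`, the row layout, and the dicts standing in for the database and the cache instances are all hypothetical.

```python
import itertools


def row_to_profile(row):
    """Convert a database row (a tuple) into the object stored in the cache."""
    user_id, name, email = row
    return {"id": user_id, "name": name, "email": email}


class ReplicatedCache:
    """Writes go to every pool; reads rotate across the pools (round-robin)."""

    def __init__(self, pool_count):
        # Each dict stands in for one cache instance.
        self._pools = [dict() for _ in range(pool_count)]
        self._next_pool = itertools.cycle(range(pool_count))

    def write_through(self, database, row):
        profile = row_to_profile(row)
        database[profile["id"]] = row  # 1. update the database
        for pool in self._pools:  # 2. sync every cache instance
            pool[f"profile:{profile['id']}"] = profile

    def read(self, user_id):
        # Reads never touch the database; they only pick the next pool.
        pool = self._pools[next(self._next_pool)]
        return pool.get(f"profile:{user_id}")


database = {}
cache = ReplicatedCache(pool_count=3)
cache.write_through(database, (1, "ana", "ana@example.com"))
profile = cache.read(1)
```

Because every pool holds the same objects, any pool can answer any read, which is what makes the load balancing in usage item 4 possible.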
- Active cache: data magically appears in the cache. When the database is updated, the cache is populated immediately, because updated data is more likely to be requested (a new article, for example, naturally gets more readers). It is a variant of the nondeterministic cache (the original reads "it's non-deterministic caching with a twist", which I find awkward to translate). The main applications are:
1. Pre-populating the cache: keep the number of calls to MySQL caused by missing content as low as possible.
2. "Warming up" the cache, for when you need to replicate across data centers.
Steps to use:
1. Parse the database's binary log and, whenever an update is found, apply the same update to memcached.
2. Use user-defined functions (UDFs): set up triggers that invoke a UDF to update the cache. For details see http://tangent.org/586/Memcached_Functions_for_MySQL.html
3. Use the BLACKHOLE strategy. Facebook reportedly also uses the MySQL BLACKHOLE storage engine to populate the cache: data written to BLACKHOLE is passed on to the cache, and Facebook uses this for data invalidation and cross-country replication. The advantage is that replication does not go through MySQL, which means no binary log and less CPU usage (huh? Do they store the binary log through memcached and then replicate it to other databases? Readers with experience are welcome to add to this topic.)
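The first approach, mirroring database changes into the cache, can be illustrated with a small sketch. It does not parse a real MySQL binary log; the change events below are hypothetical, already-decoded dicts, and the function names and key format are assumptions made for the example.

```python
def apply_change_event(cache, event):
    """Mirror one database change into the cache (a dict stand-in here)."""
    key = f"{event['table']}:{event['pk']}"
    if event["op"] in ("insert", "update"):
        cache[key] = event["row"]  # populate right away, before anyone asks
    elif event["op"] == "delete":
        cache.pop(key, None)  # drop rows that no longer exist
    else:
        raise ValueError(f"unknown operation: {event['op']}")


def replay(cache, events):
    """Apply a stream of change events in order."""
    for event in events:
        apply_change_event(cache, event)
    return cache


events = [
    {"op": "insert", "table": "article", "pk": 1, "row": {"title": "Draft"}},
    {"op": "update", "table": "article", "pk": 1, "row": {"title": "Final"}},
    {"op": "insert", "table": "article", "pk": 2, "row": {"title": "Old news"}},
    {"op": "delete", "table": "article", "pk": 2, "row": None},
]
cache = replay({}, events)
```

Events have to be applied in the order the database committed them; otherwise an older update could overwrite a newer one in the cache.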
- File system cache: files are cached directly in memcached. Wow, that is pretty extreme. It lightens the load on NFS, and presumably only the most popular images are cached.
- Partial page content caching: if some parts of a page are expensive to produce, caching the page's raw data is not as good as caching the rendered content of those parts directly.
- Application-level replication: the cache is updated through an API, with the following details:
1. An application writes data to one cache instance, and the content is copied to the other cache instances (memcached synchronization)
2. Automatically get the cache pool address and the number of instances
3. Update multiple cache instances at the same time
4. If a cache instance is down, skip to the next instance until the update succeeds
The process is very efficient and has low overhead.
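The replication steps above can be sketched as follows. This is a simplified illustration, not the API from the talk: `CacheInstance`, `CacheDown`, and `replicate_set` are hypothetical names, the pool is a hard-coded list rather than an automatically discovered one, and the instances are updated one after another instead of truly at the same time.

```python
class CacheDown(Exception):
    """Raised when a cache instance cannot be reached."""


class CacheInstance:
    """Hypothetical cache node; `up` simulates whether it is reachable."""

    def __init__(self, address, up=True):
        self.address = address
        self.up = up
        self.data = {}

    def set(self, key, value):
        if not self.up:
            raise CacheDown(self.address)
        self.data[key] = value


def replicate_set(instances, key, value):
    """Write to every instance, skipping the ones that are down.

    Returns the addresses that accepted the write; raises if none did.
    """
    updated = []
    for instance in instances:
        try:
            instance.set(key, value)
        except CacheDown:
            continue  # this instance is down: skip to the next one
        updated.append(instance.address)
    if not updated:
        raise CacheDown("no cache instance accepted the write")
    return updated


pool = [
    CacheInstance("10.0.0.1:11211"),
    CacheInstance("10.0.0.2:11211", up=False),
    CacheInstance("10.0.0.3:11211"),
]
accepted = replicate_set(pool, "user:42:profile", {"name": "ana"})
```

An instance that was skipped comes back with stale or missing data, so a real setup would also need some way to refill it.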
- Other tricks:
1. Use multiple nodes to guard against a "single point of failure".
2. Use hot standby: when a node goes down, another server automatically takes over its IP, so clients do not have to update the memcached IP address.
3. memcached can be accessed over TCP or UDP. Persistent connections reduce load, and the system is designed to withstand 1000 connections at a time.
4. Use different cache server farms for different application services.
5. Check that your data size matches the cache you allocated, and see http://download.tangent.org/talks/Memcached%20Study.pdf for more information.
6. Do not think in terms of caching data rows; cache complex objects instead.
7. Do not run memcached on your database server; both are memory-hungry monsters.
8. Do not worry about TCP latency; local TCP/IP is optimized down to an in-memory copy.
9. Process data in parallel as much as possible
10. Not all memcached clients are the same; look closely at the ones for the language you use (PHP and memcached seem to fit well).
11. Whenever possible, let data expire rather than invalidating it explicitly; memcached lets you set an expiration time.
12. Choose good cache keys, for example keys that carry a version number which changes on update.
13. Store the version number in memcached.
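Tricks 12 and 13 fit together: if the version number lives in the cache and is part of every key, bumping the version makes all old entries unreachable without deleting them one by one. A minimal sketch, assuming a plain dict in place of memcached and hypothetical helper names and key layout:

```python
def versioned_key(cache, namespace, item_id):
    """Build a key that embeds the namespace's current version number."""
    version = cache.get(f"version:{namespace}", 1)
    return f"{namespace}:v{version}:{item_id}"


def bump_version(cache, namespace):
    """Make every existing key in the namespace unreachable by moving to a new version."""
    version_key = f"version:{namespace}"
    cache[version_key] = cache.get(version_key, 1) + 1


cache = {}
old_key = versioned_key(cache, "articles", 5)
cache[old_key] = "cached article body"

bump_version(cache, "articles")  # e.g. after the articles table was updated
new_key = versioned_key(cache, "articles", 5)
```

The old entries are never removed explicitly here; with a real cache they would simply age out through their expiration time, in the spirit of trick 11.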
I will not translate the author's closing remarks. It seems MySQL Proxy has a project under way to synchronize MySQL and memcached automatically. For more, see:
Make memcached and MySQL work better.