About half a year ago, guang.com suffered a site outage that was triggered when part of the home page cache expired.
Fault Analysis:
At that time guang.com was running a promotion. Traffic increased suddenly and QPS reached 5000+. When part of the home page cache expired, requests had to query the DB. Because this part of the business logic is very complex, the resulting SQL contained multi-table joins, GROUP BY and so on, and took about 1s to execute, which produced a large number of temporary tables. They did not fit in memory and became on-disk temporary tables, but the disk partition holding the temporary tables had a capacity of only 20G, so the disk quickly filled up and the website could no longer be opened.
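As a rough back-of-envelope check of why a 20G partition can fill so quickly, the sketch below assumes (this is an assumption for illustration, not a measurement from the incident) that every request misses the cache, that each query holds one on-disk temporary table for its full 1s of execution, and that 20G means 20 GiB. The class and method names are hypothetical.

```java
// Back-of-envelope estimate only; TempTableBudget and its methods are hypothetical names.
final class TempTableBudget {

    /** Queries in flight at once = arrival rate x time each query takes (Little's law). */
    static long concurrentQueries(long qps, double querySeconds) {
        return Math.round(qps * querySeconds);
    }

    /** Average on-disk temporary table size (in MiB) at which the partition is full. */
    static double maxTempTableMiB(long partitionGiB, long concurrentQueries) {
        return partitionGiB * 1024.0 / concurrentQueries;
    }

    public static void main(String[] args) {
        long inFlight = concurrentQueries(5000, 1.0);      // QPS and query time from the incident
        double perQueryMiB = maxTempTableMiB(20, inFlight); // 20G temporary table partition
        System.out.printf("%d queries in flight; the disk is full at about %.1f MiB per query%n",
                inFlight, perQueryMiB);
    }
}
```

Under these assumptions about 5000 queries are in flight at once, so an average temporary table of only a few MiB per query is enough to exhaust the partition.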
Summary of the causes:
1. The SQL statements were not optimized enough.
2. The MySQL tmp_table_size setting was too small.
3. The disk partitioning was unreasonable, and so was the tmpdir path configuration.
4. There was a lack of communication between departments: the large-scale promotion was launched without advance notice.
Interim measures:
Because a large number of users were accessing the site at that time, the DB queries kept hanging, so the cache could never be set back. That part of the home page cache stayed in the miss state, every request fell through to the DB again, and the vicious circle turned into an avalanche.
At that time we immediately took the following measures:
1. Increase the MySQL tmp_table_size setting. See the detailed description of tmp_table_size below.
2. Move the MySQL temporary table path (tmpdir) to a larger partition.
3. Simplify the business logic, modify the SQL, and redeploy.
Memory used by temporary tables (tmp_table_size): MySQL may need a temporary table when we perform operations such as JOIN, ORDER BY and GROUP BY. When the temporary table is small (smaller than the size set by the tmp_table_size parameter), MySQL creates it as an in-memory temporary table. Only when the whole temporary table does not fit within tmp_table_size does MySQL create it as a MyISAM table stored on disk. However, when another system parameter, max_heap_table_size, is smaller than tmp_table_size, MySQL uses max_heap_table_size as the maximum size of an in-memory temporary table and ignores the value of tmp_table_size. The tmp_table_size parameter is available starting with MySQL 5.1.2; before that, max_heap_table_size was used.
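The sizing rule above can be restated as a small sketch. It is only an illustration of the rule as described here, not MySQL code; the class and method names are hypothetical and all sizes are in bytes.

```java
// Illustration of the sizing rule only; TempTableLimit is a hypothetical name.
final class TempTableLimit {

    /** The in-memory limit is the smaller of tmp_table_size and max_heap_table_size. */
    static long effectiveLimit(long tmpTableSize, long maxHeapTableSize) {
        return Math.min(tmpTableSize, maxHeapTableSize);
    }

    /** True when a temporary table of the given size no longer fits in memory. */
    static boolean spillsToDisk(long tableBytes, long tmpTableSize, long maxHeapTableSize) {
        return tableBytes > effectiveLimit(tmpTableSize, maxHeapTableSize);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // tmp_table_size raised to 256 MiB while max_heap_table_size stays at 16 MiB.
        System.out.println("effective limit (MiB): " + effectiveLimit(256 * mib, 16 * mib) / mib);
        System.out.println("32 MiB table spills to disk: " + spillsToDisk(32 * mib, 256 * mib, 16 * mib));
    }
}
```

The practical consequence is that raising tmp_table_size alone does not help while max_heap_table_size is smaller; both values have to be raised together.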
Long-term solution: this brings us to the focus of this article, the design and implementation of a cache reload mechanism.
Before discussing the design and implementation of the cache reload mechanism, let's look at the ways a cache can be updated:
1. The cache entry times out and becomes invalid, and the data is queried again. (Passive update)
2. The back end sends a notification: when back-end data changes, it notifies the front end to update the cache. (Active update)
The former suits applications with low real-time requirements but frequent updates; the latter suits applications with high real-time requirements and less frequent updates.
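The two update styles can be contrasted with a minimal in-memory sketch. Everything here is hypothetical: the class UpdateStyles, the injected db function and the injected clock stand in for the real cache, the real DB query and the system time.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical in-memory cache used only to contrast passive and active updates.
final class UpdateStyles {

    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Function<String, String> db; // stands in for the real DB query
    private final LongSupplier clock;          // injectable so the behaviour can be tested
    private final long ttlMillis;

    UpdateStyles(Function<String, String> db, LongSupplier clock, long ttlMillis) {
        this.db = db;
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    /** Passive update: the value is queried again only after the entry has timed out. */
    String get(String key) {
        long now = clock.getAsLong();
        Entry entry = cache.get(key);
        if (entry == null || entry.expiresAtMillis <= now) {
            entry = new Entry(db.apply(key), now + ttlMillis);
            cache.put(key, entry);
        }
        return entry.value;
    }

    /** Active update: the back end pushes the new value as soon as the data changes. */
    void onBackendChange(String key, String newValue) {
        cache.put(key, new Entry(newValue, clock.getAsLong() + ttlMillis));
    }

    public static void main(String[] args) {
        UpdateStyles cache =
                new UpdateStyles(key -> "row for " + key, System::currentTimeMillis, 60_000);
        System.out.println(cache.get("home"));      // passive: loaded on the first miss
        cache.onBackendChange("home", "fresh row"); // active: pushed by the back end
        System.out.println(cache.get("home"));
    }
}
```

Note that in the passive style every request arriving at the moment of expiry reaches the DB, which is exactly the weak point under high concurrency.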
Cache Reload Mechanism Design:
Based on our business needs we chose the passive update mode. Its disadvantage is that if the cache happens to expire during a period of high concurrency, the avalanche described above will occur.
So my thinking was: for such a heavily used cache entry, either set no timeout at all or set a timeout that is large enough, and then periodically reload/refresh the cached data from the DB at the interval the business requires. The cache entry then never expires, and there is no avalanche.
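A minimal sketch of this idea, assuming a single hot value and a standard ScheduledExecutorService as the timer. ReloadingValue is a hypothetical name and the loader stands in for the DB query; this is not the guang.com implementation.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical holder for one hot cache value that never expires on its own.
final class ReloadingValue<T> {

    private final AtomicReference<T> current;
    private final Supplier<T> loader; // stands in for the DB query

    ReloadingValue(Supplier<T> loader) {
        this.loader = loader;
        this.current = new AtomicReference<>(loader.get()); // initial load
    }

    /** Readers never trigger a DB query; they always see the last loaded value. */
    T get() {
        return current.get();
    }

    /** Replaces the value in place; if the reload fails, the old value keeps being served. */
    void refresh() {
        try {
            current.set(loader.get());
        } catch (RuntimeException e) {
            // Keep the stale value rather than leaving the cache empty.
        }
    }

    /** Runs refresh() at a fixed interval on the given scheduler. */
    void scheduleOn(ScheduledExecutorService scheduler, long interval, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(this::refresh, interval, interval, unit);
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        ReloadingValue<Long> value = new ReloadingValue<>(System::currentTimeMillis);
        value.scheduleOn(scheduler, 1, TimeUnit.MINUTES);
        System.out.println(value.get());
        scheduler.shutdownNow();
    }
}
```

Because only the scheduled task ever calls the loader, the load on the DB depends on the refresh interval and not on the number of concurrent readers.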
The part of the guang.com architecture that deals with cache reload:
There are 2 main steps:
1. The wrapper describing each cache entry that needs reloading is saved to a Redis hash.
2. A CacheReloadJob is deployed on the daemon server. Every minute it fetches from Redis the hash of cache entries that need reloading and checks whether the refresh time of each entry has come. If it has, the job calls the relevant method through reflection to reload the data and resets the cache entry.
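The two steps above can be sketched without any external services. In this simplified illustration plain maps stand in for the Redis hash and for memcached, only no-argument methods are supported, and all names (ReloadRegistry, Invocation, reloadDue) are hypothetical.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins: plain maps replace the Redis hash and memcached.
final class ReloadRegistry {

    /** What is stored per cache key: which method to call and how often. */
    static final class Invocation {
        final Object target;
        final String methodName;
        final long intervalMillis;
        volatile long lastWriteMillis;

        Invocation(Object target, String methodName, long intervalMillis, long lastWriteMillis) {
            this.target = target;
            this.methodName = methodName;
            this.intervalMillis = intervalMillis;
            this.lastWriteMillis = lastWriteMillis;
        }
    }

    private final Map<String, Invocation> registry = new ConcurrentHashMap<>(); // the "Redis hash"
    private final Map<String, Object> cache = new ConcurrentHashMap<>();        // the "memcached"

    /** Step 1: remember how to rebuild the value stored under this cache key. */
    void register(String cacheKey, Invocation invocation) {
        registry.put(cacheKey, invocation);
    }

    Object cached(String cacheKey) {
        return cache.get(cacheKey);
    }

    /** Step 2: one pass of the job; reloads every entry whose interval has elapsed. */
    int reloadDue(long nowMillis) {
        int reloaded = 0;
        for (Map.Entry<String, Invocation> e : registry.entrySet()) {
            Invocation inv = e.getValue();
            if (inv.lastWriteMillis + inv.intervalMillis <= nowMillis) {
                try {
                    // Reflection call; only public no-argument methods, for brevity.
                    Method method = inv.target.getClass().getMethod(inv.methodName);
                    cache.put(e.getKey(), method.invoke(inv.target));
                } catch (ReflectiveOperationException ex) {
                    throw new IllegalStateException("reload failed for " + e.getKey(), ex);
                }
                inv.lastWriteMillis = nowMillis;
                reloaded++;
            }
        }
        return reloaded;
    }

    public static void main(String[] args) {
        ReloadRegistry job = new ReloadRegistry();
        // An AtomicInteger plays the role of the service whose method is re-invoked.
        Object service = new java.util.concurrent.atomic.AtomicInteger();
        job.register("home", new Invocation(service, "incrementAndGet", 60_000, 0));
        System.out.println(job.reloadDue(System.currentTimeMillis()) + " entries reloaded");
    }
}
```

Storing the invocation details next to the cache key is what lets a separate process rebuild a value without knowing anything about the business code that first produced it.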
Cache Reload Mechanism implementation:
Where necessary, set the memcached entry with the reload mechanism:
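import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

import javax.annotation.Resource;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.HashOperations;

// MyXMemcachedClient, MethodInvocationWrapper, RedisKeyEnum, NamedThreadFactory,
// SpringContextHolder and ReflectionUtils are project classes; their imports are omitted.

/**
 * CacheReloadJob: reloads the registered cache entries whose refresh time has come.
 */
public class CacheReloadJob {

    private static Logger logger = LoggerFactory.getLogger(CacheReloadJob.class);

    @Autowired
    MyXMemcachedClient myXMemcachedClient;

    @Resource(name = "objectHashOperations")
    private HashOperations<String, String, MethodInvocationWrapper> objectHashOperations;

    public void reloadCache() {
        logger.info("Try to reload cache");
        // All wrappers registered for reload, keyed by memcached key.
        Map<String, MethodInvocationWrapper> map =
                objectHashOperations.entries(RedisKeyEnum.CACHE_RELOAD.getKey());
        ThreadFactory tf = new NamedThreadFactory("CACHE_RELOAD_THREADPOOL");
        ExecutorService threadPool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors(), tf);
        for (final MethodInvocationWrapper wrapper : map.values()) {
            // Refresh once the last write time plus the refresh interval has been reached.
            if (wrapper.getWriteTime() + wrapper.getDuration() <= System.currentTimeMillis()) {
                threadPool.execute(new Runnable() {
                    @Override
                    public void run() {
                        refreshCache(wrapper);
                    }
                });
            }
        }
        // Queued refreshes still run; the pool's threads are released afterwards.
        threadPool.shutdown();
        logger.info("Completed with reloaded cache");
    }

    private void refreshCache(MethodInvocationWrapper wrapper) {
        // Re-invoke the original data-loading method through reflection.
        Object object = ReflectionUtils.invokeMethod(
                SpringContextHolder.getBean(wrapper.getObjectName()),
                wrapper.getMethodName(), wrapper.getParameterTypes(), wrapper.getArgs());
        // Reset the memcached entry, then record the new write time in Redis.
        myXMemcachedClient.set(wrapper.getKey(), wrapper.getExpiredTime(), object);
        wrapper.setWriteTime(System.currentTimeMillis());
        objectHashOperations.put(RedisKeyEnum.CACHE_RELOAD.getKey(), wrapper.getKey(), wrapper);
    }
}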
Redis storage structure:
redis> hset cache:reload:memcached <memcache_key> <MethodInvocationWrapper>
OK
redis> hgetall cache:reload:memcached
Postscript:
To make this more user-friendly, a management tool for reloadable cache entries (delete, modify the refresh interval, etc.) can later be added to the site management system.
Reprinted from: http://kenny7.com/2012/10/cache-reload-mechanism.html