With the rapid development of website technology, caching has become a key technology for large-scale websites. The design of the cache directly affects a site's speed and the number of servers that must be purchased, and even affects the user experience.
Depending on where the data is stored, website caches can be divided into client-side caches and server-side caches.
Client-Side Caching
Client-side caching can be further categorized as browser caches and gateway (or proxy server) caches.
A gateway or proxy server cache is a web cache held on the gateway server: when multiple users request the same page, the page is served to them directly from the gateway server.
The browser cache is the cache closest to the user. If it is enabled and the user revisits the same page, the page is no longer downloaded from the server; instead it is read from the local cache directory and then displayed in the browser.
Browser caching can be controlled through META tags, which can specify either a number of seconds or an absolute time, as follows:
<meta http-equiv="Expires" content="3600">
<meta http-equiv="Expires" content="Wed, 1997 08:21:57 GMT">
The corresponding HTTP response headers look like this:
HTTP/1.1 200 OK
Date: Fri, Oct 1998 13:19:41 GMT
Server: Apache/1.3.3 (Unix)
Cache-Control: max-age=3600, must-revalidate
Expires: Fri, Oct 1998 14:19:41 GMT
Last-Modified: Mon, June 1998 02:28:12 GMT
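To make the freshness rule concrete, here is a minimal Python sketch of how a client could decide whether a cached response is still fresh from its Date and Cache-Control headers. The helper names (`max_age_seconds`, `is_fresh`) are made up for illustration; this is not a full HTTP cache implementation.

```python
from email.utils import parsedate_to_datetime
from datetime import datetime
from typing import Optional

def max_age_seconds(cache_control: str) -> Optional[int]:
    """Extract the max-age value from a Cache-Control header, if present."""
    for directive in cache_control.split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            return int(directive.split("=", 1)[1])
    return None

def is_fresh(response_date: str, cache_control: str, now: datetime) -> bool:
    """A cached response is fresh while (now - Date) is within max-age."""
    age = (now - parsedate_to_datetime(response_date)).total_seconds()
    limit = max_age_seconds(cache_control)
    return limit is not None and age < limit
```

With `max-age=3600`, a response fetched at 13:19:41 is still fresh at 13:49:41 but stale an hour and a half later, which matches the Expires header above.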
However, to ensure that users always see the latest content, websites today rarely rely on the browser cache, preferring the more flexible server-side cache instead.
Server-Side Caching
Server-side caching can be divided into page caching, data caching, and database caching.
1. Page cache
With page caching, dynamic pages are rendered into static pages on the server side. When a user requests the same page again, the static page is sent directly to the client; no program logic or database access is needed, which greatly reduces the load on the server.
Many early sites implemented this with a publishing system: when content was published in the back office, the data and page templates were merged into static pages and stored on the hard drive. The drawbacks are obvious: the back-office program is complex, and the cache can only be invalidated manually, which is a nightmare for frequently updated sites that would otherwise be constantly deleting and rebuilding cached pages. Of course, some frameworks update these caches automatically, such as PHP's Smarty template engine, which lets you define when a cache expires and refreshes it on its own. This works well for information-publishing sites.
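The publish-and-expire flow above can be sketched in a few lines of Python. This is a toy model, not how Smarty or any real publishing system is implemented: `render` stands in for a real template engine, and the cache directory and lifetime are made-up values.

```python
import os
import time
import tempfile

CACHE_DIR = tempfile.mkdtemp()   # stands in for the site's static-page directory
CACHE_LIFETIME = 3600            # seconds before a cached page is rebuilt

def render(template: str, data: dict) -> str:
    """Stand-in for a real template engine."""
    return template.format(**data)

def cached_page(name: str, template: str, data: dict) -> str:
    path = os.path.join(CACHE_DIR, name + ".html")
    # Serve the static copy while it is younger than CACHE_LIFETIME.
    if os.path.exists(path) and time.time() - os.path.getmtime(path) < CACHE_LIFETIME:
        with open(path) as f:
            return f.read()
    # Otherwise regenerate the static page (the "publish" step) and store it.
    html = render(template, data)
    with open(path, "w") as f:
        f.write(html)
    return html
```

Note how the second request returns the stored page even if the underlying data has changed; that staleness is exactly the manual-invalidation problem described above.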
In addition to caching entire pages, there is a technique called web fragment caching, which caches parts of a page rather than all of it. A representative example is ESI (Edge Side Includes).
2. Data cache
With the rise of Web 2.0, content is no longer published solely by administrators; every user publishes information, and a user who posts something naturally wants to see it immediately rather than wait for the cache to refresh. This is where data caching technologies come in.
The best-known data caching frameworks are Ehcache and memcached.
Ehcache has many modules (including a page cache module), but its core is the data cache. For example, when Ehcache is integrated with Hibernate, the collection of objects returned by a query can be kept in memory; the next time the same query runs, the collection is returned directly from memory with no database access. The cache refresh mode is also configurable, with several strategies: read-only, nonstrict-read-write, and read-write. read-only means the cache is never refreshed (only a restart refreshes it); nonstrict-read-write means refreshes are not immediate, but a timeout can be set after which the cache refreshes; read-write means the cache is refreshed whenever the data changes. Which to configure depends on the specific business.
memcached works on the same general principle as Ehcache: data is held in memory as key-value pairs; for example, the MD5 hash of a query can serve as the key and the query result as the value. Compared with Ehcache, memcached is a standalone tool rather than a framework, which makes it more flexible, but of course you have to write the corresponding client code to use it.
(Figure omitted: a commonly circulated diagram illustrating memcached's position in the overall system.)
In recent years NoSQL technology has emerged; although NoSQL systems are now classified as databases, they are in essence a fusion of caching technology and database technology.
Current caching approaches fall into two modes:

Memory cache: cached data is stored in the server's memory. Advantage: fast. Disadvantage: memory is a limited resource.
File cache: cached data is stored on the server's hard disk. Advantage: large capacity. Disadvantage: slow, especially when the number of cached items is huge.
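The trade-off between the two modes can be sketched with two tiny cache classes. These are illustrative stand-ins, not any real framework's API; the file cache pickles each value to its own file in a temporary directory.

```python
import os
import pickle
import tempfile

class MemoryCache:
    """Fast but bounded by RAM: values live in a dict."""
    def __init__(self):
        self.data = {}
    def set(self, key, value):
        self.data[key] = value
    def get(self, key):
        return self.data.get(key)

class FileCache:
    """Larger capacity but slower: each value is pickled to a file on disk."""
    def __init__(self, directory=None):
        self.directory = directory or tempfile.mkdtemp()
    def _path(self, key):
        return os.path.join(self.directory, key + ".cache")
    def set(self, key, value):
        with open(self._path(key), "wb") as f:
            pickle.dump(value, f)
    def get(self, key):
        path = self._path(key)
        if not os.path.exists(path):
            return None
        with open(path, "rb") as f:
            return pickle.load(f)
```

Every `FileCache.get` pays for a file open, read, and unpickle, which is why disk caching slows down as the number of cached items grows.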
3. Database cache
Database caching is typically provided by the database itself. In Oracle, for example, tables can be cached to speed up access to frequently used data.
Summary
Which caches to use, and at which layers, is determined by the specific business of the site itself. One guiding principle of caching is to move the data closer to the user. Caching is a deep art, and I have only scratched the surface.
My recent project has used all three of these caches. After reading through their respective official sites, I found they really are quite different! So today I have summarized the advantages and disadvantages of each, for reference only!
Ehcache
Ehcache is widely used in Java projects. It is an open-source caching solution designed to reduce the high cost and high latency of fetching data from an RDBMS. Because Ehcache is robust (it is developed in Java), properly licensed (Apache 2.0), and full-featured (described below), it is used in all kinds of nodes of large, complex distributed web applications.
What are its features?
1. Fast enough
Ehcache has been around for a long time; after years of effort and countless performance tests, it has ended up designed for large, high-concurrency systems.
2. Simple enough
The interface it exposes to developers is simple and straightforward; it takes only a few minutes to go from downloading Ehcache to using it. In fact, many developers do not even know they are using Ehcache, because it is widely embedded in other open-source projects, for example Hibernate.
3. Compact Enough
For this feature the project uses the rather cute name "small foot print". Ehcache releases are generally under 2 MB; version 2.2.3 is 668 KB.
4. Lightweight enough
The core library depends on just one package, SLF4J, and nothing else!
5. Good extensibility
Ehcache provides memory and disk stores for large data sets; recent versions allow multiple cache manager instances; it stores objects with great flexibility; it offers LRU, LFU, and FIFO eviction algorithms; its basic properties support hot configuration; and multiple plugins are supported.
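Of the eviction algorithms just listed, LRU (the default) is the easiest to see in a sketch. The class below is a minimal illustration of the LRU idea using Python's OrderedDict, not Ehcache's actual implementation.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal sketch of LRU eviction, one of the policies Ehcache offers."""
    def __init__(self, max_elements: int):
        self.max_elements = max_elements
        self.items = OrderedDict()

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.max_elements:
            self.items.popitem(last=False)   # evict the least recently used

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)          # mark as recently used
        return self.items[key]
```

LFU would instead evict the least frequently used element, and FIFO the oldest inserted one; only the bookkeeping differs.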
6. Listener
The cache manager listener (CacheManagerListener) and cache event listener (CacheEventListener) are very useful for gathering statistics or broadcasting for data consistency.
How to use?
Simplicity is a major feature of Ehcache, so naturally it is just as easy to use!
Here is a basic usage example:
Cache cache = new Cache("test", 2, false, false, 120, 120);
cacheManager.addCache(cache);
The code refers to an ehcache.xml file; let's go over some of the properties in that file.
- name: the cache name.
- maxElementsInMemory: the maximum number of elements held in memory.
- eternal: whether elements are valid forever; if set to true, the timeout settings are ignored.
- timeToIdleSeconds: how long an element may sit idle before it expires, in seconds. An optional property, used only when eternal=false; the default is 0, meaning the idle time is unlimited.
- timeToLiveSeconds: how long an element may live before it expires, i.e. the maximum time between creation and expiry. Used only when eternal=false; the default is 0, meaning elements live indefinitely.
- overflowToDisk: when the number of in-memory elements reaches maxElementsInMemory, Ehcache writes elements out to disk.
- diskSpoolBufferSizeMB: the buffer size of the DiskStore (disk cache). The default is 30 MB. Each cache should have its own buffer.
- maxElementsOnDisk: the maximum number of elements cached on disk.
- diskPersistent: whether the disk store persists between restarts of the virtual machine. The default is false.
- diskExpiryThreadIntervalSeconds: how often the disk expiry thread runs; 120 seconds by default.
- memoryStoreEvictionPolicy: when the maxElementsInMemory limit is reached, Ehcache cleans up memory according to this policy. The default is LRU; FIFO and LFU are also available.
- clearOnFlush: whether the in-memory store is cleared when flush() is called.
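Putting the properties above together, an ehcache.xml cache entry might look like the following fragment. The cache name and all the values here are made-up illustrations, not recommended settings.

```xml
<ehcache>
  <cache name="userCache"
         maxElementsInMemory="10000"
         eternal="false"
         timeToIdleSeconds="300"
         timeToLiveSeconds="600"
         overflowToDisk="true"
         diskSpoolBufferSizeMB="30"
         maxElementsOnDisk="100000"
         diskPersistent="false"
         diskExpiryThreadIntervalSeconds="120"
         memoryStoreEvictionPolicy="LRU"/>
</ehcache>
```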
Memcache
Memcache is a high-performance distributed object caching system, originally designed to ease the latency of loading data from the database for dynamic web sites. You can think of it as one large in-memory hash table: a key-value cache. It was developed by Danga Interactive for LiveJournal and released as open source under the BSD license.
1. Dependencies
Memcache is written in C and depends on recent versions of GCC and libevent. GCC is its compiler, and at the same time its socket I/O is built on libevent, so make sure your system has both before installing memcache.
2. Multithreading support
Memcache supports multiple CPUs. The installation tree contains a file named threads.txt that specifically states: "by default, memcached is compiled as a single-threaded application." Single-threaded compilation is the default; if you need multithreading, configure with ./configure --enable-threads in order to support multicore systems, and your system must support a multithreaded working mode. The number of worker threads defaults to 4; if the thread count exceeds the number of CPUs, the probability of operation deadlocks rises, so choose according to your business model.
3. High Performance
Socket communication is handled through libevent; in theory the performance bottleneck falls on the network card.
Simple installation:
1. Download memcached and libevent separately and put them in the /tmp directory:
# cd /tmp
# wget http://www.danga.com/memcached/dist/memcached-1.2.0.tar.gz
# wget http://www.monkey.org/~provos/libevent-1.2.tar.gz
2. Install libevent first:
# tar zxvf libevent-1.2.tar.gz
# cd libevent-1.2
# ./configure --prefix=/usr
# make (if you get a message that GCC is not installed, install GCC first)
# make install
3. Test whether libevent was installed successfully:
# ls -al /usr/lib | grep libevent
lrwxrwxrwx 1 root root 21 Nov 17:38 libevent-1.2.so.1 -> libevent-1.2.so.1.0.3
-rwxr-xr-x 1 root root 263546 Nov 17:38 libevent-1.2.so.1.0.3
-rw-r--r-- 1 root root 454156 Nov 17:38 libevent.a
-rwxr-xr-x 1 root root 811 Nov 17:38 libevent.la
lrwxrwxrwx 1 root root 21 Nov 17:38 libevent.so -> libevent-1.2.so.1.0.3
Good, all installed.
4. Install memcached, specifying the libevent installation location during configuration:
# cd /tmp
# tar zxvf memcached-1.2.0.tar.gz
# cd memcached-1.2.0
# ./configure --with-libevent=/usr
# make
# make install
If errors occur along the way, examine them carefully and, following the error messages, adjust the configuration or add the appropriate libraries or paths.
When the installation completes, the memcached binary will be placed at /usr/local/bin/memcached.
5. Test whether memcached was installed successfully:
# ls -al /usr/local/bin/mem*
-rwxr-xr-x 1 root root 137986 Nov 17:39 /usr/local/bin/memcached
-rwxr-xr-x 1 root root 140179 Nov 17:39 /usr/local/bin/memcached-debug
Starting the Memcache service
1. Start the server side of the memcache:
# /usr/local/bin/memcached -d -m 8096 -u root -l 192.168.77.105 -p 12000 -c 256 -P /tmp/memcached.pid
The -d option starts a daemon.
-m is the amount of memory allocated to memcache, in megabytes; here it is 8096 MB.
-u is the user running memcache; here it is root.
-l is the server IP address to listen on; if the machine has multiple addresses, specify one. Here it is 192.168.77.105.
-p sets the port memcache listens on; here it is 12000. Ports above 1024 are preferable.
The -c option is the maximum number of concurrent connections; the default is 1024, and here it is 256. Set it according to your server's load.
-P sets the file in which to save the memcached PID; here it is /tmp/memcached.pid.
2. To stop the memcache process, run:
# cat /tmp/memcached.pid (or ps aux | grep memcached to find the process ID)
# kill <process ID>
You can also start multiple daemons, as long as the ports do not clash.
Connecting to Memcache
telnet <IP> <port>
Note that before connecting, you need to add a firewall rule for memcache on the server:
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 12000 -j ACCEPT
Then reload the firewall rules:
# service iptables restart
OK, now we should be able to connect to the memcache.
Type stats on the client to view memcache status information:
pid: process ID of the memcached server
uptime: number of seconds the server has been running
time: the server's current UNIX timestamp
version: memcache version
pointer_size: pointer size on the current operating system (generally 32 bits on a 32-bit system)
rusage_user: cumulative user time for the process
rusage_system: cumulative system time for the process
curr_items: number of items currently stored by the server
total_items: total number of items stored since the server started
bytes: number of bytes occupied by the currently stored items
curr_connections: number of connections currently open
total_connections: number of connections opened since the server started
connection_structures: number of connection structures allocated by the server
cmd_get: total number of get (retrieval) requests
cmd_set: total number of set (save) requests
get_hits: total number of hits
get_misses: total number of misses
evictions: number of items removed to free memory (when the memory allocated to memcache fills up, old items must be removed to make room for new ones)
bytes_read: total bytes read (bytes received in requests)
bytes_written: total bytes sent (bytes in results)
limit_maxbytes: amount of memory allocated to memcache, in bytes
threads: current number of threads
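A common use of these stats is computing the cache hit ratio from get_hits and get_misses. Here is a small Python sketch that parses the "STAT name value" lines memcached returns and derives the ratio; `parse_stats` and `hit_ratio` are made-up helper names.

```python
def parse_stats(raw: str) -> dict:
    """Parse the 'STAT name value' lines returned by memcached's stats command."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats

def hit_ratio(stats: dict) -> float:
    """Fraction of get requests served from the cache."""
    hits = int(stats["get_hits"])
    misses = int(stats["get_misses"])
    total = hits + misses
    return hits / total if total else 0.0
```

A persistently low hit ratio, or a rising evictions count, is the usual signal that the cache is too small for the workload.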
Redis
Redis was written after memcache, and the two are often compared. Both are key-value stores, but Redis has rich data types, so for now I would call it a cache-and-data-flow hub, like a modern logistics center: order, package, store, classify, distribute, deliver. I do not know how the performance of the currently popular LAMP (PHP) stack compares with Redis+MySQL or Redis+MongoDB (people in my group say MongoDB sharding is unstable).
Let's talk about the characteristics of Redis.
1. Support Persistence
Redis supports two forms of local persistence: RDB and AOF. RDB persistence triggers are configured in the redis.conf file; AOF appends a record to the persistence file for every write (what it saves is the command that generated the record). Unless your use of Redis requires it, do not enable AOF: when the data set is very large, recovery on restart is a huge undertaking!
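The AOF idea of "save the command, replay it on restart" can be sketched in a few lines. This is a conceptual toy, not Redis's actual format: the log is an in-memory list standing in for the append-only file.

```python
class AOFStore:
    """Append-only persistence sketch: log each write, replay the log to recover."""
    def __init__(self):
        self.data = {}
        self.log = []            # stands in for the append-only file on disk

    def set(self, key, value):
        self.log.append(("SET", key, value))
        self.data[key] = value

    def delete(self, key):
        self.log.append(("DEL", key, None))
        self.data.pop(key, None)

def replay(log) -> dict:
    """Rebuild the state after a restart by re-running every logged command."""
    data = {}
    for cmd, key, value in log:
        if cmd == "SET":
            data[key] = value
        elif cmd == "DEL":
            data.pop(key, None)
    return data
```

Replay cost grows with the length of the log, which is exactly why restart recovery becomes slow when the data set (and hence the AOF) is very large.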
2. Rich data types
Redis supports multiple data types: strings, lists, sets, sorted sets, and hashes. Sina Weibo uses Redis as its NoSQL store mainly because of these types: features such as sorting by time, ranking by function, "my weibo", and "posts sent to me" are built on lists and sorted sets and their powerful operations.
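To show why sorted sets suit time-ordered feeds and rankings, here is a tiny stand-in for ZADD/ZRANGE. This is an illustration of the semantics only, not Redis's skip-list implementation, and the member names are made up.

```python
class SortedSet:
    """Tiny sketch of a Redis sorted set: members ranked by score."""
    def __init__(self):
        self.scores = {}                 # member -> score

    def zadd(self, member, score):
        self.scores[member] = score      # re-adding a member updates its score

    def zrange(self, start, stop):
        """Members in ascending score order, like ZRANGE start stop (inclusive)."""
        ordered = sorted(self.scores, key=lambda m: self.scores[m])
        return ordered[start:stop + 1]
```

Using a post's timestamp as the score gives a timeline; using a counter as the score gives a leaderboard, with no extra sorting code on the application side.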
3. High Performance
In this respect it is much like memcache. Operating at memory speed is naturally more efficient than operating at disk speed, avoiding expensive head seeks, data reads, and paging! That is why NoSQL emerged: it is meant to be high performance. It is a derivative of the RDBMS world; the RDBMS also has cache structures, but those stay at the application level and are not something we can manipulate directly.
4. Replication
Redis provides a master-slave replication scheme whose implementation closely resembles MySQL's incremental replication. The replication is a bit like AOF: new write commands on the master are sent to the slaves as a script of commands, and each slave replays the script to reproduce the records. The process is very fast, depending on the network; master and slaves usually sit on the same LAN, so Redis master-slave replication is close to real-time. At the same time it supports one master with many slaves and dynamically adding slaves, with no limit on their number. For building the master-slave topology, I think a fan-out layout is better than a chain (master-slave-slave-slave-...): if the first slave in a chain goes down and restarts, it first receives the data-recovery script from the master, and this blocks; if the master holds terabytes of data, the recovery takes a while, and during that time the other slaves cannot synchronize with the master.
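The command-forwarding idea behind this replication can be sketched as follows. This is a deliberately simplified model for illustration (synchronous, in-process, no failure handling), nothing like Redis's real replication protocol.

```python
class Node:
    """A node that applies writes locally and forwards them to its replicas."""
    def __init__(self):
        self.data = {}
        self.replicas = []

    def attach(self, replica):
        self.replicas.append(replica)    # slaves can be added dynamically

    def set(self, key, value):
        self.data[key] = value
        for replica in self.replicas:    # forward the write command downstream
            replica.set(key, value)
```

In a chain, each write hops node by node, and a restarting intermediate node stalls everything downstream of it; with fan-out from the master, each slave receives the stream independently.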
5. Rapid updates
As far as I can tell, since I started using Redis it has put out four major releases, with minor versions too many to count. The Redis author is a very engaged person: whether you email a question or post on the forum, he answers promptly and patiently, and maintenance activity is high. With someone maintaining it, we can use it with peace of mind. At present the main development direction is Redis clustering.
Installation of Redis
Installing Redis is actually quite simple, three steps in all: download the tar package, unpack it, and install.
But recently, installing 2.6.7 on 32-bit CentOS 5.5, I ran into a problem: running make in the Redis folder failed with the error undefined reference to '__sync_add_and_fetch_4'.
After a lot of searching online, I finally found a solution at https://github.com/antirez/redis/issues/736: write CFLAGS=-march=i686 at the head of src/Makefile!
Remember to delete the files from the failed build, re-unpack a fresh copy, modify the Makefile, and then run make again. The error will be gone.
Redis's properties and basic type operations were explained in detail in my earlier Redis primer, so I will not repeat them here (essentially, I am being lazy, haha!).
Finally, putting memcache and Redis side by side inevitably invites comparison: which is faster, which is easier to use? The group has argued about this for ages, so let me share what I have seen.
After someone posted benchmarks claiming memcache performs much better than Redis, the Redis author antirez published a blog post on how to do stress testing of Redis and memcache properly. He remarked that many open-source stress-test scripts deserve to be thrown in the toilet because they are amateurish, and he explained that Redis vs memcache is definitely an apples-to-apples comparison. Clearly the contest between the two involves a bit of nit-picking; the author ran the test three times in the same environment and took the best values, with the results shown in his charts.
It should be noted that these tests exercised single-core processing. memcache supports multi-core, multithreaded operation (off by default), so the comparison is meaningful only with the defaults; with multithreading enabled, memcache is faster than Redis. So why does Redis not support multithreaded, multi-core processing? The author published his own view: first, multithreading makes bugs harder to fix and software harder to extend, and it brings data-consistency problems, whereas all Redis operations are atomic; the author used the word "nightmare"! Of course, forgoing multithreading has its drawbacks, performance above all, so from version 2.2 the author has focused on developing Redis clustering to offset that shortcoming: improving horizontally rather than vertically, plainly speaking.
Website Caching Technology Summary (Ehcache, Memcache, Redis Comparison)