Author: Chen Yaahua. For reprints or exchanges, please contact [email protected]
This article introduces the theory of caching in large-scale distributed systems, the common caching components, and their application scenarios.
1 Caching Overview
2 Cache Classification
Caches fall into four main categories: CDN cache, reverse proxy cache, local application cache, and distributed cache.
2.1 CDN Cache
Basic Introduction
The basic principle of a CDN (Content Delivery Network) is to deploy many cache servers in the regions or networks where user access is relatively concentrated. When a user visits a website, global load-balancing technology directs the request to the nearest working cache server, and that cache server responds to the user directly.
Application Scenarios
Mainly caches static resources, such as video.
Application Diagram
Advantages
2.2 Reverse Proxy Cache
Basic Introduction
The reverse proxy sits in the application's server room and handles all requests to the Web server. If the page a user requests is cached on the proxy server, the proxy sends the cached content directly to the user. If it is not cached, the proxy sends a request to the Web server, retrieves the data, caches it locally, and then sends it to the user. Fewer requests reach the Web server, which reduces its load.
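The hit/miss flow described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name ReadThroughProxyCache is hypothetical, the origin function stands in for the real Web server, and a real reverse proxy would also handle HTTP headers, expiry, and invalidation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch of reverse-proxy caching: serve from cache, else fetch and store. */
final class ReadThroughProxyCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> origin; // assumption: stands in for the Web server

    ReadThroughProxyCache(Function<String, String> origin) {
        this.origin = origin;
    }

    /** Returns the cached body for the path; on a miss, asks the origin and caches the result. */
    String handle(String path) {
        // computeIfAbsent calls the origin only when the path is not cached yet
        return cache.computeIfAbsent(path, origin);
    }
}
```

With this shape, repeated requests for the same path reach the origin only once, which is the load reduction the paragraph describes.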
Application Scenarios
Generally caches only small static file resources, such as CSS, JS, and images.
Application Diagram
Open Source Implementation
2.3 Local Application Cache
Basic Introduction
A local cache is a cache component inside the application itself. Its biggest advantage is that the application and the cache run in the same process, so cache requests are very fast and incur no network overhead. A local cache is a good fit for a single application that needs no cluster support, or for a cluster whose nodes do not need to notify one another of changes. Its disadvantages are that the cache is coupled to the application, multiple applications cannot share it directly, and each application or cluster node must maintain its own separate cache, which wastes memory.
Application Scenarios
Caching commonly used data such as dictionaries.
Cache Media
Implementation
Direct programmatic implementation
Ehcache
Basic Introduction
Ehcache is a standards-based, open-source cache that improves performance, offloads the database, and simplifies scalability. It is the most widely used Java-based cache because it is robust, proven, full-featured, and integrates with other popular libraries and frameworks. Ehcache scales from in-process caching to mixed in-process/out-of-process deployments with terabyte-sized caches.
Application Scenarios
Ehcache Architecture Diagram
Main Characteristics of Ehcache
Ehcache Cache Data Expiration Policy
Ehcache Expired Data Eviction Mechanism
Lazy eviction mechanism: each time data is written to the cache, the write time is recorded with it. On read, the time elapsed since that write is compared with the configured TTL to decide whether the entry has expired.
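The lazy mechanism above can be illustrated with a small in-process cache. This is a minimal sketch, not Ehcache's actual implementation: the class LazyTtlCache, its single per-cache TTL, and the injectable clock are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical local cache illustrating lazy (read-time) TTL expiry. */
final class LazyTtlCache<K, V> {
    /** A cached value together with the time it was written. */
    private static final class Entry<V> {
        final V value;
        final long writtenAtMillis;

        Entry(V value, long writtenAtMillis) {
            this.value = value;
            this.writtenAtMillis = writtenAtMillis;
        }
    }

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // assumption: injectable so expiry can be tested without sleeping

    LazyTtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Stores the value and records the write time. */
    void put(K key, V value) {
        store.put(key, new Entry<>(value, clock.getAsLong()));
    }

    /** Returns the value, or null if absent or expired. Expired entries are removed here, on read. */
    V get(K key) {
        Entry<V> entry = store.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() - entry.writtenAtMillis >= ttlMillis) {
            store.remove(key, entry); // remove only if it is still the same entry
            return null;
        }
        return entry.value;
    }

    /** Number of stored entries, including expired ones that have not been read yet. */
    int size() {
        return store.size();
    }
}
```

In production code the clock would typically be System::currentTimeMillis. Note that an expired entry keeps occupying memory until something reads its key, which is the trade-off of a purely lazy scheme.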
Guava Cache
Basic Introduction
Guava Cache is the caching tool in Guava, Google's open-source library of reusable Java utilities.
Features and Functions
Application Scenarios
Data Structure Diagram
Cache Update Policy
Cache Recycling Policy
2.4 Distributed Cache
A distributed cache is a cache component or service that is decoupled from the application. Its greatest benefit is that it runs as a standalone service, isolated from the local application, so multiple applications can share the cache directly.
Main Application Scenarios
Main Access Modes
The following describes two major open-source distributed cache implementations, Memcached and Redis.
Memcached
Basic Introduction
Memcached is a high-performance, distributed in-memory object caching system. By maintaining one large unified hash table in memory, it can store data in many formats, including images, videos, files, and database query results. Put simply, data is loaded into memory and then read from memory, which greatly improves read speed.
Features
Basic Architecture
Cache Data Expiration Policy
Memcached uses an LRU (least recently used) expiration policy. When storing an item in Memcached, you can specify an expiration time for it; by default an item never expires. When the Memcached server runs out of allocated memory, expired items are replaced first, followed by the least recently used items.
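To make the LRU part of this policy concrete, here is a minimal Java sketch built on java.util.LinkedHashMap in access order. It illustrates LRU eviction only and is not how Memcached implements it internally; the class name LruCache and the entry-count capacity are assumptions, and the "expired items first" step is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical fixed-capacity LRU cache; evicts the least recently accessed entry when full. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int capacity; // assumption: capacity is counted in entries, not bytes

    LruCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently accessed
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    /** Called by LinkedHashMap after each insertion; returning true drops the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

For example, with a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "b" is the entry that was used least recently. LinkedHashMap is not thread-safe, so a shared cache would need external synchronization.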
Data Eviction Internal Implementation
Lazy eviction mechanism: each time data is written to the cache, the write time is recorded with it. On read, the time elapsed since that write is compared with the configured TTL to decide whether the entry has expired.
Distributed Cluster Implementation
The Memcached server side has no "distributed" functionality. Each server is a fully independent, isolated service. Memcached's distribution is implemented entirely by the client program.
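Because distribution lives in the client, the client must decide which server owns each key. One common approach, assumed here rather than taken from the article, is consistent hashing over a ring of virtual nodes. The sketch below is illustrative only: the class ClientSideRing, the CRC32 hash, and the server names used in the example are hypothetical choices, not the API of any real Memcached client.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Hypothetical client-side server selection using a consistent-hash ring. */
final class ClientSideRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    /** Places each server on the ring at several points (virtual nodes) to even out the load. */
    ClientSideRing(List<String> servers, int virtualNodes) {
        for (String server : servers) {
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }
    }

    /** Picks the first server at or after the key's hash, wrapping around to the start of the ring. */
    String serverFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers configured");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Assumption: CRC32 is used only because it ships with the JDK; real clients often use other hashes.
    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }
}
```

A client would build the ring once from its server list, for example "cache-a:11211" and "cache-b:11211", and call serverFor(key) before every get or set. The design choice matters when a server is removed: only the keys that lived on that server move, while keys on the remaining servers keep their placement.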
Redis
Basic Introduction
Redis is a remote in-memory, non-relational database that offers high performance, replication, and a unique data model for solving problems. It stores mappings from keys to five different types of values, can persist in-memory data to disk, uses replication to scale read performance, and can use client-side sharding to scale write performance. It has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence, and it provides high availability through Redis Sentinel and automatic partitioning through Redis Cluster.
Data Model
Data Eviction Strategy
Data Eviction Internal Implementation
Persistence Modes
Partial Analysis of the Underlying Implementation
The images are from the CSDN blogger "god restricted area". If anyone knows what drawing software was used to make them, please leave a comment; I would like to know too.
Partial diagram of the startup process
Partial diagram of server-side persistence operations
Underlying hash table implementation (progressive rehash)
Initializing the dictionary
Diagram of adding a new dictionary element
Rehash Execution Process
Caching Design Principles
Comparison of Redis and Memcached
Analysis of Cache Architecture in Java Distributed Systems (Part 1)