Introduction
Varnish is a high-performance, open-source reverse proxy server and HTTP accelerator. Its software architecture was designed from scratch to work closely with modern hardware. Compared with the traditional Squid, Varnish offers higher performance, faster responses, and easier management.
Currently, the latest version is 4.0.0, and the 3.x series is also stable and suitable for production. The 2.x version shipped in the yum repository is outdated and is not recommended.
Comparison between Varnish and Squid
Similarities
Advantages of Varnish
Varnish is highly stable. Under the same load, a Squid server is more likely to fail than a Varnish server, because Squid has to be restarted frequently;
Varnish provides faster access because its "Visual page cache" technology reads all cached data directly from memory, whereas Squid reads from the hard disk;
Varnish supports more concurrent connections, because it releases TCP connections faster than Squid and can therefore sustain more of them under high concurrency;
Varnish can batch-clear parts of the cache with regular expressions through its management port, which Squid cannot do;
Squid runs as a single process on a single CPU, whereas Varnish forks a child process that runs many worker threads, so it can use all cores to process requests;
Disadvantages of Varnish
Once the Varnish process hangs, crashes, or restarts, the cached data is completely released from memory. All requests are then sent to the backend servers, which puts them under heavy pressure when concurrency is high;
If requests for a single URL pass through a load balancer such as HA/F5, each request may land on a different Varnish server, so requests penetrate to the backend. In addition, the same object is cached on several servers, which wastes Varnish cache resources and degrades performance;
Solutions to Varnish weaknesses
Disadvantage 1: When traffic is heavy, start Varnish in memory cache mode and place several Squid servers behind it. If the front Varnish service or server restarts, the flood of requests that penetrates Varnish is absorbed by Squid acting as a second-level cache, which compensates for the Varnish in-memory cache being released on restart;
Disadvantage 2: Hash on the URL at the load balancer so that requests for a given URL are always sent to the same Varnish server;
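The URL-hash idea in Disadvantage 2 can be sketched in VCL for the case where the balancing tier is itself a Varnish instance. That setup is an assumption, not something the text above prescribes; with HA/F5 the equivalent setting lives in the load balancer's own configuration. The backend names, addresses, and ports below are placeholders.

```vcl
vcl 4.0;

import directors;

# Hypothetical cache servers behind the balancing tier (placeholder addresses).
backend cache1 { .host = "192.0.2.11"; .port = "6081"; }
backend cache2 { .host = "192.0.2.12"; .port = "6081"; }

sub vcl_init {
    # The hash director picks a backend from a hash of the string it is given.
    new url_hash = directors.hash();
    url_hash.add_backend(cache1, 1.0);
    url_hash.add_backend(cache2, 1.0);
}

sub vcl_recv {
    # The same URL always maps to the same cache server,
    # so each object is cached on only one of them.
    set req.backend_hint = url_hash.backend(req.url);
}
```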
Main improvements compared with Varnish 3.x
Streaming of objects is fully supported;
Expired objects can be fetched in the background, made possible by the client/backend split;
The new varnishlog query language allows automatic grouping of requests;
Comprehensive request timestamps and byte counters;
Security improvements;
Changes in VCL syntax
The VCL configuration file must specify the version: put vcl 4.0; on the first line of the VCL file;
The vcl_fetch function is replaced by vcl_backend_response, and req.* is no longer available in vcl_backend_response;
The backend group (director) has become a Varnish module: import directors first, then define the directors in the vcl_init subroutine;
The name of a custom subroutine (that is, a sub) cannot start with vcl_; invoke it with call sub_name;
The error() function is replaced by synth();
return (lookup) is replaced by return (hash);
Use beresp.uncacheable to create hit_for_pass objects;
The variable req.backend.healthy is replaced by std.healthy(req.backend_hint), which requires import std;
The variable req.backend is replaced by req.backend_hint;
The keyword remove is replaced by unset;
See: https://www.varnish-cache.org/docs/4.0/whats-new/index.html#whats-new-index
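Several of the syntax changes listed above can be seen together in one small file. This is only an illustrative sketch: the backend address, the strip_cookies subroutine, the URL patterns, and the 120s TTL are made up for the example.

```vcl
vcl 4.0;                              # version marker on the first line

import std;                           # needed for std.healthy()

backend default {                     # placeholder backend
    .host = "127.0.0.1";
    .port = "80";
}

sub strip_cookies {                   # custom sub: the name must not start with vcl_
    unset req.http.Cookie;            # "unset" replaces "remove"
}

sub vcl_recv {
    if (!std.healthy(req.backend_hint)) {             # replaces req.backend.healthy
        return (synth(503, "Backend unavailable"));   # synth replaces error
    }
    if (req.url ~ "\.(png|css|js)$") {
        call strip_cookies;
    }
    return (hash);                    # replaces return (lookup)
}

sub vcl_backend_response {            # replaces vcl_fetch; use bereq.*, not req.*
    if (bereq.url ~ "^/private/") {
        set beresp.uncacheable = true;    # creates a hit-for-pass object
        set beresp.ttl = 120s;
        return (deliver);
    }
}
```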
Architecture and File Cache Workflow
Varnish is divided into a master process and a child process;
The master process reads the storage configuration, selects the appropriate storage type, creates or opens a cache file of the configured size, initializes the structures that manage the storage space, and then forks and monitors the child process;
While its main thread initializes, the child process maps the previously opened storage file into memory, creates and initializes the free-space structures, and attaches them to the storage management structure for later allocation;
There are three types of external management interfaces: command line interface, telnet interface, and Web interface;
Configuration changed while Varnish is running is compiled into C by the VCL compiler and built into a shared object, which the child process loads and uses;
The child process runs several kinds of threads, a few management threads and many worker threads, which can be divided as follows:
Accept thread: receives new requests and places them on the overflow queue;
Worker threads: multiple threads that take requests off the overflow queue, process each one to completion, and then move on to the next;
Epoll thread: the handling of a request is called a session. Within a session, once a request has been processed the connection is handed to epoll, which watches for further events;
Expire thread: cached objects are organized into a binary heap ordered by expiration time. This thread periodically checks the root of the heap and handles expired objects by deleting or re-fetching them;
Basic HTTP request processing process
Varnish processes HTTP requests as follows:
Receive state (vcl_recv): the entry point of request processing. Based on the VCL rules, it decides whether the request should pass (vcl_pass), pipe (vcl_pipe), or enter lookup (a local cache query);
Lookup state: the object is searched for in the hash table. If it is found, processing enters the hit (vcl_hit) state; otherwise, it enters the miss (vcl_miss) state;
Pass (vcl_pass) state: processing goes straight to the backend request, that is, the fetch (vcl_fetch) state;
Fetch (vcl_fetch) state: the request is sent to the backend, the data is retrieved, and it is stored locally according to the settings;
Deliver (vcl_deliver) state: the retrieved data is sent to the client, which completes the request;
Note: Varnish 4 differs slightly here. vcl_fetch no longer exists; vcl_backend_fetch and vcl_backend_response are separate functions;
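The states above can be traced in a minimal Varnish 4.0 file. It is a sketch of the flow, not a recommended policy: the backend address, the 5m TTL, and the X-Served-By header are assumptions made for the example.

```vcl
vcl 4.0;

# Placeholder backend; host and port are assumptions.
backend default {
    .host = "127.0.0.1";
    .port = "80";
}

sub vcl_recv {
    # Receive: decide which path the request takes.
    if (req.method == "CONNECT") {
        return (pipe);       # hand the raw connection to the backend
    }
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);       # fetch from the backend without caching
    }
    return (hash);           # compute the hash key, then look it up (hit or miss)
}

sub vcl_backend_response {
    # Fetch: the backend has answered; decide how long to keep the object.
    set beresp.ttl = 5m;
}

sub vcl_deliver {
    # Deliver: last stop before the response is sent to the client.
    set resp.http.X-Served-By = "varnish";
}
```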
Built-in functions (also called subroutines)
vcl_recv: receives and processes requests. It is called once a request has arrived and been received successfully, and decides how to handle the request by examining its data;
vcl_pipe: called on entering pipe mode. It passes the request directly to the backend host and returns the backend's response to the client unchanged;
vcl_pass: called on entering pass mode. It passes the request directly to the backend host; the backend's response is not cached and is returned directly to the client;
vcl_hit: called automatically after a lookup when the requested content is found in the cache;
vcl_miss: called automatically after a lookup when the requested content is not found in the cache. It can be used to decide whether the content should be fetched from the backend server;
vcl_hash: called after vcl_recv to create the hash value for the request. This hash value is the key used to look up cache objects in Varnish;
vcl_purge: called after a purge operation has been executed; it can be used to build the response;
vcl_deliver: called before the requested content is sent to the client;
vcl_backend_fetch: called before a request is sent to the backend host; it can modify the request sent to the backend;
vcl_backend_response: called after the response from the backend host has been obtained;
vcl_backend_error: called when fetching content from the backend host fails;
vcl_init: called when the VCL is loaded; often used to initialize Varnish modules (vmods);
vcl_fini: called when all requests have left the current VCL and it is discarded; often used to clean up Varnish modules;
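Two of the less commonly overridden subroutines are sketched below. The vcl_hash body follows the default behaviour of hashing the URL plus the Host header (or the server IP when there is no Host header). The vcl_backend_error body and its message text are illustrative assumptions, as is the backend address.

```vcl
vcl 4.0;

# Placeholder backend; host and port are assumptions.
backend default {
    .host = "127.0.0.1";
    .port = "80";
}

sub vcl_hash {
    # Build the cache key from the URL and the Host header.
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (lookup);
}

sub vcl_backend_error {
    # The backend fetch failed: build a small synthetic response.
    set beresp.http.Content-Type = "text/plain; charset=utf-8";
    synthetic("Backend fetch failed");
    return (deliver);
}
```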
Built-in public variables in VCL
Applicability of variables (also called objects)
Note: There are slight discrepancies in some areas. For details, refer to the official documentation;
Variable types
req: the request object; variables available when the request arrives;
bereq: the backend request object; variables available for the request sent to the backend host;
beresp: the backend response object; variables available for the content obtained from the backend host;
resp: the HTTP response object; variables available for the response sent to the client;
obj: variables related to the attributes of objects stored in memory;
For the full list of variables, see: https://www.varnish-cache.org/docs/4.0/reference/vcl.html#reference-vcl
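As a rough illustration of where each variable type is used, the sketch below touches one variable of each kind in a subroutine where it is available. The header names X-Device and X-Cache-Hits, the 10m TTL, and the backend address are invented for the example.

```vcl
vcl 4.0;

# Placeholder backend; host and port are assumptions.
backend default {
    .host = "127.0.0.1";
    .port = "80";
}

sub vcl_recv {
    # req: the client request as it arrived.
    if (req.http.User-Agent ~ "(?i)mobile") {
        set req.http.X-Device = "mobile";
    }
}

sub vcl_backend_fetch {
    # bereq: the request about to be sent to the backend.
    unset bereq.http.Cookie;
}

sub vcl_backend_response {
    # beresp: the response just received from the backend.
    set beresp.ttl = 10m;
}

sub vcl_deliver {
    # obj: the cached object; resp: the response going to the client.
    if (obj.hits > 0) {
        set resp.http.X-Cache-Hits = obj.hits;
    }
}
```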
Grace mode
Merging requests in Varnish
When several clients request the same page, Varnish sends only one request to the backend server and suspends the other requests while it waits for the result. When the result arrives, it is copied to the waiting requests and sent to their clients;
But if thousands of requests arrive at the same time, the waiting queue becomes very large, which leads to two potential problems: a thundering herd, where suddenly releasing a large number of waiting threads sends the load sky high, and long waits for clients;
To solve this, you can configure Varnish to keep a cached object for a period of time after it expires, so that the previous content (stale content) can be returned to the waiting requests. A configuration example follows:
    # Note: this example uses Varnish 3.x syntax (req.backend.healthy, req.grace, vcl_fetch).
    sub vcl_recv {
        if (!req.backend.healthy) {
            set req.grace = 5m;
        } else {
            set req.grace = 15s;
        }
    }

    sub vcl_fetch {
        set beresp.grace = 30m;
    }

    # The configuration above makes Varnish keep expired cache objects for another 30 minutes.
    # This value only needs to be as large as the largest req.grace value.
    # Depending on the health of the backend host, Varnish serves content that expired
    # within the last 5 minutes or the last 15 seconds to front-end requests.
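The grace example above uses vcl_fetch and req.grace, which Varnish 4.0 does not provide. A possible 4.0 equivalent is sketched below, with the decision moved into vcl_hit; the 15s, 5m, and 30m windows mirror the example above. The backend address is a placeholder, and the health check only has an effect if the backend defines a probe.

```vcl
vcl 4.0;

import std;

# Placeholder backend; host and port are assumptions.
backend default {
    .host = "127.0.0.1";
    .port = "80";
}

sub vcl_backend_response {
    # Keep objects for 30 minutes past their TTL so stale copies stay available.
    set beresp.grace = 30m;
}

sub vcl_hit {
    if (obj.ttl >= 0s) {
        # Fresh object: deliver it.
        return (deliver);
    }
    if (std.healthy(req.backend_hint)) {
        # Backend healthy: accept content that expired up to 15 seconds ago.
        if (obj.ttl + 15s > 0s) {
            return (deliver);
        }
    } else if (obj.ttl + 5m > 0s) {
        # Backend sick: accept content that expired up to 5 minutes ago.
        return (deliver);
    }
    # Too stale: fetch a fresh copy from the backend.
    return (fetch);
}
```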
Installation and configuration
    # Installation packages: http://repo.varnish-cache.org/redhat/varnish-4.0/el6/
    yum localinstall --nogpgcheck varnish-4.0.0-1.el6.x86_64.rpm varnish-libs-4.0.0-1.el6.x86_64.rpm varnish-docs-4.0.0-1.el6.x86_64.rpm

    vi /etc/sysconfig/varnish    # edit the configuration file and modify the following items
    VARNISH_STORAGE_SIZE=100M    # adjust this value to suit your own situation
    VARNISH_STORAGE="malloc,${VARNISH_STORAGE_SIZE}"    # Varnish 4 uses malloc (memory) storage by default

    service varnish start    # start Varnish; by default it listens on port 6081 for external requests, the management port is 6082, and the backend host is 127.0.0.1:80

    varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082    # log in to the management command line
    varnish> vcl.list                   # list all configurations
    varnish> vcl.load test1 test.vcl    # load and compile a new configuration; test1 is the configuration name, test.vcl is the configuration file
    varnish> vcl.use test1              # activate a configuration by name; the active configuration is the one named in the most recent vcl.use
    varnish> vcl.show test1             # display a configuration's content; the configuration name is required
Annotated example
    # This is an example VCL file for Varnish.
    #
    # It does not do anything by default, delegating control to the
    # builtin VCL. The builtin VCL is called when there is no explicit
    # return statement.
    #
    # See the VCL chapters in the Users Guide at https://www.varnish-cache.org/docs/
    # and http://varnish-cache.org/trac/wiki/VCLExamples for more examples.

    # Marker to tell the VCL compiler that this VCL has been adapted to the
    # new 4.0 format.
    vcl 4.0;

    import directors;

    probe backend_healthcheck {    # create a health check
        .url = "/health.html";
        .window = 5;
        .threshold = 2;
        .interval = 3s;
    }

    backend web1 {    # create a backend host
        .host = "static1.lnmmp.com";
        .port = "80";
        .probe = backend_healthcheck;
    }

    backend web2 {
        .host = "static2.lnmmp.com";
        .port = "80";
        .probe = backend_healthcheck;
    }

    backend img1 {
        .host = "img1.lnmmp.com";
        .port = "80";
        .probe = backend_healthcheck;
    }

    backend img2 {
        .host = "img2.lnmmp.com";
        .port = "80";
        .probe = backend_healthcheck;
    }

    sub vcl_init {    # create backend host groups, that is, directors
        # The random director requires a weight per backend; equal weights are assumed here.
        new web_cluster = directors.random();
        web_cluster.add_backend(web1, 1.0);
        web_cluster.add_backend(web2, 1.0);

        new img_cluster = directors.random();
        img_cluster.add_backend(img1, 1.0);
        img_cluster.add_backend(img2, 1.0);
    }

    acl purgers {    # define the source IP addresses allowed to purge
        "127.0.0.1";
        "192.168.0.0"/24;
    }

    sub vcl_recv {
        if (req.method == "GET" && req.http.cookie) {    # GET requests carrying a Cookie header are cached too
            return (hash);
        }
        if (req.url ~ "test.html") {    # requests for test.html go straight to the backend
            return (pass);
        }
        if (req.method == "PURGE") {    # handle purge requests
            if (!client.ip ~ purgers) {
                return (synth(405, "Method not allowed"));
            }
            # Varnish 4.0 retired the 3.x "purge;" statement in vcl_hit and vcl_miss;
            # the purge is requested here and answered in vcl_purge.
            return (purge);
        }
        if (req.http.X-Forward-For) {    # add the X-Forward-For header to requests sent to the backend host
            set req.http.X-Forward-For = req.http.X-Forward-For + "," + client.ip;
        } else {
            set req.http.X-Forward-For = client.ip;
        }
        if (req.http.host ~ "(?i)^(www.)?lnmmp.com$") {    # route to different backend host groups by requested domain
            set req.http.host = "www.lnmmp.com";
            set req.backend_hint = web_cluster.backend();
        } elsif (req.http.host ~ "(?i)^images.lnmmp.com$") {
            set req.backend_hint = img_cluster.backend();
        }
        return (hash);
    }

    sub vcl_purge {    # build the response to a purge request
        return (synth(200, "Purged"));
    }

    sub vcl_backend_response {    # customize the cache duration of cached files, that is, the TTL
        # req.* is not available here in 4.0, so the URL is read from bereq.
        if (bereq.url ~ "\.(jpg|jpeg|gif|png)$") {
            set beresp.ttl = 7200s;
        }
        if (bereq.url ~ "\.(html|css|js)$") {
            set beresp.ttl = 1200s;
        }
        if (beresp.http.Set-Cookie) {    # backend responses with a Set-Cookie header are not cached and are returned directly to the client
            set beresp.uncacheable = true;
            return (deliver);
        }
    }

    sub vcl_deliver {
        if (obj.hits > 0) {    # add an X-Cache header to the response to show whether the cache was hit
            set resp.http.X-Cache = "HIT from " + server.ip;
        } else {
            set resp.http.X-Cache = "MISS";
        }
    }
Varnish 4.0 (transfer)