Varnish is a high-performance, open-source reverse proxy and cache server. It keeps cached objects in memory to reduce response time and network bandwidth consumption. The project was initiated by the online branch of the Norwegian newspaper Verdens Gang, whose architect and development director Poul-Henning Kamp is one of the core FreeBSD developers. Initial project management, infrastructure, and additional development were provided by Linpro, a Norwegian Linux consulting firm.
When talking about Varnish, Squid deserves a mention. Squid is a venerable cache server. Varnish has a more modern design and generally higher performance than Squid; it can also be managed through a port and can purge specified cache entries using regular expressions, which Squid cannot do. On the other hand, Varnish consumes more resources under high concurrency, and if the Varnish service process crashes or restarts, the cached data held in memory is lost.
1. Varnish architecture
Varnish is a service designed for modern hardware, so only 64-bit systems are supported. The Manager process accepts CLI command control, including tuning of runtime parameters and VCL configuration updates. It launches the Cacher child process, which handles requests by assigning each task to a worker thread, so Varnish is a heavily threaded service. The Manager also probes the Cacher at a fixed interval to detect whether it is still online.
Cacher Process Features:
- Listening for client requests
- Managing Worker Threads
- Storing cached data
- Logging request traffic
- Update counter values based on statistics
Varnish uses workspaces to reduce the contention that occurs when each thread needs to allocate or modify memory. Varnish has multiple workspaces, the most important of which is the session workspace, used to maintain session-related data.
For logging, the Cacher process uses the VSL (Varnish Shared memory Log) mechanism: a shared memory segment that effectively avoids blocking on log writes. The log space is divided into two sections, one recording formatted request logs and the other the counter values. Logs can be viewed, analyzed, or persisted using Varnish's own log tools.
2. Varnish cache storage mechanism (Storage Types):
2.1 malloc[,size]
Calls malloc() to allocate memory for the cache; this inevitably causes fragmentation and can consume additional memory.
[,size] defines the size of the space; all cache entries are lost after a restart;
2.2 file[,path[,size[,granularity]]]
Varnish creates a file to store the cached data and maps it into a memory space. The file does not persist the data, and all cache entries are lost after a restart;
granularity sets the allocation increment size.
2.3 persistent,path,size
Persistent file storage (a black box); cache entries remain valid after a restart, but this backend is still experimental and has known problems;
2.4 MSE
The Massive Storage Engine, available only in the commercial Plus version, which means a fee. This mode is designed for caches of up to 100 TB and offers better disk performance than file mode.
Summary: when memory is not large enough to hold all cached data, choose file or MSE storage. In practice this usually means file storage, or MSE if you pay for the Plus version.
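On CentOS 7 the storage backend is usually selected in /etc/varnish/varnish.params rather than on the command line. A sketch of the relevant setting (the file path and sizes are illustrative placeholders, not recommendations):

```
# malloc: fastest, but limited by RAM and lost on restart
VARNISH_STORAGE="malloc,256M"

# file: for caches larger than RAM; still lost on restart
#VARNISH_STORAGE="file,/var/lib/varnish/varnish_storage.bin,10G"
```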
3. Varnish program Environment
The host environment for this document is CentOS 7.2 with Varnish 4.0.
The Varnish program environment:
/etc/varnish/varnish.params: Configures the operating characteristics of the varnish service process, such as the listening address and port, and the caching mechanism;
/etc/varnish/default.vcl: Configures the working properties of the cacher process and its worker threads;
Main program:
/usr/sbin/varnishd
CLI Interface:
/usr/bin/varnishadm
Shared Memory Log Interactive tool:
/usr/bin/varnishhist
/usr/bin/varnishlog
/usr/bin/varnishncsa
/usr/bin/varnishstat
/usr/bin/varnishtop
Test Tool Program:
/usr/bin/varnishtest
VCL configuration file Overloading program:
/usr/sbin/varnish_reload_vcl
Systemd Unit File:
/usr/lib/systemd/system/varnish.service # the varnish service
/usr/lib/systemd/system/varnishlog.service # logger daemon
/usr/lib/systemd/system/varnishncsa.service # logger daemon in Apache (NCSA) format
3.1 Options for the varnishd main program:
When started via systemd, the main program reads its configuration from /etc/varnish/varnish.params.
-a address[:port][,address[:port][...]]: listen address(es) open to clients; default port 6081;
-T address[:port]: address for management tool connections; default port 6082;
-s [name=]type[,options]: defines the cache storage mechanism; may be given multiple times;
-u user
-g group
-f config: VCL configuration file;
-F: run in the foreground;
...
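Putting these options together, a typical varnishd invocation might look like the sketch below (addresses, user/group names, and cache size are illustrative):

```
varnishd -a 0.0.0.0:6081 \
         -T 127.0.0.1:6082 \
         -f /etc/varnish/default.vcl \
         -s malloc,256M \
         -u varnish -g varnish
```

Here -a is the client-facing listener, -T the management interface, -f the VCL file, and -s the cache storage backend.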
Thread-related parameters:
Within a thread pool, each request is handled by one thread, so the maximum number of worker threads determines Varnish's concurrent response capacity;
thread_pools: number of worker thread pools; default 2. The official documentation says two pools are sufficient; raising the value further brings no improvement;
thread_pool_max: the maximum number of worker threads created per pool; default 5000;
thread_pool_min: the minimum number of worker threads each pool maintains; it also acts as the "maximum number of idle threads"; default 100;
So the parameters that usually need tuning are thread_pool_max and thread_pool_min.
Varnish's theoretical maximum number of concurrent connections = thread_pools * thread_pool_max
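The formula can be checked with a quick calculation using the default values (a sketch; substitute your own tuning):

```python
# Defaults documented above: 2 thread pools, at most 5000 workers per pool.
thread_pools = 2        # varnishd parameter: thread_pools
thread_pool_max = 5000  # varnishd parameter: thread_pool_max

# Theoretical ceiling on concurrently handled connections.
max_concurrency = thread_pools * thread_pool_max
print(max_concurrency)  # 10000
```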
thread_pool_timeout: thread idle time; a thread idle beyond this threshold is destroyed;
thread_pool_add_delay: delay before creating a new thread; default 0s;
thread_pool_destroy_delay: delay before destroying a thread; default 2s;
How to set these parameters:
Dynamically at runtime, via the varnishadm interface:
Command: param.set
To make changes permanent:
Set runtime parameters in the /etc/varnish/varnish.params file, via DAEMON_OPTS:
-p param=value: set a runtime parameter and its value; may be repeated;
-r param[,param...]: set the specified parameter(s) to read-only;
Example: DAEMON_OPTS="-p thread_pool_min=2 -p thread_pool_max=10000 -p thread_pool_timeout=300"
3.2 Varnish management tool
Usage: varnishadm -S /etc/varnish/secret -T [address:]port
-S specifies the connection secret file, generated when Varnish is installed; -T specifies the address of the management interface, defaulting to 127.0.0.1 when omitted.
Once connected, type help [command] for assistance.
A rundown of common directives:
Configuration file-Related:
vcl.list: list loaded VCL configurations;
vcl.load: load and compile a VCL file;
vcl.use: activate;
vcl.discard: delete;
vcl.show [-v] <configname>: show details of the specified configuration; the -v option also shows the built-in default VCL code;
Run-time Parameters:
param.show -l: display the parameter list;
param.show <param>
param.set <param> <value>: set a parameter
Cache storage:
storage.list
Back-end servers:
backend.list
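For scripting, varnishadm can also run a single directive non-interactively; a sketch (paths assume the default CentOS layout):

```
# open an interactive session
varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082

# or run one directive and exit
varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082 vcl.list
varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082 param.show thread_pool_max
```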
4. VCL basics
Varnish Configuration Language (VCL) is a domain-specific language used to describe request processing and to define caching policy. The manager process hands the VCL configuration to a VCC subprocess, which translates it into C code; that code is then compiled by GCC into a shared object and loaded into the cacher process.
To write good VCL, you need to understand Varnish's internal request processing flow; the core concept is the finite state machine. A simplified view of the process:
The ellipses in the figure represent state engines. These state engines appear in VCL as subroutines prefixed with vcl_, inside which the HTTP headers and other aspects of each request can be inspected or modified. A return(action) statement terminates a state, where action is a VCL keyword indicating which state engine to enter next.
Each request is handled separately; the states are related to one another, yet isolated from each other.
Before moving on to VCL configuration code, consider the underlying concepts. When Varnish processes a request, it first parses it: analyzing the HTTP headers to determine the request type, checking whether the request method is valid, and so on. Once basic parsing is complete, the request is evaluated against the configured policy. VCL expresses such policies as rules composed of actions.
The state engines fall into two areas: the frontend (client side) and the backend.
The front-end states can be divided into four stages:
First stage:
vcl_recv # receive the client request and decide how to handle it
Second stage:
vcl_hash # compute the hash key; no interpretation here, the result is handed to one of the third-stage state engines
Third stage:
vcl_hit # cache hit; the request proceeds from here
vcl_pass # bypass the cache
vcl_miss # cache miss
vcl_purge # purge the cache
vcl_pipe # requests with unrecognizable HTTP headers are piped straight through, without backend processing
Fourth stage:
vcl_deliver: most responses to the client are sent back through here
vcl_synth: accepts tasks from vcl_purge; generates synthetic responses, e.g. after deleting a specified cache object
The back-end states are divided into two stages:
First stage:
vcl_backend_fetch: accepts tasks from the front-end states vcl_pass or vcl_miss and sends the request to the backend host
Second stage:
vcl_backend_response: receives a normal response from the backend host, checks whether it should be cached, caches it if needed, and finally hands off to vcl_deliver
vcl_backend_error: the backend host returned an error; send back an error response
In addition, there are two special state engines:
vcl_init: VCL code executed before any request is processed; mainly used to initialize VMODs;
vcl_fini: called after all requests have ended, when the VCL configuration is discarded; mainly used to clean up VMODs;
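A minimal configuration that touches a few of these state engines might look like the sketch below (the backend address is a placeholder):

```
vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # non-GET/HEAD requests are not cacheable; bypass the cache
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    # falling through lets the built-in default rules continue to vcl_hash
}

sub vcl_deliver {
    # last stop on the client side, just before the response is sent
}
```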
5. VCL syntax
One major premise: for Varnish 4.0, VCL has its own built-in default rules; they cannot be removed and are always appended after the custom rules.
(1) A VCL configuration file begins with vcl 4.0;
(2) Comment styles: //, # and /* foo */;
(3) Subroutines are declared with the sub keyword, e.g. sub vcl_recv {...};
(4) No loops; variables are restricted by state (each engine has its own set of built-in variables);
(5) return(action) terminates the current engine state and points to the next one, where action is a keyword, e.g. return(pass);
(6) Configurations can be loaded dynamically;
5.1 Three main syntax forms:
sub subroutine {
    ...
}
if (CONDITION) {
    ...
} else {
    ...
}
return(), hash_data()
5.2 Built-in functions and keywords
Functions:
hash_data(): specifies the data used for the hash calculation; reducing differences increases the hit ratio;
regsub(str, regex, sub): replace the first substring of str matched by regex with sub; mainly used for URL rewriting;
regsuball(str, regex, sub): replace every substring of str matched by regex with sub;
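The difference between regsub() and regsuball() mirrors Python's re.sub with and without a count limit; a rough illustration of the semantics (the URL is made up):

```python
import re

url = "/img/2019/img/logo.png"

# regsub(str, regex, sub): replace only the FIRST match
first = re.sub(r"img", "static", url, count=1)
print(first)  # /static/2019/img/logo.png

# regsuball(str, regex, sub): replace EVERY match
every = re.sub(r"img", "static", url)
print(every)  # /static/2019/static/logo.png
```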
return():
ban(expression)
ban_url(regex): ban all cached objects whose URLs match the given regex;
synth(status, "STRING"): generate a synthetic response, e.g. for a purge operation;
Keywords:
call subroutine, return(action), new, set, unset
Each keyword is only usable within specific subroutines
Operators:
Comparison: ==, !=, ~, >, >=, <, <=
Logical operators: &&, ||, !
Variable assignment: =
Regex match: ~
(?i) at the start of a pattern makes the match case-insensitive
Note also that strings in match rules must be enclosed in ""
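A small vcl_recv fragment combining these operators (host name and URL are made up):

```
sub vcl_recv {
    # "~" performs a regex match; "(?i)" makes it case-insensitive
    if (req.http.host ~ "(?i)^www\.example\.com$" && req.url == "/status") {
        return (pass);
    }
}
```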
5.3 Variable types:
Built-in variables:
req.*: request; related to the request message sent by the client;
req.http.*
req.http.User-Agent, req.http.Referer, ...
bereq.*: related to the HTTP request varnish sends to the backend (BE) host;
bereq.http.*
beresp.*: related to the response message the BE host sends to varnish;
beresp.http.*
resp.*: related to the response varnish sends to the client;
obj.*: properties of objects stored in the cache space;
Common variables:
bereq.*, req.*:
bereq.http.HEADERS
bereq.method: request method;
bereq.url: URL of the request;
bereq.proto: protocol version of the request;
bereq.backend: the backend host to use;
req.url: the requested URL
req.http.Cookie: value of the Cookie header in the client's request message;
req.http.User-Agent: browser type
beresp.*, resp.*:
beresp.http.HEADERS
beresp.status: status code of the response;
beresp.proto: protocol version;
beresp.backend.name: hostname of the BE host;
beresp.ttl: remaining cacheable lifetime of the BE host's response;
obj.*
obj.hits: number of times this object has been hit in the cache;
obj.ttl: the object's TTL value
server.*
server.ip
server.hostname
client.*
client.ip
Note also that variables are restricted by state; see the availability tables in the reference documentation.
User-definable:
set variable = value # define a variable
unset variable # remove a defined variable
6. VCL configuration examples
6.1 Add an X-Cache response header indicating whether the request hit the cache
~]$ vi /etc/varnish/default.vcl
sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT via " + server.ip;
    } else {
        set resp.http.X-Cache = "MISS via " + server.ip;
    }
}
~]$ varnish_reload_vcl # reload the VCL
Or use varnishadm to enter the management interface and run the following commands:
vcl.load test1 default.vcl # load and compile the VCL, naming it test1
A 200 status code means the syntax is correct and compilation passed.
vcl.use test1 # the new VCL configuration is now in effect
Test with the curl command: the first request is not yet cached (MISS);
on the second access the object is already cached (HIT).
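The test could be performed roughly like this (host, port, and the IP in the X-Cache value are illustrative):

```
# first request: the object is not yet cached
curl -I http://127.0.0.1:6081/index.html
#   X-Cache: MISS via 127.0.0.1

# second request: served from the cache
curl -I http://127.0.0.1:6081/index.html
#   X-Cache: HIT via 127.0.0.1
```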
6.2 Force requests for a class of resources to bypass the cache:
Make access to any file under the /login or /admin directories skip the cache lookup:
sub vcl_recv {
    if (req.url ~ "(?i)^/(login|admin)") {
        return (pass);
    }
}
6.3 For certain resource types, such as public images, strip cookies and force how long varnish may cache them:
sub vcl_backend_response {
    if (beresp.http.Cache-Control !~ "s-maxage") {
        if (bereq.url ~ "(?i)\.(jpg|jpeg|png|gif|css|js)$") {
            unset beresp.http.Set-Cookie;
            set beresp.ttl = 3600s;
        }
    }
}
6.4 Pruning cached objects: purge, ban
(1) Allow the purge operation to run:
sub vcl_purge {
    return (synth(200, "Purged"));
}
(2) Decide when to perform the purge:
sub vcl_recv {
    if (req.method == "PURGE") {
        return (purge);
    }
    ...
}
The definition above is too permissive: anyone can purge the cache. The rules below restrict purging by IP address.
Add access control rules for this type of request:
acl purgers {
    "127.0.0.0"/8;
    "10.1.0.0"/16;
}
sub vcl_recv {
    if (req.method == "PURGE") {
        if (!(client.ip ~ purgers)) { # the ~ here matches against the ACL; no quotation marks around the ACL name
            return (synth(405, "Purging not allowed for " + client.ip)); # purge requests from outside the purgers ACL get an error code
        }
        return (purge);
    }
    ...
}
6.5 Using multiple backend hosts
backend default {
    .host = "172.16.100.6";
    .port = "80";
}
backend appsrv {
    .host = "172.16.100.7";
    .port = "80";
}
sub vcl_recv {
    if (req.url ~ "(?i)\.php$") {
        set req.backend_hint = appsrv; # forward PHP resources to appsrv
    } else {
        set req.backend_hint = default;
    }
    ...
}
7. Defining backend server groups
7.1 Defining a backend server group
The directors module must be imported in the VCL configuration before use:
import directors;
Example:
import directors; # load the directors module
backend server1 {
    .host = ...;
    .port = ...;
}
backend server2 {
    .host = ...;
    .port = ...;
}
sub vcl_init { # groups are defined in the vcl_init subroutine
    new group_name = directors.round_robin(); # create a group named group_name with round-robin scheduling
    group_name.add_backend(server1); # add a server member to the group
    group_name.add_backend(server2);
}
sub vcl_recv {
    # send all traffic to the director:
    set req.backend_hint = group_name.backend(); # how to reference the group
}
7.2 Backend host health check mechanism
Varnish can perform health checks on backend hosts and dynamically remove them from, or restore them to, the scheduling list.
.probe: defines the health check method;
.url: the URL requested during a check; default is "/";
.request: a fully spelled-out request, e.g.:
.request =
    "GET /.healthtest.html HTTP/1.1"
    "Host: www.magedu.com"
    "Connection: close"
.window: how many of the most recent checks to consider when judging health;
.threshold: of the checks counted by .window, at least .threshold must succeed for the backend to be considered healthy;
.interval: check frequency;
.timeout: per-check timeout;
.expected_response: expected response status code; default 200;
Two ways to configure health checks:
(1) probe pb_name = { ... }
    backend NAME {
        .probe = pb_name;
        ...
    }
(2) backend NAME {
        .probe = {
            ...
        }
    }
Example:
probe check { # the probe is defined first
    .url = "/.healthcheck.html";
    .window = 5;
    .threshold = 4;
    .interval = 2s;
    .timeout = 1s;
}
backend default {
    .host = "10.1.0.68";
    .port = "80";
    .probe = check; # reference the check method
}
backend appsrv {
    .host = "10.1.0.69";
    .port = "80";
    .probe = check;
}
The check status can be viewed in the varnishadm command interface.
8. Varnish logs
Tools for viewing the shared memory log:
8.1 varnishstat - Varnish cache statistics
By default it runs in a dynamically refreshing display mode.
Options:
-1: print the current statistics once and exit;
-f field_name: display statistics for the specified field only;
-l: list the field names that can be passed to the -f option;
Example:
# varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss
8.2 varnishtop - rank Varnish log fields
Dynamically updated by default.
Options:
-1: print the current ranking once and exit;
-i taglist: rank only the specified tag(s); -i may be used multiple times, or one option can take multiple tags separated by commas;
-I <[taglist:]regex>: include entries matching a regex;
-x taglist: exclusion list;
-X <[taglist:]regex>: exclude entries matching a regex;
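For instance (tag names follow the Varnish 4 VSL naming):

```
# continuously rank requested URLs
varnishtop -i ReqURL

# one-shot ranking of User-Agent request headers
varnishtop -1 -I ReqHeader:User-Agent
```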
8.3 varnishlog - display Varnish logs
Displays the log records held in shared memory.
8.4 varnishncsa - display Varnish logs in Apache/NCSA combined log format
Displays the shared-memory log records in Apache log format.