1. HAProxy introduction
HAProxy is a high-performance TCP/HTTP reverse proxy and load balancer with the following features:
- Routes HTTP requests based on statically assigned cookies
- Balances load across multiple servers, with session stickiness based on HTTP cookies
- Fails over between primary and standby servers
- Accepts connections on dedicated ports for service monitoring
- Shuts a service down gracefully: established connections finish their request and response, while new requests are rejected
- Adds, modifies, or deletes headers in HTTP requests and responses
- Blocks requests that match regular-expression rules
- Provides a service status report page with user authentication
HAProxy is especially useful for heavily loaded web sites that need session persistence or layer 7 processing. Running on current hardware, it can support tens of thousands of concurrent connections. Its operating mode makes it easy and safe to integrate into an existing architecture, while keeping your web servers from being exposed directly to the network.
HAProxy implements an event-driven, single-process model that supports very large numbers of concurrent connections. Multi-process or multi-threaded models can rarely handle thousands of concurrent connections because of memory limits, system scheduler limits, and pervasive lock contention. The event-driven model avoids these problems because it implements all of these tasks in user space, with finer control over resources and time. The drawback is that such programs generally scale poorly on multi-core systems, which is why they must be optimized to do more work per CPU cycle.
In actual operation, HAProxy's user-space CPU time is about 20 times lower than the time spent in the kernel, so tuning the operating system parameters is an important task.
In addition, three metrics are the main consideration when evaluating a load balancer:
Session rate
This metric is very important because it determines whether a load balancer can distribute all the requests it accepts. It is usually bounded by CPU performance. The measured value depends on the size of each transferred object, so it is usually tested with empty objects. Session rates of around 100,000 sessions/s were achievable on a Xeon E5 in 2014.
Session concurrency
This metric is tied to the previous one. It depends on server memory and on the number of file descriptors the system can handle. Each session typically takes about 34 KB, so roughly 30,000 sessions fit in 1 GB of memory. In practice, socket buffers also consume memory, so about 20,000 sessions per GB is a more realistic figure.
Data forwarding rate
This metric pulls in the opposite direction from session rate. It is typically measured in megabytes per second (MB/s) or gigabits per second (Gbps). Transferring large objects raises this metric, because less time is spent on session setup and teardown; conversely, session rate is best measured with small objects. In 2014 tests on a Xeon E5, HAProxy reached forwarding rates in the Gbps range.
2. HAProxy program environment
Environment used in this article: CentOS 7.2, with HAProxy 1.5 installed via yum
Program environment:
    Configuration file: /etc/haproxy/haproxy.cfg
    Unit file: haproxy.service
    Main program: /usr/sbin/haproxy
Configuration file structure:
    global: global configuration section
        Process and security parameters
        Performance tuning parameters
        Debugging parameters
    proxies: proxy configuration sections
        defaults: provides default settings for frontend, backend, and listen sections;
        frontend: the front end, equivalent to server { ... } in Nginx;
        backend: the back end, equivalent to upstream { ... } in Nginx;
        listen: a front end and back end combined in one section;
Note on the relationship between front ends and back ends: a frontend can point to multiple backends, and a backend can be referenced by multiple frontends.
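As a rough sketch of how these sections fit together, a minimal haproxy.cfg might look like the following. The section names (web_in, web_servers, ssh_relay), the addresses, and the timeout values are made up for illustration and do not come from the original article.

```haproxy
# Minimal layout sketch; all names, addresses, and values are hypothetical.
global
    daemon                          # process-wide settings live here
    maxconn 4000

defaults
    mode http                       # inherited by the sections below unless overridden
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend web_in
    bind *:80
    default_backend web_servers     # a frontend points at one or more backends

backend web_servers
    balance roundrobin
    server web1 192.168.0.11:80 check
    server web2 192.168.0.12:80 check

listen ssh_relay                    # front end and back end combined in one section
    mode tcp
    bind *:2222
    server ssh1 192.168.0.21:22 check
```

The configuration can be syntax-checked before a reload with `haproxy -c -f /etc/haproxy/haproxy.cfg`.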
3. HAProxy configuration in detail
3.1 Global configuration section
3.1.1 Process-related configuration
Defining logging properties
    log <address> [len <length>] <facility> [<max level> [<min level>]]
    HAProxy sends its logs to the specified rsyslog server; to record logs locally, the local rsyslog service must also be enabled;
    At most two log servers can be configured in the global section;
    <address>: log server address
    [len <length>]: maximum length of a log record
Defining the run-as user and group
    user <username>
    group <groupname>
Operating mode
    daemon: run as a background daemon
3.1.2 Performance tuning parameters
maxconn <number>: maximum number of concurrent connections for a single haproxy process;
maxconnrate <number>: maximum number of new connections per second that a single haproxy process accepts;
maxsslconn <number>: maximum number of concurrent SSL connections for a single haproxy process;
maxsslrate <number>: maximum number of new SSL connections per second for a single haproxy process;
spread-checks <0..50, in percent>: staggers health checks so that they do not all hit the back-end servers at the same moment. The value is the percentage of randomness added to the check interval, from 0 to 50; a setting of 2 to 5 generally works well.
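Put together, the global directives above could be combined roughly as follows. Every value here is an illustrative assumption rather than a recommendation from the original article, and the log line assumes a local rsyslog instance that accepts UDP syslog messages.

```haproxy
# Illustrative values only; tune them for your own workload.
global
    log 127.0.0.1 local2 info   # assumes a local rsyslog accepting UDP syslog
    user haproxy
    group haproxy
    daemon
    maxconn 20000               # concurrent connections per process
    maxconnrate 2000            # new connections accepted per second
    maxsslconn 5000             # concurrent SSL connections per process
    maxsslrate 500              # new SSL connections per second
    spread-checks 3             # randomize health check intervals by up to 3%
```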
3.1.3 User list
User lists provide authentication for the HAProxy status monitoring page. Define at least one user list and add at least one user to it.
Passwords can be stored encrypted or in plaintext.
Example:
    userlist L1
        group G1 users tiger,scott
        group G2 users xdb,scott
        user tiger password $6$k6y3o.ep$jlkqe4(...)xhswrv6j.c0/d7cv91
        user scott insecure-password elgato
        user xdb insecure-password Hello
    userlist L2
        group G1
        group G2
        user tiger password $6$k6y3o.ep$jlkbx(...)xhswrv6j.c0/d7cv91 groups G1
        user scott insecure-password elgato groups G1,G2
        user xdb insecure-password Hello groups G2
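One way a user list could be wired to the status page is sketched below. The names stats_users and admin, the password, the port, and the URI are all hypothetical; the sketch relies on the http_auth ACL and the stats http-request directive.

```haproxy
# Sketch: protect the stats page with a userlist (all names are hypothetical).
userlist stats_users
    user admin insecure-password changeme

listen stats
    bind *:9000
    mode http
    stats enable
    stats uri /haproxy-status
    acl auth_ok http_auth(stats_users)      # true when the request carries valid credentials
    stats http-request auth realm HAProxy\ Statistics unless auth_ok
```

For anything beyond a test setup, an encrypted `password` entry is preferable to `insecure-password`.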
3.2 Proxy configuration section
Proxy configuration is written in the following sections:
- defaults <name>
- frontend <name>
- backend <name>
- listen <name>
The "defaults" section defines default parameters for the frontend, backend, and listen sections.
The "frontend" section describes the listening configuration that receives client requests.
The "backend" section describes the back-end servers that process the accepted requests.
The "listen" section describes a front end and a back end bound directly to each other in a single block.
HAProxy keywords are restricted by section: some keywords are not available in certain sections.
The following subsections cover how the common keywords are used.
3.2.1 Common configuration directives
1. bind [<address>]:<port_range> [, ...] [param*]
Used only in frontend and listen sections. Defines the address and port on which the service listens.
[param*]: optional parameters that depend on the system; they generally do not need to be specified
Example:
    bind :80                                # listen on port 80 on all local IP addresses
    bind *:80                               # listen on port 80 on all local IP addresses
    bind 192.168.12.1:8080,10.1.0.12:8090
2. mode { tcp | http | health }
tcp: proxies at layer 4; can proxy most TCP-based application protocols, such as SSH, MySQL, and PostgreSQL;
http: the client's HTTP requests are parsed in depth;
health: health-check response mode; when a connection arrives, HAProxy only answers "OK" and then closes it;
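The difference between the tcp and http modes can be sketched with two hypothetical proxies. The section names and addresses are assumptions made for this example only.

```haproxy
# Sketch with made-up names and addresses.
listen mysql_relay
    mode tcp                    # layer 4: bytes are relayed without HTTP parsing
    bind *:3306
    server db1 192.168.0.31:3306 check

frontend web_in
    mode http                   # layer 7: requests are parsed, so headers and cookies can be used
    bind *:80
    default_backend web_servers

backend web_servers
    mode http
    server web1 192.168.0.11:80 check
```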
3. balance <algorithm> [<arguments>]
balance url_param <param> [check_post]
Defines the scheduling algorithm used in a backend section
<algorithm> is one of:
roundrobin
    Weighted round-robin scheduling;
    weights are set with the weight parameter on each server line;
    It is a dynamic algorithm: it supports adjusting weights at run time and slow start (a newly started server receives requests gradually rather than all at once), but it supports at most 4,095 active back-end servers
static-rr
    Static round-robin algorithm;
    Run-time weight adjustment and slow start are not supported, but the number of back-end servers is unlimited;
leastconn
    Weighted least-connections algorithm; dynamic;
    Suited to protocols with long-lived connections, such as SSH
first
    First-available algorithm;
    As long as the first server can accept requests, every connection is assigned to it; once it is busy (has reached its maxconn), connections go to the next server. Servers are ordered by numeric ID, from lowest to highest
source
    Source IP hash algorithm;
    Requests from the same client IP are assigned to the same server as long as no back-end server is added or removed;
    Suited to TCP mode, where cookies cannot be inserted
    Whether the algorithm is dynamic or static depends on hash-type;
uri
    URI hash algorithm;
    Hashes the left part of the URI (before the question mark), or the whole URI if the whole parameter is given;
    Requests for the same URI are assigned to the same server, which suits back-end cache servers because it improves the cache hit ratio;
    Whether the algorithm is dynamic or static depends on hash-type;
    The algorithm also accepts optional [<arguments>]:
    (1) whole: hash the full URI
    (2) len <number>: hash only the first <number> characters of the URI
    (3) depth <number>: hash only up to the given directory depth; each "/" counts as one level
url_param
    URL parameter hash algorithm;
    Hashes the value of the parameter named <param> in the query string of the requested URL (the part after "=");
    Suited to URLs that carry a user identifier, because requests with the same user ID are assigned to the same server;
    If check_post is enabled and the parameter is not found in the query string (after "?"), it is looked up in the body of the HTTP POST request;
    Whether the algorithm is dynamic or static depends on hash-type;
    Example:
        balance url_param userid
        balance url_param session_id check_post 64
hdr(<name>)
    HTTP header hash algorithm;
    The specified HTTP header is extracted and hashed. If the header is absent or empty, the algorithm falls back to round-robin;
    Whether the algorithm is dynamic or static depends on hash-type;
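To make the algorithm choices more concrete, here is a sketch with one hypothetical backend per algorithm family. The backend names, addresses, and weights are assumptions for illustration.

```haproxy
# Hypothetical backends, one per algorithm family.
backend app_rr
    balance roundrobin
    server app1 192.168.0.11:8080 weight 2 check   # receives about twice app2's share
    server app2 192.168.0.12:8080 weight 1 check

backend ssh_pool
    mode tcp
    balance leastconn               # suits long-lived connections
    server ssh1 192.168.0.21:22 check
    server ssh2 192.168.0.22:22 check

backend cache_pool
    balance uri depth 2             # hash only the first two directory levels
    server cache1 192.168.0.41:80 check
    server cache2 192.168.0.42:80 check

backend by_host
    balance hdr(Host)               # falls back to round-robin if the header is missing
    server web1 192.168.0.11:80 check
    server web2 192.168.0.12:80 check
```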
4. hash-type <method>
Affects the hash-based algorithms selected by the balance directive.
The default method is map-based.
<method> is one of:
map-based: modulo method; the hash data structure is a static array;
    The hash is static: it does not support online weight adjustment or slow start;
    The distribution is smooth, and the back-end servers share the load evenly.
    The drawback is just as clear: when the total server weight changes, that is, when a server comes online or goes offline, most scheduling results change. Use the consistent method to avoid this;
consistent: consistent hashing; the hash data structure is a tree;
    The hash is dynamic: it supports online weight adjustment and slow start
    Each server appears multiple times in the tree; the hash key is looked up in the tree and the nearest server is selected;
    The advantage is that when the total server weight changes, the effect on scheduling results is local and does not cause a large reshuffle, which makes this method well suited to cache servers;
    The drawback is that the distribution is less smooth and can easily leave the back-end servers unevenly loaded, so the servers' weights or IDs may need adjusting;
    To keep the distribution uniform, server IDs should be kept consistent;
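A cache farm is the typical case for consistent hashing. The sketch below assumes three hypothetical cache nodes with explicit IDs and equal weights; none of these values come from the original article.

```haproxy
# Sketch: cache farm where adding or removing a node should remap only a few URIs.
backend cache_pool
    balance uri
    hash-type consistent
    server cache1 192.168.0.41:80 id 1 weight 10 check
    server cache2 192.168.0.42:80 id 2 weight 10 check
    server cache3 192.168.0.43:80 id 3 weight 10 check
```

With the default map-based method, taking one of these servers offline would remap most URIs; with consistent, only the URIs that hashed to the removed server should move.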
5. server <name> <address>[:<port>] [param*]
default-server [param*]
server defines a host in a backend or listen section;
default-server sets default parameters for the server lines;
[param*] is one or more of:
    weight <weight>: weight of the current server;
    id <number>: sets the server ID
    cookie <value>: sets the cookie value for this server; the value is checked when a request arrives, which is how cookie-based session persistence is implemented;
    check: enables health checks for the current server;
    inter <delay>: interval between health checks;
    rise <count>: number of consecutive successful checks before the server is considered healthy; default 2;
    fall <count>: number of consecutive failed checks before the server is considered unhealthy; default 3;
    addr <ipv4|ipv6>: address used for health checks;
    port <port>: port used for health checks;
    Note: by default the check is done at the transport layer, that is, it only tests whether the port accepts connections. For application-layer checks, use httpchk, smtpchk, mysql-check, pgsql-check, or ssl-hello-chk;
    maxconn <maxconn>: maximum number of concurrent connections for the current server;
    maxqueue <maxqueue>: maximum length of the waiting queue for the current server;
    disabled: marks the host as unavailable;
    redir <prefix>: all GET and HEAD requests sent to this server are redirected to the specified URL prefix;
Examples:
    server first 10.1.1.1:1080 id 3 cookie first check inter <delay> maxconn 10000 maxqueue <maxqueue>
    server second 10.1.1.2:1080 id 4 cookie second check inter 1000
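The default-server directive can factor out settings shared by several server lines. The following sketch uses hypothetical names, addresses, and values.

```haproxy
# Sketch: shared check settings via default-server; values are illustrative.
backend web_servers
    default-server inter 2000 rise 2 fall 3 maxconn 500
    server web1 192.168.0.11:80 check weight 2
    server web2 192.168.0.12:80 check weight 1
    server web3 192.168.0.13:80 check disabled     # kept in the config but taken out of rotation
```

Parameters given on an individual server line override the values set by default-server.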
6. option httpchk
option httpchk <uri>
option httpchk <method> <uri>
option httpchk <method> <uri> <version>
Enables layer 7 health checks over HTTP; by default, health checks are done at the TCP layer.
This check mechanism can also be used in TCP mode
<method> <uri> <version>: the start line of the check request;
The default method is OPTIONS; a 2xx or 3xx status code in the response means success;
Examples:
    # Relay HTTPS traffic to Apache instance and check service availability
    # using HTTP request "OPTIONS /index.html HTTP/1.1" on port 80.
    backend https_relay
        mode tcp
        option httpchk OPTIONS /index.html HTTP/1.1\r\nHost:\ www
        server apache1 192.168.1.1:443 check port 80
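For a plain HTTP backend, the same directive might be used as in the sketch below. The /health URI, the host name, and the server addresses are assumptions; the back-end application would need to expose such a URI.

```haproxy
# Sketch: layer 7 check against a hypothetical /health URI.
backend web_servers
    mode http
    option httpchk GET /health HTTP/1.1\r\nHost:\ www.example.com
    server web1 192.168.0.11:80 check inter 2000 rise 2 fall 3
    server web2 192.168.0.12:80 check inter 2000 rise 2 fall 3
```

The Host header is appended after the version because HTTP/1.1 requests require one; the space inside it is escaped with a backslash.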
7. http-check expect [!] <match> <pattern>
Defines what counts as a valid health check response;
! negates the match, so a matching response is treated as an error; <match> can be:
    status <string>: exact match on the status code
    rstatus <regex>: regular expression match on the status code
    string <string>: exact string search in the response body
    rstring <regex>: regular expression match on the response body
Examples:
    # Only accept status 200 as valid
    http-check expect status 200
    # Consider SQL errors as errors
    http-check expect ! string SQL\ Error
    # Consider status 5xx only as errors
    http-check expect ! rstatus ^5
    # Check that we have a correct hexadecimal tag before /html
    http-check expect rstring <!--tag:[0-9a-f]*</html>
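In context, an expectation sits next to the check it qualifies. The sketch below assumes a hypothetical /health page whose body contains the text "OK" when the application is healthy.

```haproxy
# Sketch: the server counts as healthy only if the check response body contains "OK".
backend web_servers
    mode http
    option httpchk GET /health
    http-check expect string OK
    server web1 192.168.0.11:80 check
    server web2 192.168.0.12:80 check
```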
8. cookie <name> [ rewrite | insert | prefix ] [ indirect ] [ nocache ] [ postonly ] [ preserve ] [ httponly ] [ secure ] [ domain <domain> ]* [ maxidle <idle> ] [ maxlife <life> ]
Enables cookie-based session stickiness; it works together with the cookie parameter set on each server line;
Common form: cookie websrv insert nocache indirect
Example:
    backend websrvs
        balance roundrobin
        cookie websrv insert nocache indirect
        server web1 10.1.0.68:80 check weight 2 maxconn <maxconn> cookie web1
        server web2 10.1.0.69:80 check weight 1 maxconn <maxconn> cookie web2
9. default_backend <backend>
When no use_backend rule matches, requests go to the server group named by default_backend;
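A minimal sketch of a frontend with a fallback backend follows; the names and addresses are hypothetical.

```haproxy
# Sketch: hypothetical frontend with a fallback backend.
frontend web_in
    bind *:80
    default_backend web_servers     # used when no use_backend rule matches

backend web_servers
    server web1 192.168.0.11:80 check
```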
The use of use_backend is explained later, in the ACL section.
3.2.2 Logging
Defines the logging mechanism for a frontend or backend;
log global: use the globally defined logging settings
log <address> [len <length>] <facility> [<level> [<minlevel>]]: custom log target
no log: do not log
capture request header <name> len <length>
    Records the value of the specified request header in the log; len sets the maximum length of the recorded value;
capture response header <name> len <length>
    Records the value of the specified response header in the log; len sets the maximum length of the recorded value;
Example:
    capture request header Referer len 30
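These logging directives could be combined in a frontend roughly as follows. The frontend and backend names are hypothetical, a log target is assumed to exist in the global section, and option httplog (not covered in the text above) is added here so that the log lines use the HTTP format, in which captured headers appear.

```haproxy
# Sketch: names are hypothetical; assumes a log target is defined in the global section.
frontend web_in
    bind *:80
    mode http
    log global
    option httplog                                  # HTTP log format, includes captured headers
    capture request header Host len 40
    capture request header Referer len 30
    capture response header Content-Length len 10
    default_backend web_servers
```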
3.2.3 Custom error pages
- errorfile <code> <file>
<code>: the HTTP status code the page is returned for. 403, 408, 502, 503, and 504 can be used;
<file>: a file whose content is sent in place of HAProxy's own HTTP response;
Example: errorfile 503 /etc/haproxy/errorfiles/503sorry.http
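Because the file is sent as-is, it has to contain a complete raw HTTP response, not just an HTML page. The sketch below shows the directive together with an assumed example of the file's contents (written as comments); the response headers and the page text are illustrative only.

```haproxy
# Sketch: serve a custom page for 503 responses.
defaults
    mode http
    errorfile 503 /etc/haproxy/errorfiles/503sorry.http

# The file holds a full raw HTTP response: status line, headers,
# a blank line, then the body. For example:
#
#   HTTP/1.0 503 Service Unavailable
#   Cache-Control: no-cache
#   Connection: close
#   Content-Type: text/html
#
#   <html><body><h1>Sorry, the site is temporarily unavailable.</h1></body></html>
```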