Rotten mud: learning high-load balancing with haproxy, an introduction to the configuration keywords

This article is provided by ilanniweb with friendly sponsorship and was first published on the author's blog.

In the previous article, we briefly covered installing and setting up haproxy. In this article, we go through the keywords used in the haproxy configuration file one by one.

Follow me ilanniweb

1. Keyword balance

balance defines the load balancing algorithm. It can be used in the defaults, listen, and backend sections.

balance is used as follows:

balance <algorithm> [<arguments>]

balance url_param <param> [check_post [<max_wait>]]

<algorithm>: the algorithm used to select a server when load balancing. It is applied only when no persistence information is available, or when a connection has to be redispatched to another server.

Balance supports the following algorithms:

roundrobin: weighted round robin. This is the smoothest and fairest algorithm when the servers' processing time remains evenly distributed. The algorithm is dynamic, which means that server weights can be adjusted at runtime. By design, however, it is limited to 4128 active servers per backend.

source: the source IP address of the request is hashed and divided by the total weight of the running servers to pick the server that receives the request. Requests from the same client IP address therefore always reach the same server. When the total weight changes, for example because a server goes down or a new server is added, many clients will be directed to a different server than before. This algorithm is generally used to provide stickiness for TCP-based protocols, where cookies cannot be inserted. It is static by default, but this can be changed with hash-type.

static-rr: weighted round robin, similar to roundrobin, but static: changing a server's weight at runtime has no effect. In return, there is no design limit on the number of backend servers.

leastconn: new connections go to the backend server with the fewest connections. This algorithm is recommended for long sessions, such as LDAP and SQL. It is not well suited to application-layer protocols with short sessions, such as HTTP. The algorithm is dynamic, so server weights can be adjusted at runtime.

uri: the left part of the URI (the part before the question mark) or the whole URI is hashed and divided by the total weight of the running servers to pick a server. Requests for the same URI are therefore always sent to the same server, as long as the total weight does not change. This algorithm is often used with proxy caches and anti-virus proxies to increase the cache hit rate. Note that it only applies to HTTP backends. It is static by default, but this can be changed with hash-type.

url_param: the URL parameter named by <param> is looked up in the query string of each HTTP GET request. If the parameter is found and is followed by an equal sign "=" and a value, the value is hashed and divided by the total weight of the running servers to pick a server. This makes it possible to track a user ID in the request and to send all requests carrying the same user ID to the same server, as long as the total weight does not change. If a request does not contain the parameter, or the parameter has no valid value, the round robin algorithm is used instead. This algorithm is static by default, but this can be changed with hash-type.

hdr(<name>): the HTTP header <name> is looked up in each HTTP request. If the header is absent or has no valid value, the round robin algorithm is used instead. The optional "use_domain_only" parameter restricts the hash to the main domain part of headers such as Host (for www.ilanni.com, only the string "ilanni" is hashed), which reduces the work done by the hash algorithm. This algorithm is static by default, but this can be changed with hash-type.
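As a rough illustration of the balance syntax, the sketch below declares three backends that each use a different algorithm. The backend names, server names, addresses, and the "userid" parameter are assumptions made up for this example, not values from the article.

```
# Hypothetical backends; all names and addresses are placeholders.
backend web_rr
    # weighted round robin: web1 receives about twice as many requests as web2
    balance roundrobin
    server web1 192.168.5.171:80 weight 2 check
    server web2 192.168.5.172:80 weight 1 check

backend db_pool
    mode tcp
    # long-lived sessions: pick the server with the fewest connections
    balance leastconn
    server db1 192.168.5.181:3306 check
    server db2 192.168.5.182:3306 check

backend app_by_user
    mode http
    # hash the value of the "userid" URL parameter
    balance url_param userid
    server app1 192.168.5.191:8080 check
    server app2 192.168.5.192:8080 check
```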

2. Keyword bind

bind can only be used in the frontend and listen sections. It defines one or more listening sockets.

bind is used as follows:

bind [<address>]:<port_range> [, ...]

bind [<address>]:<port_range> [, ...] interface <interface>

<address>: optional. It can be a host name, an IPv4 address, an IPv6 address, or *. If it is omitted, or set to * or 0.0.0.0, haproxy listens on all IPv4 addresses of the system.

<port_range>: either a single TCP port or a port range (for example, 5005-5010). The proxy accepts client requests on the specified ports.

Note that each <address:port> listening socket can be used only once on the same instance, and that ports below 1024 require specific privileges to start the program. These privileges are independent of the uid parameter.

<interface>: the name of a physical interface. This is supported on Linux only. An interface alias cannot be used, only the name of a physical interface, and haproxy needs sufficient privileges to bind to it.
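The forms above can be sketched in a single frontend. The frontend name, the addresses, and the interface name eth0 are assumptions chosen for illustration.

```
# Hypothetical frontend; addresses and interface name are placeholders.
frontend www
    # all IPv4 addresses, port 80
    bind *:80
    # a single address with a range of ports
    bind 192.168.5.10:5005-5010
    # several address:port pairs on one line
    bind 192.168.5.10:8080,192.168.5.11:8080
    # port 8443, restricted to one physical interface (Linux only)
    bind :8443 interface eth0
```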

3. Keyword mode

mode sets the running mode, or protocol, of an instance. When content switching is used, the frontend and the backend must work in the same mode (generally HTTP mode). Otherwise, the instance cannot be started.

mode can be used in the listen, defaults, frontend, and backend sections.

mode is used as follows:

mode { tcp | http | health }

tcp: the instance runs in pure TCP mode (Layer 4). A full-duplex connection is established between the client and the server, and no Layer 7 inspection is performed. This is the default mode. It is usually used for SSL, SSH, SMTP, and similar applications.

http: the instance runs in HTTP mode (Layer 7). Client requests are analyzed in depth before being forwarded to a backend server. Any request that is not RFC-compliant is rejected.

health: the instance runs in health mode. It simply replies "OK" to incoming connections and closes them, without logging anything. This mode is meant for answering health checks from external components. It is deprecated, because the monitor keyword in tcp or http mode provides similar functionality.
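A minimal sketch of the two main modes follows. All section names, addresses, and the monitor URI are hypothetical; monitor-uri is shown here as one way to answer health probes in http mode instead of using the deprecated health mode.

```
# Layer 4: relay SSH without inspecting the payload
listen ssh_relay
    bind :2222
    mode tcp
    server ssh1 192.168.5.201:22 check

# Layer 7: the frontend and the backend both run in http mode
frontend www
    bind :80
    mode http
    # answers external health probes on this URI
    monitor-uri /haproxy_alive
    default_backend web

backend web
    mode http
    server web1 192.168.5.171:80 check
```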

4. Keyword hash-type

hash-type specifies the method used to map hashes to backend servers. It cannot be used in frontend sections. The available methods are map-based and consistent. The default, map-based, is recommended for most scenarios.

hash-type is used as follows:

hash-type <method>

map-based: the hash table is a static array containing all online servers. The hash is very smooth and takes server weights into account. However, the method is static: changing the weight of an online server has no effect, which means that slow start is not supported. In addition, a server is selected by its position in the array, so when a server goes down or a new server is added, most connections are reassigned to a different server. This makes the method unsuitable for cache servers.

consistent: the hash table is a tree filled with the servers. The hash key is looked up in the tree and the nearest server is selected. This method is dynamic and supports changing server weights at runtime, so it is compatible with slow start. When a new server is added, only a small number of requests are affected, which makes the method especially suitable when the backend servers are caches. However, the algorithm is not very smooth, and the distribution of requests across servers may be less than ideal. You may therefore need to adjust server weights from time to time to achieve a better balance.
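For a cache farm, the combination described above could look like the following sketch. The backend name, server names, and addresses are assumptions for illustration.

```
# Hypothetical cache farm; names and addresses are placeholders.
backend cache_farm
    mode http
    # hash the URI so that each object lives on one cache
    balance uri
    # adding or removing a cache only remaps a small share of the URIs
    hash-type consistent
    server cache1 192.168.5.211:3128 check
    server cache2 192.168.5.212:3128 check
    server cache3 192.168.5.213:3128 check
```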

5. Keyword log

log enables logging of events and traffic for an instance, and can be used in all sections. At most two log statements can be specified per instance. However, if "log global" is used and the "global" section already defines two log statements, any additional log statements are ignored.

log global

log <address> <facility> [<level> [<minlevel>]]

global: use this form when the instance's logging parameters are the same as those defined in the "global" section. Each instance can contain only one "log global" statement, and the statement takes no additional parameters.

<address>: where the logs are sent. The first format is <IPv4_address:port>, where the port is a UDP port that defaults to 514. The second format is the path to a Unix socket file. In that case, pay attention to chroot and to the read and write permissions of the user haproxy runs as.

<facility>: one of the standard syslog facilities.

<level>: the log level, which acts as an output filter. By default, all messages are sent. When a level is specified, only messages of that level or higher are sent.
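The two log forms might be combined as in the sketch below. The syslog address, the socket path /dev/log, and the facilities are assumptions, and the Unix socket is only reachable if it is visible from inside the chroot.

```
global
    # first target: local syslog daemon over UDP, facility local0
    log 127.0.0.1:514 local0
    # second target: Unix socket, filtered to "notice" and more severe messages
    log /dev/log local1 notice

defaults
    # every instance reuses the two targets declared in "global"
    log global
```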

6. Keyword maxconn

maxconn sets the maximum number of concurrent connections for a frontend, so it cannot be used in backend sections.

maxconn is used as follows:

maxconn <conns>

For large sites, this value can be raised as far as possible so that haproxy manages the connection queue itself, rather than leaving user requests unanswered. The value cannot exceed the one defined in the "global" section. Keep in mind that haproxy maintains two buffers of 8 KB each for every connection. Together with other data, each connection consumes approximately 17 KB of RAM. This means that, after proper tuning, a system with 1 GB of available RAM can sustain tens of thousands of concurrent connections.

If <conns> is set too high, in extreme cases the memory consumed may exceed the memory available on the host, with unpredictable results. It is therefore wise to choose a reasonable value. The default is 2000.
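As a sketch of how the two limits relate, the fragment below sets a global ceiling and a lower per-frontend value. The numbers are assumptions for illustration. At about 17 KB per connection, 10000 connections correspond to roughly 170 MB of RAM.

```
global
    # process-wide ceiling
    maxconn 20000

frontend www
    bind :80
    # per-frontend limit, kept below the global value
    maxconn 10000
```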

7. Keyword default_backend

default_backend specifies the backend to use when no use_backend rule matches. It therefore cannot be used in backend sections. When content switching is done between a frontend and several backends, use_backend rules usually define the matching conditions, and any request that matches none of them is handled by the backend named here.

default_backend is used as follows:

default_backend <backend>

<backend>: the name of the backend to use.

Use case:

use_backend dynamic if url_dyn

use_backend static if url_css url_img

default_backend dynamic
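The use case relies on ACLs that the article does not define. A more complete sketch might look as follows, with the ACL definitions, backend contents, and addresses being assumptions. Note that ACLs listed side by side in a condition are ANDed, so this sketch uses "||" to send a request to the static backend when either ACL matches.

```
frontend www
    bind :80
    mode http
    # hypothetical ACL definitions
    acl url_dyn path_end .php
    acl url_css path_beg /css
    acl url_img path_beg /img
    use_backend dynamic if url_dyn
    use_backend static if url_css || url_img
    # anything that matched no rule
    default_backend dynamic

backend dynamic
    mode http
    server app1 192.168.5.171:8080 check

backend static
    mode http
    server img1 192.168.5.172:80 check
```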

8. Keyword server

server declares a server in a backend. It therefore cannot be used in the defaults and frontend sections.

server is used as follows:

server <name> <address>[:port] [param*]

<name>: the internal name assigned to this server. It appears in logs and alerts. If http-send-name-header is set, the name is also added to the request headers sent to this server.

<address>: the IPv4 address of this server. A host name can be used instead, but it is resolved to an IPv4 address only once, at startup.

[:port]: optional. The target port for connections sent to this server. If it is not set, the port the client connected to is used.

[param*]: a list of parameters for this server. Many parameters are available. For details, refer to the official documentation. Only the most common ones are described below.

Server or default server parameters:

backup: marks the server as a backup server. It is only used when all the other servers in the load-balancing farm are unavailable.

check: enables health checks for this server. The checks can be tuned with further parameters, such as:

inter <delay>: the interval between health checks, in milliseconds. The default is 2000. fastinter and downinter can also be used to adapt the delay to the server's state.

rise <count>: the number of consecutive successful health checks required before an offline server is considered operational again.

fall <count>: the number of consecutive failed health checks required before an operational server is considered unavailable.

cookie <value>: sets the cookie value assigned to this server. The value is checked in incoming requests, and the server that was selected for that value the first time is selected again for subsequent requests. This provides persistence.

maxconn <maxconn>: the maximum number of concurrent connections this server accepts. Connections beyond this number are placed in the request queue until other connections are released.

maxqueue <maxqueue>: the maximum length of the request queue.

observe <mode>: determines the server's health by observing the traffic exchanged with it. It is disabled by default. The supported modes are "layer4" and "layer7". "layer7" can only be used with HTTP proxies.

A standard use case is as follows:

server web1 192.168.5.171:8080 maxconn 1024 weight 3 check inter 2000 rise 2 fall 3

The parameters above are the ones most frequently used in [param*].

redir <prefix>: enables redirection mode. GET and HEAD requests sent to this server are answered with a 302 status code. Note that the prefix must not end with a slash "/", and that an absolute address rather than a relative one must be used in order to avoid loops. For example:

server srv1 192.168.5.174:80 redir http://imags.ilanni.com check

weight <weight>: the server's weight. The default is 1 and the maximum is 256. A weight of 0 means the server does not take part in load balancing.
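Several of these parameters can be combined as in the following sketch. The backend name, the cookie name SERVERID, and the second and third server addresses are assumptions. The default-server line is used here only to avoid repeating the check timings on every server.

```
# Hypothetical backend; names and addresses are placeholders.
backend app_servers
    mode http
    balance roundrobin
    # haproxy inserts a cookie that carries the per-server value below
    cookie SERVERID insert indirect nocache
    # check timings shared by all servers of this backend
    default-server inter 2000 rise 2 fall 3
    server web1 192.168.5.171:8080 cookie web1 weight 3 maxconn 1024 maxqueue 256 check
    server web2 192.168.5.172:8080 cookie web2 weight 1 maxconn 1024 maxqueue 256 check
    # only used when web1 and web2 are both down
    server spare 192.168.5.173:8080 backup check
```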

Health check method:

option httpchk

option httpchk <uri>

option httpchk <method> <uri>

option httpchk <method> <uri> <version>

option httpchk enables HTTP-based health checks. It cannot be used in frontend sections.

For example:

backend https_relay

    mode tcp

    option httpchk OPTIONS * HTTP/1.1\r\nHost:\ www.ilanni.com

    server apache1 192.168.1.1:443 check port 80

9. Keyword stats enable

stats enable enables statistics reporting with the default settings defined at build time. It cannot be used in frontend sections. Unless other settings are given, the following configuration is used:

stats uri: /haproxy?stats

stats realm: "HAProxy Statistics"

stats auth: no authentication

stats scope: no restriction

Although stats enable is enough to turn on statistics reporting on its own, it is recommended to set all the other parameters explicitly rather than rely on defaults that may have unexpected consequences. The following is a configuration example.

backend public_www

    server websrv1 192.168.5.171:80

    stats enable

    stats hide-version

    stats scope .

    stats uri /stats

    stats realm Haproxy\ Statistics

    stats auth admin:password

10. Keyword stats hide-version

stats hide-version enables statistics reporting and hides the haproxy version in the report. It cannot be used in frontend sections. By default, the statistics page displays some useful information, including the haproxy version number.

However, disclosing the exact haproxy version to everyone is risky, because it helps malicious users quickly identify the defects and vulnerabilities of that version.

11. Keyword stats realm

stats realm enables statistics reporting and sets the authentication realm. It cannot be used in frontend sections.

stats realm <realm>

haproxy reads the realm as a single word, so any space inside it must be escaped with a backslash. This parameter only takes effect when it is used together with stats auth.

<realm>: the realm name shown by the browser during HTTP Basic authentication, when it prompts the user for a user name and password.

12. Keyword stats scope

stats scope enables statistics reporting and restricts the report to specific sections. It cannot be used in frontend sections.

When this statement is specified, the report only shows the listed sections, and all other sections are hidden. The statement can be repeated to include several sections. Note that section names are only compared as strings. haproxy does not check whether the named section actually exists.

stats scope { <name> | "." }

<name>: the name of a listen, frontend, or backend section. "." stands for the section in which the stats scope statement appears.
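As a sketch, a dedicated statistics instance that reports on itself and on one other section could be declared as follows. The instance name admin_stats, the listening address, and the section name public_www are assumptions.

```
# Hypothetical statistics instance.
listen admin_stats
    bind 127.0.0.1:8404
    mode http
    stats enable
    stats uri /stats
    # one "stats scope" line per section to display
    stats scope public_www
    # "." is this section, admin_stats
    stats scope .
```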

13. Keyword stats auth

stats auth enables statistics reporting with authentication and grants access to a user account. It cannot be used in frontend sections.

stats auth <user>:<passwd>

<user>: the user name authorized for access.

<passwd>: the user's password, in plaintext.

This statement enables statistics reporting with the default settings and restricts access to the users it declares. It can be repeated to authorize several accounts. The "stats realm" parameter can be used to describe the realm shown in the authentication prompt. An unauthorized user who tries to access the statistics receives a 401 response. Authentication uses HTTP Basic authentication, so passwords travel over the network in plaintext. For that reason the configuration file also stores them in plaintext, as a reminder that they are not confidential and must not be shared with any other important account.
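A minimal sketch with two authorized accounts and an explicit realm might look as follows. The account names and passwords are placeholders made up for this example.

```
backend public_www
    mode http
    server websrv1 192.168.5.171:80
    stats enable
    stats uri /stats
    # the space in the realm is escaped with a backslash
    stats realm Haproxy\ Statistics
    # one "stats auth" line per authorized account (placeholder credentials)
    stats auth admin:password
    stats auth operator:another-password
```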

14. Keyword stats admin

stats admin enables the administration features of the statistics page when the specified condition is met. It lets you enable or disable servers through the web interface. For security reasons, however, the statistics page should be kept read-only whenever possible. In addition, if haproxy runs in multi-process mode, enabling the administration features may cause abnormal behavior.

stats admin { if | unless } <cond>

Currently, the POST request is limited to the buffer size minus the reserved space, so the server list cannot be too long or the request will not be processed correctly. It is therefore recommended to change only a few servers at a time. Two examples follow. The first enables the administration features only when the statistics page is opened from the local machine. The second allows only authenticated users to use them.

backend stats_localhost

    stats enable

    stats admin if LOCALHOST

backend stats_auth

    stats enable

    stats auth admin:password

    stats admin if TRUE

15. Keyword option httplog

option httplog enables logging of HTTP requests, session states, and timers.

option httplog [clf]

clf: uses the CLF format instead of haproxy's default HTTP format. This is usually needed only when the logs are processed by an analyzer that understands nothing but CLF.

By default, the log format is very terse: it contains only the source address, the destination address, and the instance name. "option httplog" makes the format much richer. It includes, among other things, the HTTP request, the connection timers, the session state, the connection counts, the captured headers and cookies, the "frontend", "backend", and server names, and of course the source address and port.
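A small sketch of where the option usually goes is shown below. The syslog address and facility are assumptions. The commented line shows the CLF variant for sections that feed a CLF-only analyzer.

```
global
    log 127.0.0.1 local0

defaults
    mode http
    log global
    # richer HTTP log lines for every instance
    option httplog

# CLF variant, to be used instead of the line above where needed:
#     option httplog clf
```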

16. Keyword option logasap

option logasap

option logasap enables early logging of HTTP requests. It cannot be used in backend sections.

By default, an HTTP request is logged when it ends, so that the total transfer time and the number of bytes can be written to the log. When a large object is transferred, the log entry may therefore appear with a noticeable delay. With option logasap, the request is logged as soon as the server has sent the complete headers, but the total transfer time and the number of bytes then no longer cover the whole transfer. In this case, it is a good idea to capture the Content-Length response header so that the size of the transfer is still recorded. The following is an example.

listen http_proxy 0.0.0.0:80

    mode http

    option httplog

    option logasap

    log 172.16.100.9 local2
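The capture of the Content-Length response header mentioned above could be sketched as follows. The capture length of 10 characters is an assumption, and the listening address is declared with a bind line here.

```
listen http_proxy
    bind 0.0.0.0:80
    mode http
    option httplog
    option logasap
    # record the announced response size, since the logged byte count is partial
    capture response header Content-Length len 10
    log 172.16.100.9 local2
```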

17. Keyword option forwardfor

option forwardfor inserts the "X-Forwarded-For" header into the requests sent to the server. In other words, it lets the server obtain the real IP address of the client.

option forwardfor [ except <network> ] [ header <name> ] [ if-none ]

<network>: optional. When specified, the header is not added for requests whose source address belongs to this network.

<name>: optional. A custom header name, such as "X-Client", used instead of "X-Forwarded-For". Some specific web servers do require a header of their own.

if-none: the header is added to the request only when it is not already present.

haproxy works as a reverse proxy, so the client address the server sees in each request is the address of the haproxy host rather than that of the real client. As a result, the server logs cannot record the real source of the requests. The "X-Forwarded-For" header solves this problem: haproxy can add it to each request sent to the server, with the client IP address as its value.

Note that when haproxy works in tunnel mode it only inspects the first request of each connection, so only the first request gets the header. To have the header added to every request, make sure that at least one of "option httpclose", "option forceclose", or "option http-server-close" is used.

The following is an example.

frontend www

    mode http

    option forwardfor except 127.0.0.1
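The optional arguments can be combined, as in this sketch. The header name X-Client comes from the text above, while the backend, the server address, and the excluded network 127.0.0.0/8 are assumptions.

```
frontend www
    bind :80
    mode http
    # close the server side after each response so that every request is inspected
    option http-server-close
    # skip local traffic, use a custom header, and keep a value that is already set
    option forwardfor except 127.0.0.0/8 header X-Client if-none
    default_backend web

backend web
    mode http
    server web1 192.168.5.171:80 check
```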

18. Keyword errorfile

errorfile returns the contents of a file to the client, instead of the error message generated by haproxy, when haproxy has to report an error. It can be used in all sections.

errorfile <code> <file>

<code>: the HTTP status code for which the file is returned. The available status codes are 200, 400, 403, 408, 500, 502, 503, and 504.

<file>: the file used for the response.

For example:

errorfile 400 /etc/haproxy/errorpages/400badreq.http

errorfile 403 /etc/haproxy/errorpages/403forbid.http

errorfile 503 /etc/haproxy/errorpages/503sorry.http

19. Keyword errorloc and errorloc302

errorloc <code> <url>

errorloc302 <code> <url>

When a request results in an error, an HTTP redirection to a URL is returned to the client. These keywords can be used in all sections.

<code>: the HTTP status code for which the redirection is returned. The available status codes are 200, 400, 403, 408, 500, 502, 503, and 504.

<url>: the exact content of the Location header. It can be a relative path to a page on the same server or an absolute URL. Note that if the URL itself triggers the same status code, the client may end up in a redirection loop.

Note that both keywords return a 302 status code, which tells the client to fetch the specified URL with the same HTTP method. This is a problem for non-GET methods such as POST, because the URL returned to the client may not accept any method other than GET. If this is an issue, use errorloc303, which returns a 303 status code to the client.

20. Keyword errorloc303

errorloc303 <code> <url>

When a request results in an error, an HTTP redirection to a URL is returned to the client. This keyword can be used in all sections.

<code>: the HTTP status code for which the redirection is returned. The available status codes are 400, 403, 408, 500, 502, 503, and 504.

For example:

backend webserver

    server 172.16.100.6 172.16.100.6:80 check maxconn 3000 cookie srv01

    server 172.16.100.7 172.16.100.7:80 check maxconn 3000 cookie srv02

    errorloc 403 /etc/haproxy/errorpages/sorry.htm

    errorloc 503 /etc/haproxy/errorpages/sorry.htm
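The same backend could use errorloc303 so that the client always fetches the error page with GET. This is only a sketch: the host static.example.com is a hypothetical placeholder, chosen as an absolute URL on a separate host so that the error page cannot trigger the same error again.

```
backend webserver
    mode http
    server 172.16.100.6 172.16.100.6:80 check maxconn 3000
    server 172.16.100.7 172.16.100.7:80 check maxconn 3000
    # 303 tells the client to fetch the page with GET, even after a POST
    errorloc303 503 http://static.example.com/sorry.htm
```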

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
