Rate Limiting Techniques for High-Concurrency Systems, Part 2

Source: Internet
Author: User
Tags: error status code, Lua, sleep

The previous article, "Rate Limiting Techniques for High-Concurrency Systems", introduced rate-limiting algorithms, application-level rate limiting, and distributed rate limiting. This article describes rate limiting at the access layer.

Access-Layer Rate Limiting

The access layer is usually the entry point for request traffic. Its main responsibilities include load balancing, filtering illegal requests, request aggregation, caching, degradation, rate limiting, A/B testing, and quality-of-service monitoring; for background, see the author's earlier article "Using Nginx+Lua (OpenResty) to Develop High-Performance Web Applications".

For rate limiting at the Nginx access layer, two bundled modules can be used: ngx_http_limit_conn_module, which limits the number of connections, and ngx_http_limit_req_module, which limits the request rate with a leaky-bucket algorithm. More complex rate-limiting scenarios can use the Lua rate-limiting library lua-resty-limit-traffic provided by OpenResty.

limit_conn limits the total number of network connections for a given key, such as an IP address or a domain name. limit_req limits the average request rate for a key, and has two modes: smooth (delay) and burst-allowing (nodelay).

ngx_http_limit_conn_module

limit_conn limits the total number of network connections for a given key. You can limit total connections per IP address, or per server domain name. Note that not every connection is counted: only requests that Nginx is processing and whose request header has been fully read are counted.

Example configuration:

================================

http {
    limit_conn_zone $binary_remote_addr zone=addr:10m;
    limit_conn_log_level error;
    limit_conn_status 503;
    ...
    server {
        ...
        location /limit {
            limit_conn addr 1;
        }
    }
}

================================

limit_conn: specifies the shared memory zone (key and counters) and the maximum number of connections for the key. Here the maximum is 1, meaning Nginx handles at most 1 concurrent connection per key;

limit_conn_zone: configures the rate-limit key and the size of the shared memory zone that holds per-key state. Here the key is "$binary_remote_addr", the client IP address; a key such as $server_name could be used instead to limit total connections at the domain level;

limit_conn_status: configures the status code returned when a request is limited; the default is 503;

limit_conn_log_level: configures the log level for rate-limit events; the default is error.

The main execution flow of limit_conn is as follows:

1. When a request arrives, first check whether the number of connections for its key in limit_conn_zone already exceeds the configured maximum;

2.1. If the maximum is exceeded, the request is limited and the status code defined by limit_conn_status is returned;

2.2. Otherwise, the key's connection count is incremented by 1, and a request-completion callback is registered;

3. The request is processed;

4. In the request-finalization phase, the registered callback decrements the key's connection count by 1.
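The flow above amounts to a per-key counter in shared memory: increment on admission, decrement from a completion callback. A minimal Python sketch of that bookkeeping (class and method names are illustrative, not Nginx's):

```python
class ConnLimiter:
    """Per-key concurrent-connection counter, mirroring limit_conn's accounting."""

    def __init__(self, max_conn):
        self.max_conn = max_conn
        self.counts = {}  # shared-memory zone analogue: key -> active connections

    def try_acquire(self, key):
        """Admit a request, or refuse it (Nginx would return limit_conn_status)."""
        if self.counts.get(key, 0) >= self.max_conn:
            return False
        self.counts[key] = self.counts.get(key, 0) + 1
        return True

    def release(self, key):
        """The request-finished callback: decrement and clean up the key."""
        self.counts[key] -= 1
        if self.counts[key] == 0:
            del self.counts[key]
```

With `max_conn=1`, a second concurrent request for the same IP is refused until the first one releases its slot.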

limit_conn can limit the total number of concurrent connections/requests for a key, and the key can be varied as needed.

Example: limiting concurrent connections per IP.

First define the rate-limit zone for the IP dimension:

================================

limit_conn_zone $binary_remote_addr zone=perip:10m;

================================

Then add the rate-limiting directive to the location you want to limit:

================================

location /limit {
    limit_conn perip 2;
    echo "123";
}

================================

That is, the maximum number of concurrent connections allowed per IP is 2.

Testing with the ab benchmark tool at concurrency 5 with 5 total requests:

================================

ab -n 5 -c 5 http://localhost/limit

================================

The following access.log output will be obtained:

================================

[08/Jun/2016:20:10:51 +0800] [1465373451.802] 200
[08/Jun/2016:20:10:51 +0800] [1465373451.803] 200
[08/Jun/2016:20:10:51 +0800] [1465373451.803] 503
[08/Jun/2016:20:10:51 +0800] [1465373451.803] 503
[08/Jun/2016:20:10:51 +0800] [1465373451.803] 503

================================

Here the access log format is set to log_format main '[$time_local] [$msec] $status', i.e. "local time, millisecond timestamp, response status code".

When a request is rejected by the limit, you will see entries like the following in error.log:

================================

2016/06/08 20:10:51 [error] 5662#0: *5 limiting connections by zone "perip", client: 127.0.0.1, server: _, request: "GET /limit HTTP/1.0", host: "localhost"

================================

Example: limiting concurrent connections per domain name.

First define the rate-limit zone for the domain-name dimension:

================================

limit_conn_zone $server_name zone=perserver:10m;

================================

Then add the rate-limiting directive to the location you want to limit:

================================

location /limit {
    limit_conn perserver 2;
    echo "123";
}

}

================================

That is, each domain name is allowed at most 2 concurrent connections, so this configuration caps the total connections for the server.

ngx_http_limit_req_module

limit_req implements a leaky-bucket algorithm to limit the request rate for a specified key, for example limiting the request rate per IP address.

Example configuration:

================================

http {
    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
    limit_req_log_level error;
    limit_req_status 503;
    ...
    server {
        ...
        location /limit {
            limit_req zone=one burst=5 nodelay;
        }
    }
}

================================

limit_req: configures the rate-limit zone, the bucket capacity (burst, default 0), and whether to delay (the default is delay mode);

limit_req_zone: configures the rate-limit key, the size of the shared memory zone that holds per-key state, and the fixed request rate. Here the key is "$binary_remote_addr", the client IP address. The rate parameter supports forms such as 10r/s and 60r/m, i.e. 10 requests per second or 60 requests per minute; either way it is ultimately converted to a fixed per-request interval (10r/s means one request is processed every 100 milliseconds; 60r/m means one every 1000 milliseconds).

limit_req_status: configures the status code returned when a request is limited; the default is 503;

limit_req_log_level: configures the log level for rate-limit events; the default is error.
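The rate conversion described above is simple arithmetic; a hypothetical helper (not part of Nginx) makes it concrete:

```python
def interval_ms(rate):
    """Convert an Nginx-style rate string ('10r/s' or '60r/m')
    to the fixed per-request interval in milliseconds."""
    n, unit = rate.split("r/")           # e.g. '10r/s' -> ('10', 's')
    per_second = int(n) if unit == "s" else int(n) / 60
    return 1000 / per_second

# 10r/s -> one request every 100 ms; 60r/m -> one every 1000 ms
```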

The main execution flow of limit_req is as follows:

1. When a request arrives, determine from the time of the last request (0 for the first request) relative to now whether this request must be limited. If so, go to step 2; otherwise go to step 3;

2.1. If no bucket capacity is configured (burst is 0), requests are processed strictly at the fixed rate; a request that exceeds the rate is limited and the configured error code (503 by default) is returned immediately;

2.2. If a bucket capacity is configured (burst > 0) in delay mode (no nodelay): when the bucket is full, new incoming requests are limited; when it is not full, the request is queued and processed at the fixed average rate (the delay is implemented with a timed sleep);

2.3. If a bucket capacity is configured (burst > 0) in non-delay mode (nodelay configured): requests are not spaced out to the fixed rate but processed as a burst; once the bucket is full, new requests are limited and the error code is returned immediately;

3. If the request is not limited, it is processed normally;

4. At appropriate times, Nginx expires some (up to 3) rate-limit keys and reclaims their memory.
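The flow above is leaky-bucket accounting over an "excess" counter. A rough Python sketch of that bookkeeping (an assumed simplification; Nginx keeps these values in shared memory, in millisecond units):

```python
class LeakyBucketLimiter:
    """Simplified limit_req-style leaky bucket.
    rate: requests per second; burst: bucket capacity."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.excess = 0.0  # requests currently "in the bucket"
        self.last = None   # timestamp of the previous accepted request

    def incoming(self, now):
        """Returns (delay_seconds, accepted). delay > 0 means the request
        should wait that long in delay mode (nodelay would skip the wait)."""
        if self.last is None:          # first request: no reference value yet
            self.last = now
            return 0.0, True
        elapsed = now - self.last
        # the bucket drains at the fixed rate; this request adds 1
        excess = max(self.excess - elapsed * self.rate + 1, 0.0)
        if excess > self.burst:
            return None, False         # bucket full: reject (503)
        self.excess = excess
        self.last = now
        return excess / self.rate, True
```

For rate=2r/s and burst=3, four simultaneous requests are accepted with delays of 0, 0.5, 1.0, and 1.5 seconds, and the fifth is rejected.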

Scenario 2.1 Test

First define the rate-limit zone for the IP dimension:

================================

limit_req_zone $binary_remote_addr zone=test:10m rate=500r/s;

================================

This limits the rate to 500 requests per second, i.e. a fixed average rate of one request every 2 milliseconds.

Then add the rate-limiting directive to the location you want to limit:

================================

location /limit {
    limit_req zone=test;
    echo "123";
}

}

================================

That is, the bucket capacity is 0 (burst defaults to 0), in delay mode.

Testing with the ab tool at concurrency 2 with 10 total requests:

================================

ab -n 10 -c 2 http://localhost/limit

================================

The following access.log output will be obtained:

================================

[08/Jun/2016:20:25:56 +0800] [1465381556.410] 200
[08/Jun/2016:20:25:56 +0800] [1465381556.410] 503
[08/Jun/2016:20:25:56 +0800] [1465381556.411] 503
[08/Jun/2016:20:25:56 +0800] [1465381556.411] 200
[08/Jun/2016:20:25:56 +0800] [1465381556.412] 503
[08/Jun/2016:20:25:56 +0800] [1465381556.412] 503

================================

Although 500 requests per second are allowed, because the bucket capacity is 0 an incoming request is either processed immediately or rejected; none can be deferred, and successful requests are spaced at roughly the 2-millisecond average interval. For example, the requests at 1465381556.410 and 1465381556.411 were both processed; some readers may object that the interval is 1 millisecond rather than 2, but this is simply because the implementation's arithmetic is not perfectly precise.

When a request is limited, error.log shows entries like the following:

================================

2016/06/08 20:25:56 [error] 6130#0: *1962 limiting requests, excess: 1.000 by zone "test", client: 127.0.0.1, server: _, request: "GET /limit HTTP/1.0", host: "localhost"

================================

If a request is delayed instead, error.log shows the following (with the log level set to info):

================================

2016/06/10 09:05:23 [warn] 9766#0: *97021 delaying request, excess: 0.368, by zone "test", client: 127.0.0.1, server: _, request: "GET /limit HTTP/1.0", host: "localhost"

================================

Scenario 2.2 Test

First define the rate-limit zone for the IP dimension:

================================

limit_req_zone $binary_remote_addr zone=test:10m rate=2r/s;

================================

To make testing easier, the rate is set to 2 requests per second, i.e. a fixed average rate of one request every 500 milliseconds.

Then add the rate-limiting directive to the location you want to limit:

================================

location /limit {
    limit_req zone=test burst=3;
    echo "123";
}

}

================================

The fixed average rate is one request per 500 milliseconds and the bucket capacity is 3. When the bucket is full, new requests are rejected; otherwise they queue in the bucket and wait (delay mode).

To observe the rate-limiting effect, we wrote a req.sh script:

================================

ab -c 6 -n 6 http://localhost/limit
sleep 0.3
ab -c 6 -n 6 http://localhost/limit

================================

It first issues 6 requests at concurrency 6, sleeps 300 milliseconds, then issues another 6 requests at concurrency 6. The sleep in the middle lets the effect straddle two seconds; if you do not see the output below, adjust the sleep time.

The following access.log output will be obtained:

================================

[09/Jun/2016:08:46:43 +0800] [1465433203.959] 200
[09/Jun/2016:08:46:43 +0800] [1465433203.959] 503
[09/Jun/2016:08:46:43 +0800] [1465433203.960] 503
[09/Jun/2016:08:46:44 +0800] [1465433204.450] 200
[09/Jun/2016:08:46:44 +0800] [1465433204.950] 200
[09/Jun/2016:08:46:45 +0800] [1465433205.453] 200
[09/Jun/2016:08:46:45 +0800] [1465433205.766] 503
[09/Jun/2016:08:46:45 +0800] [1465433205.766] 503
[09/Jun/2016:08:46:45 +0800] [1465433205.767] 503
[09/Jun/2016:08:46:45 +0800] [1465433205.950] 200
[09/Jun/2016:08:46:46 +0800] [1465433206.451] 200
[09/Jun/2016:08:46:46 +0800] [1465433206.952] 200

================================

The bucket capacity is 3, i.e. at most 3 requests can wait in the bucket within the time window, while requests are processed at the fixed rate of 2r/s (one request every 500 milliseconds). The bucket's time window (1.5 seconds) = bucket capacity (3) / rate (2r/s). In other words, we look back 1.5 seconds from the current time to count the requests in the window; and because delay mode is the default, requests within the window are staged in the bucket and processed at the fixed average rate:

First round: 4 requests succeed, although the bucket capacity suggests at most 3. This is an artifact of the algorithm: the very first calculation has no reference value, so the extra first success can be ignored; the discrepancy is small enough to disregard. The remaining requests are processed at the fixed 500-millisecond interval.

Second round: the first round's requests arrived as a burst, almost all at 1465433203.959, and the leaky bucket smoothed them to the fixed average rate (one request per 500 milliseconds). The second burst arrived at about 1465433205.766, so the window is computed between 1465433203.959 and 1465433205.766; by 1465433205.766 the bucket had drained, so 3 requests could flow into the bucket and the rest were rejected. Because the first round's last request was processed at 1465433205.453, the second round's first delayed request was processed at 1465433205.950. Note also that the observed average rate only hovers around the configured rate; computational precision introduces some deviation.
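The first-round arithmetic can be double-checked with a tiny simulation of leaky-bucket "excess" accounting (an illustrative simplification of what limit_req does): six simultaneous requests against rate=2r/s, burst=3 yield four 200s (one immediate, three queued) and two 503s, matching the log above.

```python
def simulate(times, rate, burst):
    """Classify requests at the given timestamps as 200 (accepted) or 503
    (rejected) under simplified leaky-bucket accounting."""
    excess, last, results = 0.0, None, []
    for t in times:
        if last is None:              # first request: no reference value
            last = t
            results.append("200")
            continue
        e = max(excess - (t - last) * rate + 1, 0.0)
        if e > burst:
            results.append("503")     # bucket full: reject
        else:
            excess, last = e, t
            results.append("200")     # accepted; delayed by e/rate in delay mode
    return results
```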

If the bucket capacity is changed to 1 (burst=1) and req.sh is run again, the output looks like this:

================================

[09/Jun/2016:09:04:30 +0800] [1465434270.362] 200
[09/Jun/2016:09:04:30 +0800] [1465434270.371] 503
[09/Jun/2016:09:04:30 +0800] [1465434270.372] 503
[09/Jun/2016:09:04:30 +0800] [1465434270.372] 503
[09/Jun/2016:09:04:30 +0800] [1465434270.372] 503
[09/Jun/2016:09:04:30 +0800] [1465434270.864] 200
[09/Jun/2016:09:04:31 +0800] [1465434271.178] 503
[09/Jun/2016:09:04:31 +0800] [1465434271.178] 503
[09/Jun/2016:09:04:31 +0800] [1465434271.178] 503
[09/Jun/2016:09:04:31 +0800] [1465434271.178] 503
[09/Jun/2016:09:04:31 +0800] [1465434271.179] 503
[09/Jun/2016:09:04:31 +0800] [1465434271.366] 200

================================

With bucket capacity 1, only one request can wait in the bucket at a time, and successful requests are still processed at the fixed average rate of one request per 500 milliseconds, as the log timestamps show.

Scenario 2.3 Test

First define the rate-limit zone for the IP dimension:

================================

limit_req_zone $binary_remote_addr zone=test:10m rate=2r/s;

================================

For convenience the rate is again 2 requests per second, i.e. a fixed average rate of one request every 500 milliseconds.

Then add the rate-limiting directive to the location you want to limit:

================================

location /limit {
    limit_req zone=test burst=3 nodelay;
    echo "123";
}

}

================================

The bucket capacity is 3 and nodelay is configured: when the bucket is full, new requests are rejected outright; otherwise requests are admitted and processed immediately as a burst, while the bucket drains at the fixed rate of one request per 500 milliseconds (at most 2 per second).

To observe the rate-limiting effect, we wrote a req.sh script:

================================

ab -c 6 -n 6 http://localhost/limit
sleep 1
ab -c 6 -n 6 http://localhost/limit
sleep 0.3
ab -c 6 -n 6 http://localhost/limit
sleep 0.3
ab -c 6 -n 6 http://localhost/limit
sleep 0.3
ab -c 6 -n 6 http://localhost/limit
sleep 2
ab -c 6 -n 6 http://localhost/limit

================================

An access.log output similar to the following will be obtained:

================================

[09/Jun/2016:14:30:11 +0800] [1465453811.754] 200
[09/Jun/2016:14:30:11 +0800] [1465453811.755] 200
[09/Jun/2016:14:30:11 +0800] [1465453811.755] 200
[09/Jun/2016:14:30:11 +0800] [1465453811.759] 200
[09/Jun/2016:14:30:11 +0800] [1465453811.759] 503
[09/Jun/2016:14:30:11 +0800] [1465453811.759] 503
[09/Jun/2016:14:30:12 +0800] [1465453812.776] 200
[09/Jun/2016:14:30:12 +0800] [1465453812.776] 200
[09/Jun/2016:14:30:12 +0800] [1465453812.776] 503
[09/Jun/2016:14:30:12 +0800] [1465453812.777] 503
[09/Jun/2016:14:30:12 +0800] [1465453812.777] 503
[09/Jun/2016:14:30:12 +0800] [1465453812.777] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.095] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.097] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.097] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.097] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.097] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.098] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.425] 200
[09/Jun/2016:14:30:13 +0800] [1465453813.425] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.425] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.426] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.426] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.426] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.754] 200
[09/Jun/2016:14:30:13 +0800] [1465453813.755] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.755] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.756] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.756] 503
[09/Jun/2016:14:30:13 +0800] [1465453813.756] 503
[09/Jun/2016:14:30:15 +0800] [1465453815.278] 200
[09/Jun/2016:14:30:15 +0800] [1465453815.278] 200
[09/Jun/2016:14:30:15 +0800] [1465453815.278] 200
[09/Jun/2016:14:30:15 +0800] [1465453815.278] 503
[09/Jun/2016:14:30:15 +0800] [1465453815.279] 503
[09/Jun/2016:14:30:15 +0800] [1465453815.279] 503
[09/Jun/2016:14:30:17 +0800] [1465453817.300] 200
[09/Jun/2016:14:30:17 +0800] [1465453817.300] 200
[09/Jun/2016:14:30:17 +0800] [1465453817.300] 200
[09/Jun/2016:14:30:17 +0800] [1465453817.301] 200
[09/Jun/2016:14:30:17 +0800] [1465453817.301] 503
[09/Jun/2016:14:30:17 +0800] [1465453817.301] 503

================================

The bucket capacity is 3 (at most 3 requests in the time window) and the bucket drains at the fixed rate of 2r/s (one request per 500 milliseconds); the time window (1.5 seconds) = bucket capacity (3) / rate (2r/s). We therefore look back 1.5 seconds from the current time to count the requests in the window. Because nodelay is configured (non-delay mode), bursts within the window are processed immediately rather than spaced out. Two questions arise from this example:

First and seventh rounds: 4 requests succeed instead of 3. As before, this is an artifact of the algorithm: after a quiet stretch of about 2 seconds with no requests, a sudden burst makes the first calculation slightly off. The discrepancy is small enough to ignore.

Fifth round: looking back 1 second, 3 requests are counted, yet 1 more request is still allowed; this is another consequence of the algorithm's limited computational precision, which means limit_req is not perfectly exact around window boundaries.

If returning a bare error when a request is limited is undesirable, you can configure an error page:

================================

proxy_intercept_errors on;
recursive_error_pages on;
error_page 503 //www.jd.com/error.aspx;

================================

If the shared memory defined by limit_conn_zone/limit_req_zone runs out, subsequent requests will always be limited, so the zone size must be set according to expected traffic.

The limits above apply to a single Nginx instance. If we have multiple Nginx servers at the access layer, we face the same problem as with application-level rate limiting. How do we deal with it? One solution: have the load-balancing layer apply consistent hashing on the rate-limit key, so that requests with the same key always land on the same access-layer Nginx. Another solution: implement distributed rate-limiting logic with Nginx+Lua (OpenResty).
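The first solution—routing by a consistent hash of the rate-limit key—can be sketched as a simple hash ring (node names here are made up for illustration):

```python
import hashlib
from bisect import bisect

class HashRing:
    """Minimal consistent-hash ring: the same key always maps to the same node,
    and adding or removing a node only remaps a fraction of the keys."""

    def __init__(self, nodes, vnodes=100):
        # place several virtual points per node on the ring to even out load
        self.ring = sorted(
            (self._hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        # first ring point clockwise from the key's hash
        i = bisect(self.keys, self._hash(key)) % len(self.keys)
        return self.ring[i][1]
```

A load balancer using this scheme would always send requests keyed by, say, client IP to the same backend Nginx, so each instance's local counters stay consistent for that key.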

lua-resty-limit-traffic

The two modules introduced so far are simple to use: specify the key, the rate, and so on. But if we need dynamic behavior—changing the key, the rate, or the bucket size according to the actual situation—the standard modules are hard to bend, and we need something programmable. OpenResty provides the Lua rate-limiting library lua-resty-limit-traffic, which supports dynamic rate limiting driven by more complex business logic. It provides limit.conn and limit.req implementations, and the algorithms match Nginx's limit_conn and limit_req.

Here we re-implement the "scenario 2.2 test" from ngx_http_limit_req_module; don't forget to download the lua-resty-limit-traffic module and place it on OpenResty's lualib path.

First configure a shared dictionary to hold the rate-limit state:

================================

lua_shared_dict limit_req_store 100m;

================================

The following rate-limiting script, limit_req.lua, implements the scenario 2.2 test:

================================

local limit_req = require "resty.limit.req"

local rate = 2           -- fixed average rate: 2r/s
local burst = 3          -- bucket capacity
local error_status = 503
local nodelay = false    -- whether to skip delayed processing

local lim, err = limit_req.new("limit_req_store", rate, burst)
if not lim then          -- e.g. the shared dictionary is not defined
    ngx.exit(error_status)
end

local key = ngx.var.binary_remote_addr  -- rate limit by IP

-- feed the request in; delay > 0 means the request must be delayed
local delay, err = lim:incoming(key, true)
if not delay and err == "rejected" then -- bucket capacity exceeded
    ngx.exit(error_status)
end

if delay > 0 then        -- decide whether to delay or process immediately
    if nodelay then
        -- process immediately as part of a burst
    else
        ngx.sleep(delay) -- delayed processing
    end
end

================================

That is, the rate-limiting logic runs in Nginx's access phase: if the request is not limited, processing continues; if it should be delayed, we sleep for the computed time before continuing; otherwise the configured status code is returned to reject the request.

For distributed rate limiting we earlier used a simple Nginx+Lua approach; this module can also be used to implement distributed rate limiting.

In addition, when using Nginx+Lua you can read ngx.var.connections_active for overload protection, i.e. limit traffic once the number of active connections exceeds a threshold:

================================

if tonumber(ngx.var.connections_active) >= tonumber(limit) then
    -- rate-limiting / rejection logic here
end

================================

This concludes the rate-limiting techniques the author has used in practice. Some of these algorithms allow bursts while others shape traffic to a smooth rate; some are simple and coarse. The token-bucket and leaky-bucket algorithms are similar in implementation, differing mainly in how they are expressed, and for business purposes there is no need to distinguish them deliberately. Choose a rate-limiting approach according to the actual scenario: the best-known algorithm is not necessarily the most suitable.

By meng_philip123 (Jianshu author)
Original link: http://www.jianshu.com/p/b7945524a37b
Copyright belongs to the author; please contact the author for authorization and credit "Jianshu author".
