The project needed rate limiting, and constrained by some details of the existing implementation, I hand-rolled a simple server-side rate limiter.
The difference between server-side and client-side rate limiting, in short:
1) Server-side rate limiting
Limit the number of requests handled per unit of time, sacrificing some requests in exchange for high availability.
For example, in our scenario a service receives requests, processes them, and bulk-writes the data to Elasticsearch for indexing. A bulk index is a resource-intensive operation; if request traffic surges it can overwhelm Elasticsearch (queue blocking, memory spikes), so peak traffic needs to be limited.
2) Client-side rate limiting
Limit the number of calls the client itself makes.
For example, a thread pool is a natural rate limiter: it caps concurrency at max_connection, overflow requests wait in a buffer queue, and anything queued beyond queue_size is thrown away (see the sketch after this list).
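To make that concrete, here is a minimal sketch of client-side limiting with a bounded worker pool. It is not part of this article's limiter; the BoundedPool class and its submit() helper are illustrative, only max_connection and queue_size come from the description above.

import threading
import Queue  # Python 2 module name, matching the rest of this article


class BoundedPool(object):
    """Client-side limiting: at most max_connection workers and queue_size waiting tasks."""

    def __init__(self, max_connection=4, queue_size=100):
        self.tasks = Queue.Queue(maxsize=queue_size)   # bounded buffer queue
        for _ in range(max_connection):                # bounded concurrency
            worker = threading.Thread(target=self._run)
            worker.daemon = True
            worker.start()

    def submit(self, func, *args):
        try:
            self.tasks.put_nowait((func, args))        # enqueue the call
            return True
        except Queue.Full:                             # queue is full: throw the call away
            return False

    def _run(self):
        while True:
            func, args = self.tasks.get()
            func(*args)

A caller checks the return value of submit() to learn whether its call was dropped.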
This article implements a server-side rate limiter.
Advantages of this rate limiter:
Disadvantages:
1) It cannot limit traffic smoothly.
Smooth rate limiting could be achieved with, say, the token bucket algorithm or the leaky bucket algorithm (I feel the two are essentially the same thing); a token bucket sketch follows this list.
What is smooth rate limiting? For example, suppose we want to allow at most 1000 requests in 5 seconds. A smooth limiter spreads them out, roughly 200 per second, so the 5-second total never exceeds 1000 and the load is well balanced; a non-smooth limiter might let 1000 requests through in the first second and then reject everything for the remaining 4 seconds.
2) Only second-level rate limiting is implemented.
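For reference, below is a minimal token bucket sketch of the smooth limiting described above. It is not part of the limiter built in this article; the TokenBucket class and its parameters are illustrative only.

import threading
import time


class TokenBucket(object):
    """Refills `rate` tokens per second and holds at most `capacity` tokens."""

    def __init__(self, rate, capacity):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)
        self.last = time.time()
        self.lock = threading.Lock()

    def acquire(self, count=1):
        with self.lock:
            now = time.time()
            # Refill in proportion to the elapsed time, never exceeding capacity.
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= count:
                self.tokens -= count
                return True   # allowed
            return False      # rate limited

With rate=200 and capacity=200 this admits roughly 200 requests per second, the smooth version of the 1000-per-5-seconds example above.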
It supports two scenarios:
1) Single-process, multi-threaded scenarios (a thread-safe queue is used as the global variable)
In this scenario only one instance is deployed and the limit applies to that instance alone. This is rarely used in production environments.
2) Multi-process, distributed scenarios (Redis is used as the global variable)
Multi-instance deployments, the norm in production environments, are the usage scenario here.
In this scenario the overall traffic needs to be controlled. For example, a user service deploys three instances and exposes a query interface; the limit is applied at the interface level, i.e. it caps the peak traffic allowed for the query interface as a whole, regardless of which instance the load lands on.
As a digression, this can also be done with Nginx.
The following is the implementation of this crude rate limiter.
1. The BaseRateLimiter interface
My idea is to first define an interface, or call it an abstract class.
At initialization it is configured with a rate, which sets the speed limit of the limiter.
It provides an abstract method, acquire(); callers invoke it and the return value tells whether the request is allowed through or should be limited.
import abc


class BaseRateLimiter(object):

    __metaclass__ = abc.ABCMeta

    @abc.abstractmethod
    def __init__(self, rate):
        self.rate = rate

    @abc.abstractmethod
    def acquire(self, count):
        return
2. ThreadingRateLimiter: rate limiting for the single-process, multi-threaded scenario
It inherits from the BaseRateLimiter abstract class and uses a thread-safe queue as the global variable to avoid races.
A background thread empties the queue every second;
when a request comes in, acquire() is called and the queue grows by one; if the queue size exceeds the speed limit the request is rejected, otherwise it is allowed.
import threading
import time
import Queue


class ThreadingRateLimiter(BaseRateLimiter):

    def __init__(self, rate):
        BaseRateLimiter.__init__(self, rate)
        self.queue = Queue.Queue()
        # Background thread that empties the queue once per second.
        threading.Thread(target=self._clear_queue).start()

    def acquire(self, count=1):
        self.queue.put(1, block=False)
        return self.queue.qsize() < self.rate

    def _clear_queue(self):
        while 1:
            time.sleep(1)
            self.queue.queue.clear()
3. DistributeRateLimiter: rate limiting for the distributed scenario
It inherits from the BaseRateLimiter abstract class, uses external storage as the shared variable, and accesses that storage through a Cache abstraction.
class DistributeRateLimiter(BaseRateLimiter):

    def __init__(self, rate, cache):
        BaseRateLimiter.__init__(self, rate)
        self.cache = cache

    def acquire(self, count=1, expire=3, key=None, callback=None):
        try:
            if isinstance(self.cache, Cache):
                return self.cache.fetchToken(rate=self.rate, count=count, expire=expire, key=key)
        except Exception, ex:
            return True
For decoupling and flexibility, we implement a Cache class that provides an abstract method fetchToken().
If you use Redis, you inherit from the Cache abstract class and implement token acquisition through Redis.
If you use MySQL, you inherit from the Cache abstract class and implement token acquisition through MySQL (a sketch of this variant is given after the Redis implementation below).
The Cache abstract class:
import abc


class Cache(object):

    __metaclass__ = abc.ABCMeta

    @abc.abstractmethod
    def __init__(self):
        self.key = "DEFAULT"
        self.namespace = "RATELIMITER"

    @abc.abstractmethod
    def fetchToken(self, rate, key=None):
        return
Here is a Redis-based implementation, RedisTokenCache.
Every second a new key is created and the request count is incremented (incr) on it; once the count for that second exceeds the rate limit, no more tokens can be fetched, i.e. the traffic is limited.
Each per-second key is given a timeout (expire) so that stale keys do not keep occupying storage.
There is nothing difficult here; a Redis transaction is used to ensure that incr and expire both take effect together.
import redis
from datetime import datetime


class RedisTokenCache(Cache):

    def __init__(self, host, port, db=0, password=None, max_connections=None):
        Cache.__init__(self)
        self.redis = redis.Redis(
            connection_pool=redis.ConnectionPool(
                host=host, port=port, db=db,
                password=password, max_connections=max_connections))

    def fetchToken(self, rate=100, count=1, expire=3, key=None):
        date = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        key = ":".join([self.namespace, key if key else self.key, date])
        try:
            current = self.redis.get(key)
            if int(current if current else "0") > rate:
                raise Exception("Too many requests in current second: %s" % date)
            else:
                # incr and expire are executed together in one transaction.
                with self.redis.pipeline() as p:
                    p.multi()
                    p.incr(key, count)
                    p.expire(key, int(expire if expire else "3"))
                    p.execute()
                return True
        except Exception, ex:
            return False
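For the MySQL option mentioned earlier, the shape would be the same: inherit from Cache and keep a per-second counter in a table. Below is a minimal, hypothetical sketch assuming the pymysql driver and a rate_counter table with columns bucket (primary key) and cnt; none of this appears in the original code.

import pymysql
from datetime import datetime


class MysqlTokenCache(Cache):
    # Assumes a table created as:
    #   CREATE TABLE rate_counter (bucket VARCHAR(128) PRIMARY KEY, cnt INT NOT NULL);

    def __init__(self, host, port, user, password, db):
        Cache.__init__(self)
        self.conn = pymysql.connect(host=host, port=port, user=user,
                                    password=password, db=db, autocommit=True)

    def fetchToken(self, rate=100, count=1, expire=3, key=None):
        # expire is unused: MySQL has no TTL, so old rows need periodic cleanup.
        date = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        bucket = ":".join([self.namespace, key if key else self.key, date])
        try:
            with self.conn.cursor() as cursor:
                # Create or atomically increment this second's counter row.
                cursor.execute(
                    "INSERT INTO rate_counter (bucket, cnt) VALUES (%s, %s) "
                    "ON DUPLICATE KEY UPDATE cnt = cnt + %s",
                    (bucket, count, count))
                cursor.execute("SELECT cnt FROM rate_counter WHERE bucket = %s", (bucket,))
                (current,) = cursor.fetchone()
            return current <= rate
        except Exception:
            return False

Unlike Redis there is no key expiration, and the insert and select are not a single atomic step, so this is rougher than the Redis version; it is only meant to show where a MySQL-backed Cache would plug in.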
Test code for the multi-threaded scenario:
import threading

limiter = ThreadingRateLimiter(rate=10000)


def job():
    while 1:
        if not limiter.acquire():
            print 'Rate limited'
        else:
            print 'Normal'

threads = [threading.Thread(target=job) for i in range(10)]
for thread in threads:
    thread.start()
Test code for the distributed scenario:
import multiprocessing
import redis

token_cache = RedisTokenCache(host='10.93.84.53', port=6379, password='bigdata123')
limiter = DistributeRateLimiter(rate=10000, cache=token_cache)
r = redis.Redis(connection_pool=redis.ConnectionPool(host='10.93.84.53', port=6379, password='bigdata123'))


def job():
    while 1:
        if not limiter.acquire():
            print 'Rate limited'
        else:
            print 'Normal'

processes = [multiprocessing.Process(target=job) for i in range(10)]
for process in processes:
    process.start()
You can run it yourself and see.
Description
The speed limit here is at the second level, e.g. at most 400 requests per second. It is possible that all 400 requests arrive in the first 100 ms of a second and everything in the remaining 900 ms is rejected; in other words, the limiting is not smooth.
But if your back-end logic sits behind a queue, or a buffer such as a thread pool, the impact of this non-smoothness is not very large.