This article explains in detail how ThreadPool, the thread pool class of the Python threadpool module, is implemented. It may serve as a useful reference for interested readers.
It walks through every operation of the threadpool thread pool; the details are as follows.
First, the terms used in this article:
Worker thread: when the thread pool is created, it starts the specified number of worker threads, each of which waits to get a task from the task queue;
Task (request): a unit of work processed by a worker thread. There may be thousands of tasks but only a few worker threads. Tasks are created with makeRequests();
Task queue (_requests_queue): the queue that holds the tasks, implemented with Queue.Queue. Worker threads get their tasks from it;
Task handler function (callable): once a worker thread has got a task, it calls the task's handler function (request.callable) to do the actual work and return the result;
Task result queue (_results_queue): when a task has been processed, its result (or its exception information) is put into the task result queue;
Exception callback (exc_callback): when a result taken from the task result queue is flagged as an exception, the exception callback is called to handle it;
Result callback (callback): called on a result taken from the task result queue to process it further.
The previous article covered the installation and use of the threadpool module; this one focuses on the main stages in the life of the thread pool:
(1) Creation of thread pool
(2) Start of worker thread
(3) Creation of tasks
(4) Push the task to the thread pool
(5) Thread processing tasks
(6) Handling of finished tasks
(7) Exit of worker thread
Here is the definition of ThreadPool:
class ThreadPool:
    """A thread pool, distributing work requests and collecting results.

    See the module docstring for more information.

    """
    def __init__(self, num_workers, q_size=0, resq_size=0, poll_timeout=5):
        pass
    def createWorkers(self, num_workers, poll_timeout=5):
        pass
    def dismissWorkers(self, num_workers, do_join=False):
        pass
    def joinAllDismissedWorkers(self):
        pass
    def putRequest(self, request, block=True, timeout=None):
        pass
    def poll(self, block=False):
        pass
    def wait(self):
        pass
1. Creation of thread pool (ThreadPool(args))
task_pool = threadpool.ThreadPool(num_workers)
def __init__(self, num_workers, q_size=0, resq_size=0, poll_timeout=5):
    """Set up the thread pool and start num_workers worker threads.

    ``num_workers`` is the number of worker threads to start initially.

    If ``q_size > 0`` the size of the work *request queue* is limited and
    the thread pool blocks when the queue is full and it tries to put
    more work requests in it (see ``putRequest`` method), unless you also
    use a positive ``timeout`` value for ``putRequest``.

    If ``resq_size > 0`` the size of the *results queue* is limited and the
    worker threads will block when the queue is full and they try to put
    new results in it.

    .. warning:
        If you set both ``q_size`` and ``resq_size`` to ``!= 0`` there is
        the possibility of a deadlock, when the results queue is not pulled
        regularly and too many jobs are put in the work requests queue.
        To prevent this, always set ``timeout > 0`` when calling
        ``ThreadPool.putRequest()`` and catch ``Queue.Full`` exceptions.

    """
    # Task queue: tasks created by threadpool.makeRequests(args) are put here
    self._requests_queue = Queue.Queue(q_size)
    # Task result queue: holds the results of executed tasks
    self._results_queue = Queue.Queue(resq_size)
    # List of worker threads created by self.createWorkers()
    self.workers = []
    # Worker threads whose thread event is set but which are not yet joined
    self.dismissedWorkers = []
    # Dictionary of tasks pushed to the pool, requestID -> request
    self.workRequests = {}
    self.createWorkers(num_workers, poll_timeout)
The initialization parameters are:
num_workers: the number of threads in the thread pool;
q_size: the length limit of the task queue. If the queue length is limited and the limit has been reached, a call to putRequest() blocks until there is room, unless a timeout is given or block=False is passed to putRequest();
resq_size: the length limit of the task result queue;
poll_timeout: a worker thread blocks for up to poll_timeout seconds waiting for a request from the request queue; if none arrives in that time, it goes back to the top of its loop to check whether it has been dismissed;
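The effect of q_size can be reproduced with the standard-library queue alone. The following is a minimal sketch, assuming Python 3 (where the module is named queue rather than Queue); try_put is a hypothetical helper and not part of threadpool.

```python
import queue

def try_put(q, item, timeout):
    """Try to enqueue item; report failure instead of blocking forever."""
    try:
        q.put(item, block=True, timeout=timeout)
        return True
    except queue.Full:
        # The queue stayed full for the whole timeout.
        return False

# maxsize=2 plays the role of q_size=2 in ThreadPool.__init__
requests_queue = queue.Queue(maxsize=2)
accepted = [try_put(requests_queue, n, timeout=0.05) for n in range(3)]
```

With nothing consuming the queue, the third put times out. This is the situation the docstring's deadlock warning asks callers to handle by catching the Full exception.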
The member variables are:
self._requests_queue: the task queue; tasks created by threadpool.makeRequests(args) are placed in this queue;
self._results_queue: the task result queue; holds the results of executed tasks (it is a queue, not a dictionary);
self.workers: the list of worker threads; each worker thread created in self.createWorkers() is placed in this list;
self.dismissedWorkers: worker threads whose thread event has been set but which have not yet been joined;
self.workRequests: a dictionary recording the tasks pushed to the thread pool, structured as requestID: request. requestID is the unique identifier of a task and is described later.
2. Start of worker thread (self.createWorkers(args))
Function definition:
def createWorkers(self, num_workers, poll_timeout=5):
    """Add num_workers worker threads to the pool.

    ``poll_timeout`` sets the interval in seconds (int or float) for how
    often threads should check whether they are dismissed, while waiting for
    requests.

    """
    for i in range(num_workers):
        self.workers.append(WorkerThread(self._requests_queue,
            self._results_queue, poll_timeout=poll_timeout))
WorkerThread inherits from threading.Thread, Python's built-in thread class, and each WorkerThread object created is appended to the self.workers list. Here is the definition of the WorkerThread class, starting with __init__:
class WorkerThread(threading.Thread):
    """Background thread connected to the requests/results queues.

    A worker thread sits in the background and picks up work requests from
    one queue and puts the results in another until it is dismissed.

    """
    def __init__(self, requests_queue, results_queue, poll_timeout=5, **kwds):
        """Set up thread in daemonic mode and start it immediately.

        ``requests_queue`` and ``results_queue`` are instances of
        ``Queue.Queue`` passed by the ``ThreadPool`` class when it creates a
        new worker thread.

        """
        threading.Thread.__init__(self, **kwds)
        self.setDaemon(1)
        self._requests_queue = requests_queue  # task queue
        self._results_queue = results_queue    # task result queue
        # Timeout used by run() when getting a task from the task queue;
        # on timeout the while True loop simply continues
        self._poll_timeout = poll_timeout
        # Thread event; once it is set, run() breaks and the worker exits
        self._dismissed = threading.Event()
        self.start()

    def run(self):
        """Repeatedly process the job queue until told to exit."""
        while True:
            if self._dismissed.isSet():
                # we are dismissed, break out of loop
                break
            # get next work request. If we don't get a new request from the
            # queue after self._poll_timeout seconds, we jump to the start of
            # the while loop again, to give the thread a chance to exit.
            try:
                request = self._requests_queue.get(True, self._poll_timeout)
            except Queue.Empty:
                # the task queue is still empty, so loop again
                continue
            else:
                if self._dismissed.isSet():
                    # we are dismissed, put back request in queue and exit loop
                    self._requests_queue.put(request)
                    break
                try:
                    # not dismissed: run the task handler function and put
                    # its result into the task result queue
                    result = request.callable(*request.args, **request.kwds)
                    self._results_queue.put((request, result))
                except:
                    # the handler raised: put the exception info into the queue
                    request.exception = True
                    self._results_queue.put((request, sys.exc_info()))

    def dismiss(self):
        """Sets a flag to tell the thread to exit when done with current job."""
        self._dismissed.set()
The variables initialized in __init__ are:
self._requests_queue: the task queue;
self._results_queue: the task result queue;
self._poll_timeout: the timeout used by run() when getting a task from the task queue; when it expires, the while True loop simply continues;
self._dismissed: a thread event; once it is set, run() executes break and the worker thread exits;
Finally, __init__ calls self.start() to start the thread, which executes the run function shown above.
The run function above performs the following steps:
(1) If self._dismissed is set, exit the worker thread; otherwise go to step 2.
(2) Try to get a task from the task queue self._requests_queue. If the queue is still empty after poll_timeout seconds, continue with the next iteration of the while loop (back to step 1); otherwise go to step 3.
(3) Check again whether this worker's thread event is set. If it is, the worker is being shut down, so it puts the task back into the task queue and exits the thread. If it is not, the worker calls the task handler function request.callable and puts the returned result into the task result queue; if the handler raises an exception, the exception information is put into the queue instead. Then go to step 4.
(4) Continue the loop, returning to step 1.
At this point the worker threads have been created, as many as the thread pool was configured with. Each one gets tasks from the task queue, processes them, and puts the results into the task result queue.
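The loop steps above can be tried out with a reduced worker built only from the standard library. This is a sketch under stated assumptions, not the library's code: it targets Python 3, SketchWorker is a hypothetical name, and tasks are plain (function, args) tuples rather than WorkRequest objects.

```python
import queue
import threading

class SketchWorker(threading.Thread):
    """Hypothetical stand-in for WorkerThread; tasks are (func, args) tuples."""

    def __init__(self, requests_queue, results_queue, poll_timeout=0.05):
        super().__init__(daemon=True)
        self._requests_queue = requests_queue
        self._results_queue = results_queue
        self._poll_timeout = poll_timeout
        self._dismissed = threading.Event()
        self.start()

    def run(self):
        while True:
            if self._dismissed.is_set():            # step 1
                break
            try:                                    # step 2
                func, args = self._requests_queue.get(True, self._poll_timeout)
            except queue.Empty:
                continue
            if self._dismissed.is_set():            # step 3: hand the task back
                self._requests_queue.put((func, args))
                break
            self._results_queue.put(func(*args))    # step 3: process the task

    def dismiss(self):
        self._dismissed.set()

requests_q, results_q = queue.Queue(), queue.Queue()
worker = SketchWorker(requests_q, results_q)
for n in (1, 2, 3):
    requests_q.put((pow, (n, 2)))
squares = [results_q.get(timeout=2) for _ in range(3)]
worker.dismiss()
worker.join(timeout=2)
```

Because get() is called with a timeout, the worker wakes up at least every poll_timeout seconds and notices the dismissal even when no task ever arrives.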
3. Creation of tasks (makeRequests)
Tasks are created with the function threadpool.makeRequests(callable_, args_list, callback=None):
# utility functions
def makeRequests(callable_, args_list, callback=None,
        exc_callback=_handle_thread_exception):
    """Create several work requests for same callable with different arguments.

    Convenience function for creating several work requests for the same
    callable where each invocation of the callable receives different values
    for its arguments.

    ``args_list`` contains the parameters for each invocation of callable.
    Each item in ``args_list`` should be either a 2-item tuple of the list of
    positional arguments and a dictionary of keyword arguments or a single,
    non-tuple argument.

    See docstring for ``WorkRequest`` for info on ``callback`` and
    ``exc_callback``.

    """
    requests = []
    for item in args_list:
        if isinstance(item, tuple):
            requests.append(
                WorkRequest(callable_, item[0], item[1], callback=callback,
                    exc_callback=exc_callback)
            )
        else:
            requests.append(
                WorkRequest(callable_, [item], None, callback=callback,
                    exc_callback=exc_callback)
            )
    return requests
The parameters of the task-creating function have the following meanings:
callable_: the task handler function. Once the task has been put into the task queue and a worker thread gets it, that worker thread executes callable_.
args_list: a list with one item per task. An item is either a single non-tuple argument or a 2-item tuple in which item[0] is the list of positional arguments and item[1] is the dictionary of keyword arguments. The number of items in the list is the number of tasks created, so a single makeRequests() call typically creates one task per item.
callback: the result callback, called from the poll function (explained later). When callable_ finishes, the task result is put into the task result queue (self._results_queue); when poll takes a result out of that queue, it executes callback(request, result), where result is the value returned by the request's task.
exc_callback: the exception callback. In the poll function, this callback is called instead if an exception occurred while the request was being executed.
Once the tasks have been created, makeRequests() returns them as a list.
The caller keeps this list of tasks.
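The two item shapes accepted in args_list can be illustrated without the library. This is a minimal sketch: normalize_args and power are hypothetical names, and the sketch substitutes an empty dictionary directly where makeRequests passes None for the keyword arguments.

```python
def normalize_args(args_list):
    """Hypothetical helper: turn each item into an (args, kwds) pair."""
    pairs = []
    for item in args_list:
        if isinstance(item, tuple):
            # 2-item tuple: (list of positional args, dict of keyword args)
            pairs.append((item[0], item[1]))
        else:
            # single non-tuple argument becomes the only positional argument
            pairs.append(([item], {}))
    return pairs

def power(base, exp=2):
    return base ** exp

args_list = [3, ([2], {"exp": 5}), ([4, 3], {})]
results = [power(*args, **kwds) for args, kwds in normalize_args(args_list)]
```

One consequence of the isinstance check is that a task whose single argument is itself a tuple has to be wrapped in the 2-item form, otherwise it is taken for an (args, kwds) pair.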
With the task-creating function covered, here is the structure of the task object, WorkRequest:
class WorkRequest:
    """A request to execute a callable for putting in the request queue later.

    See the module function ``makeRequests`` for the common case
    where you want to build several ``WorkRequest`` objects for the same
    callable but with different arguments for each call.

    """
    def __init__(self, callable_, args=None, kwds=None, requestID=None,
            callback=None, exc_callback=_handle_thread_exception):
        """Create a work request for a callable and attach callbacks.

        A work request consists of a callable to be executed by a
        worker thread, a list of positional arguments, a dictionary
        of keyword arguments.

        A ``callback`` function can be specified, that is called when the
        results of the request are picked up from the result queue. It must
        accept two anonymous arguments, the ``WorkRequest`` object and the
        results of the callable, in that order. If you want to pass additional
        information to the callback, just stick it on the request object.

        You can also give a custom callback for when an exception occurs with
        the ``exc_callback`` keyword parameter. It should also accept two
        anonymous arguments, the ``WorkRequest`` and a tuple with the exception
        details as returned by ``sys.exc_info()``. The default implementation
        of this callback just prints the exception info via
        ``traceback.print_exception``. If you want no exception handler
        callback, just pass in ``None``.

        ``requestID``, if given, must be hashable since it is used by
        ``ThreadPool`` object to store the results of that work request in a
        dictionary. It defaults to the return value of ``id(self)``.

        """
        if requestID is None:
            self.requestID = id(self)
        else:
            try:
                self.requestID = hash(requestID)
            except TypeError:
                raise TypeError("requestID must be hashable.")
        self.exception = False
        self.callback = callback
        self.exc_callback = exc_callback
        self.callable = callable_
        self.args = args or []
        self.kwds = kwds or {}

    def __str__(self):
        return "<WorkRequest id=%s args=%r kwargs=%r exception=%s>" % \
            (self.requestID, self.args, self.kwds, self.exception)
self.callback, self.exc_callback, self.callable, args and kwds have already been explained above and are not repeated here.
Each task has a unique identifier, self.requestID. By default it is id(self), which in CPython is the object's own memory address.
self.exception is initialized to False; it is set to True if an exception occurs while self.callable() is executing.
At this point the tasks have been created, and the caller of makeRequests() holds a task list, request_list.
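The ID rule described above is small enough to isolate. The following is a reduced sketch, with SketchRequest and request_id as hypothetical names; it keeps only the identifier logic and drops everything else a WorkRequest carries.

```python
class SketchRequest:
    """Hypothetical, reduced version of WorkRequest (ID handling only)."""

    def __init__(self, request_id=None):
        if request_id is None:
            # Default: the object's own identity
            self.request_id = id(self)
        else:
            try:
                self.request_id = hash(request_id)
            except TypeError:
                raise TypeError("request_id must be hashable.")

a, b = SketchRequest(), SketchRequest()
named = SketchRequest("job-1")
try:
    SketchRequest(["not", "hashable"])
    rejected = False
except TypeError:
    rejected = True
```

Note that id() is only guaranteed to be unique among objects that exist at the same time, so the default ID identifies a request only for as long as that request object is kept alive.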
4. Push the task to the thread pool (putRequest)
The section above described how tasks are created. There may be hundreds of thousands of tasks, but they are processed only by the number of threads specified when the thread pool was created. That number is usually far smaller than the number of tasks, so each thread has to process many tasks.
This section describes how to push the created tasks into the thread pool, so that the pool's threads, which have been blocked waiting, get tasks and start processing them.
Tasks are pushed with putRequest(self, request, block, timeout) of the ThreadPool class:
def putRequest(self, request, block=True, timeout=None):
    """Put work request into work queue and save its id for later."""
    assert isinstance(request, WorkRequest)
    # don't reuse old work requests
    assert not getattr(request, 'exception', None)
    self._requests_queue.put(request, block, timeout)
    self.workRequests[request.requestID] = request
The main job of this function is to put the request, that is, a task created in the previous section, into the thread pool's task queue (self._requests_queue). It then records the task as pushed to the pool by storing it in the pool's self.workRequests dictionary, structured as request.requestID: request.
At this point the tasks have been created and pushed into the thread pool.
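The order of the two statements in putRequest matters, and a reduced model shows why. This is a sketch only, assuming Python 3; SketchPool, put_request and work_requests are hypothetical names, and requests are plain strings.

```python
import queue

class SketchPool:
    """Hypothetical pool reduced to the bookkeeping done by putRequest."""

    def __init__(self, q_size=0):
        self._requests_queue = queue.Queue(q_size)
        self.work_requests = {}

    def put_request(self, request_id, request, block=True, timeout=None):
        # Enqueue first: if the queue stays full until the timeout expires,
        # queue.Full is raised and the request is never recorded as pending.
        self._requests_queue.put(request, block, timeout)
        self.work_requests[request_id] = request

pool = SketchPool(q_size=1)
pool.put_request(1, "first task")
try:
    pool.put_request(2, "second task", timeout=0.05)
    overflowed = False
except queue.Full:
    overflowed = True
```

Because the dictionary entry is written only after the put succeeds, a request that could not be queued does not linger in the pending dictionary waiting for a result that will never come.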
5. Thread Processing Tasks
After the previous section, the tasks have been pushed into the thread pool. Before any task is pushed, the threads in the pool are blocked: in each thread's self.run() function they keep repeating:
try:
    request = self._requests_queue.get(True, self._poll_timeout)
except Queue.Empty:
    # the task queue is still empty, so loop again
    continue
Now that tasks have been pushed into the thread pool, get() returns normally and the following steps are performed:
def run(self):
    """Repeatedly process the job queue until told to exit."""
    while True:
        if self._dismissed.isSet():
            # we are dismissed, break out of loop
            break
        # get next work request. If we don't get a new request from the
        # queue after self._poll_timeout seconds, we jump to the start of
        # the while loop again, to give the thread a chance to exit.
        try:
            request = self._requests_queue.get(True, self._poll_timeout)
        except Queue.Empty:
            # the task queue is still empty, so loop again
            continue
        else:
            if self._dismissed.isSet():
                # we are dismissed, put back request in queue and exit loop
                self._requests_queue.put(request)
                break
            try:
                # not dismissed: run the task handler function and put
                # its result into the task result queue
                result = request.callable(*request.args, **request.kwds)
                self._results_queue.put((request, result))
            except:
                # the handler raised: put the exception info into the queue
                request.exception = True
                self._results_queue.put((request, sys.exc_info()))
Get Task---> Call handler for Task callable () Processing Task---> Press the task request and the results returned by the task into the Self.results_queue queue----> If the task handler function is abnormal, Set the task exception ID to true and press the task request and the task exception into the self.results_queue queue----> Return to the Get task again
If, during the while loop, the thread event is set externally, that is, Self._dismissed.isset is true, it means that the thread will end the processing task, then the task queue will be returned to the get to task and exit the thread.
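The way a result or an exception travels into the result queue can be sketched in isolation. The assumptions here: Python 3, a hypothetical process helper, and requests modeled as plain dictionaries instead of WorkRequest objects. The sketch also catches Exception rather than using the bare except of the library code.

```python
import queue
import sys

def process(request, results_queue):
    """Hypothetical helper mirroring the try/except inside run()."""
    try:
        result = request["callable"](*request["args"])
        results_queue.put((request, result))
    except Exception:
        # Flag the request and ship the exception details as the result
        request["exception"] = True
        results_queue.put((request, sys.exc_info()))

results_queue = queue.Queue()
ok = {"callable": int, "args": ("42",), "exception": False}
bad = {"callable": int, "args": ("oops",), "exception": False}
process(ok, results_queue)
process(bad, results_queue)

first_request, first_result = results_queue.get()
second_request, second_result = results_queue.get()
```

Both outcomes use the same (request, payload) shape, so the consumer tells them apart only by the exception flag on the request.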
6. Handling of finished tasks
The section above described how the thread pool keeps getting tasks and processing them. For what happens after each task is processed, the thread pool provides the wait() and poll() functions.
After submitting tasks to the thread pool, we call wait(), which blocks until all tasks have been processed. When wait() returns we can move on to the next step, such as creating more tasks and pushing them to the pool, or shutting the pool down. Shutting down is described in the next section; this section describes the wait() and poll() operations.
First, the wait() operation:
def wait(self):
    """Wait for results, blocking until all have arrived."""
    while 1:
        try:
            self.poll(True)
        except NoResultsPending:
            break
wait() blocks until every task has been processed: it calls self.poll(True) repeatedly until poll() raises NoResultsPending, and then returns, which marks the end of task processing.
Here's a look at the poll function:
def poll(self, block=False):
    """Process any new results in the queue."""
    while True:
        # still results pending?
        if not self.workRequests:
            raise NoResultsPending
        # are there still workers to process remaining requests?
        elif block and not self.workers:
            raise NoWorkersAvailable
        try:
            # get back next results
            request, result = self._results_queue.get(block=block)
            # has an exception occurred?
            if request.exception and request.exc_callback:
                request.exc_callback(request, result)
            # hand results to callback, if any
            if request.callback and not \
                    (request.exception and request.exc_callback):
                request.callback(request, result)
            del self.workRequests[request.requestID]
        except Queue.Empty:
            break
(1) First check whether the task dictionary self.workRequests ({request.requestID: request}) is empty. If it is, raise NoResultsPending and finish; otherwise go to step 2;
(2) If block is true, check whether the list of worker threads is empty (dismissWorkers() pops a worker out of self.workers when it sets that worker's thread event). If it is empty, raise NoWorkersAvailable; otherwise go to step 3;
(3) Get a task result from the task result queue. If the queue raises Queue.Empty, break and return; otherwise go to step 4;
(4) If an exception occurred while the task was being processed (request.exception is set) and an exception callback is set, call request.exc_callback(request, result) to handle the exception, then go to step 6. Otherwise go to step 5;
(5) If a result callback is set (request.callback is not empty), call request.callback(request, result);
(6) Remove the task from the task dictionary self.workRequests and return to step 1 to get the next result;
(7) These steps repeat until an exception is raised or, in non-blocking mode, the result queue is empty, at which point poll() returns.
Once poll() raises NoResultsPending, wait() catches the exception and returns.
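The interplay between wait() and poll() can be modeled in a few lines. The following is a single-threaded sketch, assuming Python 3; SketchCollector and its attribute names are hypothetical, results are put into the queue by hand to stand in for finished workers, and the NoWorkersAvailable check and the exception callback are left out for brevity.

```python
import queue

class NoResultsPending(Exception):
    """Raised when every submitted request has been collected."""

class SketchCollector:
    """Hypothetical, single-threaded model of poll() and wait()."""

    def __init__(self):
        self.results_queue = queue.Queue()
        self.pending = {}                  # request id -> callback

    def poll(self, block=False):
        while True:
            if not self.pending:           # nothing left to collect
                raise NoResultsPending
            try:
                request_id, result = self.results_queue.get(block=block)
            except queue.Empty:            # only reachable when block is False
                break
            callback = self.pending.pop(request_id)
            if callback:
                callback(request_id, result)

    def wait(self):
        while True:
            try:
                self.poll(True)
            except NoResultsPending:
                break

collected = []
collector = SketchCollector()
for request_id in (1, 2):
    collector.pending[request_id] = lambda rid, res: collected.append((rid, res))
    # pretend a worker has already finished this request
    collector.results_queue.put((request_id, request_id * 10))
collector.wait()
```

The exception is the normal exit path here: wait() has no other way to learn that the pending dictionary has drained.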
7. Exit of worker thread
The thread pool provides the dismissWorkers() and joinAllDismissedWorkers() operations for shutting down worker threads:
def dismissWorkers(self, num_workers, do_join=False):
    """Tell num_workers worker threads to quit after their current task."""
    dismiss_list = []
    for i in range(min(num_workers, len(self.workers))):
        worker = self.workers.pop()
        worker.dismiss()
        dismiss_list.append(worker)
    if do_join:
        for worker in dismiss_list:
            worker.join()
    else:
        self.dismissedWorkers.extend(dismiss_list)

def joinAllDismissedWorkers(self):
    """Perform Thread.join() on all worker threads that have been dismissed."""
    for worker in self.dismissedWorkers:
        worker.join()
    self.dismissedWorkers = []
As dismissWorkers shows, its main job is to pop the requested number of threads off the self.workers list and set each one's thread event. Once the event is set, that thread's self.run() function detects it and the thread ends.
If do_join is true, dismissWorkers() also joins each dismissed thread before returning. Otherwise the popped threads are put into self.dismissedWorkers, to be joined later by the joinAllDismissedWorkers() operation.
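Both shutdown paths can be exercised with idle threads from the standard library. This is a minimal sketch, assuming Python 3; IdleWorker and dismiss_workers are hypothetical names, and the workers do nothing except wait to be dismissed.

```python
import threading

class IdleWorker(threading.Thread):
    """Hypothetical worker that only waits to be dismissed."""

    def __init__(self):
        super().__init__(daemon=True)
        self._dismissed = threading.Event()
        self.start()

    def run(self):
        self._dismissed.wait()     # returns once dismiss() sets the event

    def dismiss(self):
        self._dismissed.set()

def dismiss_workers(workers, dismissed, num_workers, do_join=False):
    dismiss_list = []
    # min() keeps the loop from popping more workers than exist
    for _ in range(min(num_workers, len(workers))):
        worker = workers.pop()
        worker.dismiss()
        dismiss_list.append(worker)
    if do_join:
        for worker in dismiss_list:
            worker.join()
    else:
        dismissed.extend(dismiss_list)

workers = [IdleWorker() for _ in range(3)]
dismissed = []
dismiss_workers(workers, dismissed, 2)                 # join deferred
dismiss_workers(workers, dismissed, 5, do_join=True)   # asks for more than remain
for worker in dismissed:                               # what joinAllDismissedWorkers does
    worker.join()
```

Deferring the join lets the caller dismiss workers without blocking on whatever task each one is still finishing.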
8. Summary
This concludes the description of all the operations of the ThreadPool thread pool and of how they are implemented. As the walkthrough shows, the thread pool is not very complicated: it consists of a few simple operations, and the main thing is to understand the overall processing flow.
Suggestions and comments are welcome.