A Detailed Anatomy of the Python Process Pool


Two modules commonly used for working with processes in Python are subprocess and multiprocessing. subprocess is typically used to execute external programs, such as third-party applications, rather than Python programs. If you need to invoke and keep an eye on external programs, the psutil module is a better choice: besides process management similar to what subprocess provides, it can also monitor the current host or the external programs it starts, giving access to network, CPU, memory and so on, which makes it the more comprehensive option for automated operations work. multiprocessing is Python's multi-process module: it starts Python processes and invokes a target callback function to handle tasks. It corresponds to Python's multithreading module, threading, and the two have similar interfaces: define a multiprocessing.Process or threading.Thread, specify its target function, and call start() to run the process or thread.
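As a quick, minimal sketch of this parallel interface (the task function say_hello is made up for illustration), a process and a thread are started the same way:

import multiprocessing
import threading

def say_hello(name):  # placeholder task function
    print("hello, %s" % name)

if __name__ == '__main__':
    p = multiprocessing.Process(target=say_hello, args=('process',))
    t = threading.Thread(target=say_hello, args=('thread',))
    p.start()   # identical interface: start() to run, join() to wait
    t.start()
    p.join()
    t.join()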





In Python, because of the Global Interpreter Lock (GIL), using multiple threads does not really improve a program's efficiency [1]. Therefore, when dealing with concurrency in Python, prefer multiple processes to multithreading. In concurrent programming, the simplest pattern is that the main process waits for tasks and launches a new process whenever a new task arrives, so that each task is handled by one process. But every task handled this way carries the cost of creating, running, and destroying a process, and the shorter the task itself runs, the larger the proportion of time spent on creation and destruction. Clearly we should try to avoid that extra overhead, and we can use a process pool to reduce the cost of creating and destroying processes by reusing process objects.





In fact, Python already ships a powerful process pool, multiprocessing.Pool. Let's take a quick look at how this built-in process pool is implemented.





To create a process pool object, call the Pool constructor, which is declared as follows:








Pool(processes=None, initializer=None, initargs=(), maxtasksperchild=None)

It returns a process pool object.

processes is the number of worker processes; it defaults to None, meaning the number of worker processes is cpu_count().

initializer is an initialization function called when a worker process starts, and initargs holds that function's arguments: if initializer is not None, each worker process calls initializer(*initargs) as it starts.

maxtasksperchild is the number of tasks a worker process completes before it exits and is replaced by a fresh worker process; it defaults to None, meaning a worker process lives as long as the pool itself and is never automatically exited or replaced.
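As a minimal usage sketch of these parameters (the initializer init_worker and its tag argument are invented for illustration):

import multiprocessing

def init_worker(tag):
    # runs once, at the start of each worker process
    print("worker starting: %s" % tag)

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4,
                                initializer=init_worker,
                                initargs=('demo',),
                                maxtasksperchild=10)
    pool.close()
    pool.join()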








The process pool object returned by the Pool constructor contains, among others, the following data structures:





self._inqueue: the receive-task queue (a SimpleQueue), through which the main process sends tasks to the worker processes

self._outqueue: the result queue (a SimpleQueue), through which the worker processes send results back to the main process

self._taskqueue: a synchronized task queue, holding the tasks that callers assign to the pool

self._cache = {}: the task cache

self._processes: the number of worker processes

self._pool = []: the list of worker processes





While the process pool is working, the receiving and dispatching of tasks and the returning of results are each carried out by a thread inside the pool. Let's look at these threads:





The _worker_handler thread is responsible for keeping the pool staffed with worker processes: whenever a worker exits, it creates a new worker process and adds it to the process list (self._pool), so that the pool always holds processes workers. The thread's callback is the Pool._handle_workers method, which loops over the _maintain_pool method while the pool's state == RUN: it monitors whether any process has exited and, if so, creates a new process and appends it to self._pool, keeping the number of worker processes in the pool at processes.








self._worker_handler = threading.Thread(
    target=Pool._handle_workers,
    args=(self, )
)

While the _worker_handler thread's state is RUN (state == RUN), the Pool._handle_workers method loops invoking the _maintain_pool method:

def _maintain_pool(self):
    if self._join_exited_workers():
        self._repopulate_pool()

_join_exited_workers() monitors whether any process in the self._pool list has ended, waits for it to end, and removes it from self._pool; when some process has ended, _repopulate_pool() is called to create a new process:

w = self.Process(target=worker,
                 args=(self._inqueue, self._outqueue,
                       self._initializer, self._initargs,
                       self._maxtasksperchild)
                 )
self._pool.append(w)

Here w is the newly created process, the one used to handle actual tasks, and worker is its callback function:

def worker(inqueue, outqueue, initializer=None, initargs=(), maxtasks=None):
    assert maxtasks is None or (type(maxtasks) == int and maxtasks > 0)
    put = outqueue.put
    get = inqueue.get
    if hasattr(inqueue, '_writer'):
        inqueue._writer.close()
        outqueue._reader.close()

    if initializer is not None:
        initializer(*initargs)

    completed = 0
    while maxtasks is None or (maxtasks and completed < maxtasks):
        try:
            task = get()
        except (EOFError, IOError):
            debug('worker got EOFError or IOError -- exiting')
            break

        if task is None:
            debug('worker got sentinel -- exiting')
            break

        job, i, func, args, kwds = task
        try:
            result = (True, func(*args, **kwds))
        except Exception, e:
            result = (False, e)
        try:
            put((job, i, result))
        except Exception as e:
            wrapped = MaybeEncodingError(e, result[1])
            debug("Possible encoding error while sending result: %s" % (
                wrapped))
            put((job, i, (False, wrapped)))

        completed += 1
    debug('worker exiting after %d tasks' % completed)




All worker processes handle tasks uniformly through the worker callback function. As the source shows, it reads one task from the receive-task queue (inqueue), then invokes the task's function with the task's arguments (result = (True, func(*args, **kwds))), and then places the result in the result queue (outqueue). If a maximum task limit maxtasks is set, the process exits after handling that many tasks.








Next is the _task_handler thread, which is responsible for taking tasks out of the pool's _taskqueue and putting them into the receive-task queue (a pipe):





self._task_handler = threading.Thread(
    target=Pool._handle_tasks,
    args=(self._taskqueue, self._quick_put, self._outqueue, self._pool)
)


The Pool._handle_tasks method constantly gets tasks from _taskqueue and puts them into the receive-task queue (_inqueue), triggering the worker processes to handle them. When it reads a None element from _taskqueue, that means the process pool is to be terminated (terminate): task requests are no longer processed, and None elements are put onto both the receive-task queue and the result queue to notify the other threads to finish.
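This None-sentinel technique for shutting down a consumer loop is worth noting on its own; here is a minimal standalone sketch of the pattern (independent of the pool source):

import threading
try:
    import Queue as queue  # Python 2
except ImportError:
    import queue           # Python 3

q = queue.Queue()

def consumer():
    while True:
        item = q.get()
        if item is None:   # the sentinel: stop processing
            break
        print("processing %r" % item)

t = threading.Thread(target=consumer)
t.start()
for i in range(3):
    q.put(i)
q.put(None)                # notify the consumer to exit
t.join()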





The _handle_results thread is responsible for handling task results: it reads them out of _outqueue (a pipe) and stores them into the task cache:





self._result_handler = threading.Thread(
    target=Pool._handle_results,
    args=(self._outqueue, self._quick_get, self._cache)
)





Finally there is _terminate, which is not a thread but a Finalize object:











self._terminate = Finalize(
    self, self._terminate_pool,
    args=(self._taskqueue, self._inqueue, self._outqueue, self._pool,
          self._worker_handler, self._task_handler,
          self._result_handler, self._cache),
    exitpriority=15
)


The constructor of the Finalize class looks much like the Thread constructor: _terminate_pool is its callback function, and args holds the callback's parameters.


The _terminate_pool function shuts down the pool's work: it terminates the three threads above, terminates the worker processes in the pool, and clears the data in the queues.


_terminate is an object rather than a thread, so how does it execute its callback _terminate_pool the way a thread does after calling start()? Looking at the pool source, we find the pool's termination function:


def terminate(self):
    debug('terminating pool')
    self._state = TERMINATE
    self._worker_handler._state = TERMINATE
    self._terminate()


At the end of this function, the _terminate object is invoked as if it were a method. Since _terminate itself is a Finalize object, we look at the definition of the Finalize class and find that it implements the __call__ method:


def __call__(self, wr=None):
    try:
        del _finalizer_registry[self._key]
    except KeyError:
        sub_debug('finalizer no longer registered')
    else:
        if self._pid != os.getpid():
            res = None
        else:
            res = self._callback(*self._args, **self._kwargs)
        self._weakref = self._callback = self._args = \
                        self._kwargs = self._key = None
        return res


In this method, the statement res = self._callback(*self._args, **self._kwargs) executes the _terminate_pool function, which terminates the process pool.








The data structures in the process pool and the cooperation among its threads are shown in the following illustration:

[Illustration: pool data structures and the division of labor among its threads]


"1" Here is for CPU-intensive programs, multithreading does not bring efficiency gains, but also may be due to frequent thread switching, resulting in reduced efficiency; if it is IO-intensive, multithreaded processes can use IO to block waiting idle time to perform other threads to improve efficiency.


We know that when the task queue in the pool is non-empty, the worker processes are triggered to work. So how do you add a task to the pool's task queue? The pool class has two pairs of key methods for creating tasks: apply/apply_async and map/map_async. The pool's apply and map methods are analogous to Python's built-in functions of the same names, and apply_async and map_async are their respective non-blocking versions.

First look at the apply_async method; its source is as follows:


def apply_async(self, func, args=(), kwds={}, callback=None):
    assert self._state == RUN
    result = ApplyResult(self._cache, callback)
    self._taskqueue.put([(result._job, None, func, args, kwds)], None)
    return result

where:

func is the function that executes this task;

args and kwds are func's positional and keyword arguments;

callback is a single-argument function: when the result returns, callback is invoked with the task's run result as its argument.

Each invocation of apply_async actually just adds one task to _taskqueue. Note that this is a non-blocking (asynchronous) call: the new task is only added to the task queue, not yet executed, so there is nothing to wait for, and the newly created ApplyResult object is returned immediately. Note also that when the ApplyResult object is created, it is placed into the pool's cache, _cache.

Once the task queue holds a newly created task, following the flow analyzed in the previous section, the pool's _task_handler thread takes the task out of _taskqueue and puts it into _inqueue, which triggers a worker process to invoke func with args and kwds. When the run finishes, the worker places the result in _outqueue; the pool's _handle_results thread then removes the run result from _outqueue, finds the ApplyResult object in _cache, and _sets its run result, where it waits for the caller to fetch it.

If apply_async is asynchronous, how does the caller learn that the task has ended and obtain the result? Here we need to understand two key methods of the ApplyResult class:


def get(self, timeout=None):
    self.wait(timeout)
    if not self._ready:
        raise TimeoutError
    if self._success:
        return self._value
    else:
        raise self._value

def _set(self, i, obj):
    self._success, self._value = obj
    if self._callback and self._success:
        self._callback(self._value)
    self._cond.acquire()
    try:
        self._ready = True
        self._cond.notify()
    finally:
        self._cond.release()
    del self._cache[self._job]


As the two method names suggest, get is provided to the client to obtain the worker process's run result, while the result itself is stored into the ApplyResult object by the _handle_results thread calling _set. The _set method saves the run result in ApplyResult._value and wakes up the get call blocked on the condition variable; the client obtains the run result as get's return value.

The apply method fetches the process result in a blocking way. Its implementation is simple: it also calls apply_async, but instead of returning the ApplyResult, it directly returns the worker process's run result:

def apply(self, func, args=(), kwds={}):
    assert self._state == RUN
    return self.apply_async(func, args, kwds).get()
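To make the difference concrete, here is a small usage sketch (square and on_result are invented for illustration) contrasting the blocking and non-blocking calls:

import multiprocessing

def square(x):
    return x * x

def on_result(value):
    # called with the task's result once it is ready
    print("callback got %s" % value)

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=2)
    print(pool.apply(square, (3,)))    # blocks until done: prints 9
    res = pool.apply_async(square, (4,), callback=on_result)
    print(res.get(timeout=5))          # blocks on the ApplyResult: prints 16
    pool.close()
    pool.join()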

The apply/apply_async methods above assign only one task to the pool at a time; to assign multiple tasks at once, use the map/map_async methods. First let's look at how map_async is defined:


def map_async(self, func, iterable, chunksize=None, callback=None):
    assert self._state == RUN
    if not hasattr(iterable, '__len__'):
        iterable = list(iterable)

    if chunksize is None:
        chunksize, extra = divmod(len(iterable), len(self._pool) * 4)
        if extra:
            chunksize += 1
    if len(iterable) == 0:
        chunksize = 0

    task_batches = Pool._get_tasks(func, iterable, chunksize)
    result = MapResult(self._cache, chunksize, len(iterable), callback)
    self._taskqueue.put((((result._job, i, mapstar, (x,), {})
                          for i, x in enumerate(task_batches)), None))
    return result

where:

func is the function that executes the tasks;

iterable is the sequence of task arguments;

chunksize means the iterable sequence is split into groups of chunksize elements each, and every group is submitted as one task to be processed in the pool;

callback is a single-argument function: when a result returns, callback is invoked with the task's run result as its argument.

As the source shows, map_async is more complex than apply_async. First it groups the sequence of task arguments according to chunksize, which is the number of tasks per group. With the default chunksize=None, the value is computed from the length of the task argument sequence and the number of processes in the pool: chunksize, extra = divmod(len(iterable), len(self._pool) * 4). Suppose the process count is len(self._pool) = 4 and the task argument sequence is iterable = range(123); then chunksize = 7 and extra = 11, and since extra is non-zero, execution continues with chunksize = 8, meaning the task argument sequence is split into groups of 8 (16 groups in total). The tasks are actually grouped by:


task_batches = Pool._get_tasks(func, iterable, chunksize)

def _get_tasks(func, it, size):
    it = iter(it)
    while 1:
        x = tuple(itertools.islice(it, size))
        if not x:
            return
        yield (func, x)

Here yield is used to make the _get_tasks method a generator. For a sequence such as range(123), after grouping with chunksize=8 there are 16 groups in total, whose elements are as follows:


(func, (0, 1, 2, 3, 4, 5, 6, 7))
(func, (8, 9, 10, 11, 12, 13, 14, 15))
(func, (16, 17, 18, 19, 20, 21, 22, 23))
...
(func, (112, 113, 114, 115, 116, 117, 118, 119))
(func, (120, 121, 122))
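The chunking arithmetic above is easy to verify directly; a quick sketch reproducing the computation for 123 tasks and 4 worker processes:

tasks = range(123)
n_workers = 4

chunksize, extra = divmod(len(tasks), n_workers * 4)  # divmod(123, 16) == (7, 11)
if extra:
    chunksize += 1                                    # chunksize becomes 8
n_groups = len(tasks) // chunksize + bool(len(tasks) % chunksize)
print(chunksize)   # prints 8
print(n_groups)    # prints 16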








After grouping, a MapResult object is created: result = MapResult(self._cache, chunksize, len(iterable), callback). It inherits from the ApplyResult class and provides the same get and _set method interface. The grouped tasks are put on the task queue, and the just-created result object is returned.








self._taskqueue.put((((result._job, i, mapstar, (x,), {})
                      for i, x in enumerate(task_batches)), None))


Taking the task argument sequence range(123) as an example, what is actually put on the task queue is a sequence of 16 tuples, which in turn are:


(result._job, 0, mapstar, ((func, (0, 1, 2, 3, 4, 5, 6, 7)),), {})
(result._job, 1, mapstar, ((func, (8, 9, 10, 11, 12, 13, 14, 15)),), {})
......
(result._job, 15, mapstar, ((func, (120, 121, 122)),), {})


Note the i in each tuple: it records the position of the tuple within the whole set of task tuples. Through it, the _handle_results thread can fill the results of the worker processes into the MapResult object in the correct order.








Note that only one put call is made, placing the 16 tuples on the task queue as one whole sequence. Will the _task_handler thread pass the entire task sequence into _inqueue the way the apply_async method does? That would cause a single worker process in the pool to grab the whole task sequence, rather than having multiple processes share the work. Let's look at how the _task_handler thread handles it:





def _handle_tasks(taskqueue, put, outqueue, pool, cache):
    thread = threading.current_thread()

    for taskseq, set_length in iter(taskqueue.get, None):
        i = -1
        for i, task in enumerate(taskseq):
            if thread._state:
                debug('task handler found thread._state != RUN')
                break
            try:
                put(task)
            except Exception as e:
                job, ind = task[:2]
                try:
                    cache[job]._set(ind, (False, e))
                except KeyError:
                    pass
        else:
            if set_length:
                debug('doing set_length()')
                set_length(i + 1)
            continue
        break
    else:
        debug('task handler got sentinel')


Note the statement for i, task in enumerate(taskseq): after fetching the task sequence from taskqueue, the _task_handler thread does not put the sequence into _inqueue as a whole. Instead it iterates over the sequence and puts the tasks into _inqueue one at a time, where the task in each iteration is one of the task tuples described above: (result._job, 0, mapstar, ((func, (0, 1, 2, 3, 4, 5, 6, 7)),), {}). Each put then triggers a worker process, which obtains one group of tasks and processes it:

job, i, func, args, kwds = task
try:
    result = (True, func(*args, **kwds))
except Exception, e:
    result = (False, e)
try:
    put((job, i, result))
except Exception as e:
    wrapped = MaybeEncodingError(e, result[1])
    debug("Possible encoding error while sending result: %s" % (
        wrapped))
    put((job, i, (False, wrapped)))




Matching this against the order in which tuples were placed into _inqueue:


(result._job, 0, mapstar, ((func, (0, 1, 2, 3, 4, 5, 6, 7)),), {})

job, i, func, args, kwds = task


As you can see, the mapstar in the tuple plays the role of the callback func here, while ((func, (0, 1, 2, 3, 4, 5, 6, 7)),) and {} become the args and kwds parameters respectively.


Then result = (True, func(*args, **kwds)) is executed.


Let's look at how mapstar is defined:


def mapstar(args):
    return map(*args)


Here mapstar plays the role of the callback func, yet its definition takes only one parameter. When the worker process executes the callback with func(*args, **kwds), there is seemingly one parameter too many; can it still run correctly? The answer is yes: when mapstar is called, kwds is an empty dict, so passing it in as the second unpacked argument does not affect the call. Likewise, a function with no parameters at all, say func_with_none_params, can be called as func_with_none_params(*(), **{}) without any problem: Python simply unpacks the two empty containers into zero arguments.
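This is easy to verify; a two-line sketch showing that unpacking an empty tuple and an empty dict adds no arguments at all:

def func_with_none_params():
    return "called fine"

print(func_with_none_params(*(), **{}))  # the empty *args/**kwargs are ignored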


Seeing this, we understand that after the task arguments are grouped, each group of tasks is invoked through the built-in map function.


When the worker calls put((job, i, result)) to place a group's results into _outqueue, the _handle_results thread takes the result out of _outqueue, finds the MapResult object in the _cache, and _sets its result.











Now let's summarize how the pool's map_async method works. We pass the sequence range(123) to map_async; assuming chunksize is not specified and the CPU has four cores, the method splits the tasks into 16 groups (groups 0 through 14 hold 8 elements each, and the last group holds 3). The grouped tasks are put on the task queue, 16 groups in all, so each process needs to run about 4 times to handle them, each time running the group's 8 tasks sequentially through the built-in map, then putting the results into _outqueue, where the MapResult object is found in the _cache and its run results are _set, waiting for the client to fetch them. Using map_async invokes multiple worker processes to handle the tasks; each worker process, when it finishes running, passes its results into _outqueue, and the _handle_results thread writes the results into the MapResult object. So how do we ensure that the order of the result sequence is consistent with the task argument sequence passed into map_async? Let's look at the MapResult constructor and the implementation of its _set method.





def __init__(self, cache, chunksize, length, callback):
    ApplyResult.__init__(self, cache, callback)
    self._success = True
    self._value = [None] * length
    self._chunksize = chunksize
    if chunksize <= 0:
        self._number_left = 0
        self._ready = True
        del cache[self._job]
    else:
        self._number_left = length // chunksize + bool(length % chunksize)

def _set(self, i, success_result):
    success, result = success_result
    if success:
        self._value[i*self._chunksize:(i+1)*self._chunksize] = result
        self._number_left -= 1
        if self._number_left == 0:
            if self._callback:
                self._callback(self._value)
            del self._cache[self._job]
            self._cond.acquire()
            try:
                self._ready = True
                self._cond.notify()
            finally:
                self._cond.release()
    else:
        self._success = False
        self._value = result
        del self._cache[self._job]
        self._cond.acquire()
        try:
            self._ready = True
            self._cond.notify()
        finally:
            self._cond.release()




In the MapResult class, _value holds the run result of map_async; at initialization it is a list of None elements with the same length as the task argument sequence. _chunksize records how many tasks each group holds after grouping, and _number_left records how many groups the whole task sequence was divided into. The _handle_results thread saves the worker processes' run results into _value through the _set method. How does it fill each result into the correct position in _value? Remember the i recorded for each group when map_async filled the task queue: i is the group number of the current task group, and the _set method uses that parameter i to compute the slice of _value the group's results belong in, decrementing _number_left each time. When _number_left reaches 0, every task in the task argument sequence has been processed by the worker processes and _value is fully computed, so the condition variable that the get method blocks on is notified, and the client can obtain the run results.





The map function is the blocking version of map_async: on top of map_async it calls the get method, blocking until all the results are returned:





def map(self, func, iterable, chunksize=None):
    assert self._state == RUN
    return self.map_async(func, iterable, chunksize).get()





In this section we analyzed the two pairs of interfaces that assign tasks to the pool: apply/apply_async and map/map_async. The apply methods handle one task per call, and different tasks may use different functions (callbacks) and arguments; the map methods handle a whole sequence of tasks per call, with every task executing the same function.
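A small usage sketch contrasting the two map variants (square is an invented task function):

import multiprocessing

def square(x):
    return x * x

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)
    # map: one call assigns a whole sequence of tasks, same function for each
    print(pool.map(square, range(5)))   # [0, 1, 4, 9, 16]
    # map_async: the non-blocking version returns a MapResult immediately
    res = pool.map_async(square, range(123))
    print(res.get()[:5])                # results come back in input order
    pool.close()
    pool.join()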








Now let's look at what we can learn from the standard library's implementation of the process pool. We know the pool is composed of multiple threads cooperating with one another to provide reliable service to clients, so how do these threads share data and synchronize? When a client uses the apply/map functions to assign tasks to the pool, self._taskqueue holds the task elements. It is defined as Queue.Queue(), a thread-safe synchronized queue from the Python standard library that guarantees only one thread at a time can add elements to or fetch elements from the queue. That way, the main thread assigns tasks to the pool (taskqueue.put) while the pool's _handle_tasks thread reads the elements in the _taskqueue queue, and the two threads operate on taskqueue at the same time without interfering with each other. When there are N worker processes waiting for tasks, how can the _handle_tasks thread hand out the tasks it reads while guaranteeing that one task is not grabbed by several worker processes? Let's see how the _handle_tasks thread passes tasks to the workers after reading them:





for taskseq, set_length in iter(taskqueue.get, None):
    i = -1
    for i, task in enumerate(taskseq):
        if thread._state:
            debug('task handler found thread._state != RUN')
            break
        try:
            put(task)
        except Exception as e:
            job, ind = task[:2]
            try:
                cache[job]._set(ind, (False, e))
            except KeyError:
                pass
    else:
        if set_length:
            debug('doing set_length()')
            set_length(i + 1)
        continue
    break
else:
    debug('task handler got sentinel')


After getting a task sequence from taskqueue, the thread invokes the put function for each task in the sequence; put actually writes the task into a pipe, and the interaction between the main process and the worker processes is done through pipes.


Let's look at the definition of the worker process:


w = self.Process(target=worker,
                 args=(self._inqueue, self._outqueue,
                       self._initializer,
                       self._initargs, self._maxtasksperchild)
                 )


Here self._inqueue and self._outqueue are SimpleQueue() objects, which are in fact pipes with locks, and the put function called by the _handle_tasks thread is a method of this SimpleQueue object. Since every worker process is defined identically, all the workers in the pool share the self._inqueue and self._outqueue objects. So when a task element is put into the shared _inqueue pipe, how do we make sure only one worker gets it? The answer, again, is a lock: in the definition of the SimpleQueue class, both the put and get methods are locked for synchronization, the only difference being that these locks synchronize between processes. This keeps the handout of tasks synchronized across multiple workers. Result delivery is similar to task assignment: when a worker process finishes running, it puts the result into _outqueue, which is the same kind of SimpleQueue object, mutually exclusive across multiple processes.
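The idea behind SimpleQueue can be modeled in a few lines: a pipe whose two ends are each protected by an inter-process lock. This is only a simplified sketch of the concept, not the library's actual implementation:

import multiprocessing

class LockedPipe(object):
    """Simplified model of SimpleQueue: a pipe plus inter-process locks."""
    def __init__(self):
        self._reader, self._writer = multiprocessing.Pipe(duplex=False)
        self._rlock = multiprocessing.Lock()
        self._wlock = multiprocessing.Lock()

    def get(self):
        with self._rlock:   # only one process may read at a time
            return self._reader.recv()

    def put(self, obj):
        with self._wlock:   # only one process may write at a time
            self._writer.send(obj)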











After a worker process finishes running, its execution result travels back through the pipe into the process pool, where the _handle_results thread is responsible for receiving it; after taking the result off the pipe, the thread writes it back into the ApplyResult/MapResult object by calling the _set method. The client can fetch the result through the get method, which synchronizes using a condition variable: after the _set function executes, it wakes up the main process blocked in get through that condition variable.
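The cooperation between _set and get can be modeled with a condition variable in a few lines (a simplified sketch of the ApplyResult idea, not the library code):

import threading

class SimpleResult(object):
    """Simplified model of ApplyResult: _set stores the value and wakes
    up any get() call blocked on the condition variable."""
    def __init__(self):
        self._cond = threading.Condition()
        self._ready = False
        self._value = None

    def get(self, timeout=None):
        with self._cond:
            if not self._ready:
                self._cond.wait(timeout)
            return self._value

    def _set(self, value):
        with self._cond:
            self._value = value
            self._ready = True
            self._cond.notify()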





Process pool termination is accomplished by calling Pool.terminate(). The implementation here is ingenious: a callable object is used to register the callback needed to terminate the pool, and when termination is required, that object is simply called:








self._terminate = Finalize(
    self, self._terminate_pool,
    args=(self._taskqueue, self._inqueue, self._outqueue, self._pool,
          self._worker_handler, self._task_handler,
          self._result_handler, self._cache),
    exitpriority=15
)


The Finalize class implements the __call__ method, so when self._terminate() runs, it invokes the self._terminate_pool callback that was passed in when self._terminate was constructed.
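The trick of making an object callable through __call__ is easy to reproduce; a minimal sketch (Finalizer and terminate_pool are invented names, not the library's Finalize class):

class Finalizer(object):
    """Store a callback and its arguments; run them when called."""
    def __init__(self, callback, args=(), kwargs=None):
        self._callback = callback
        self._args = args
        self._kwargs = kwargs or {}

    def __call__(self):
        return self._callback(*self._args, **self._kwargs)

def terminate_pool(name):
    print("terminating %s" % name)

cleanup = Finalizer(terminate_pool, args=('demo-pool',))
cleanup()   # prints: terminating demo-pool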





A generator expression is used when the map/map_async functions bulk-assign tasks to the process pool:





self._taskqueue.put((((result._job, i, mapstar, (x,), {})
                      for i, x in enumerate(task_batches)), None))


Generator expressions are simple: just replace a list comprehension's square brackets with parentheses. The list-comprehension form of the expression above would be:


[(result._job, i, mapstar, (x,), {}) for i, x in enumerate(task_batches)]


The advantage of using a generator expression here is that it is friendly to memory: unlike a list comprehension, which expands everything up front, it creates only a generator object, and the target data is produced by the generator's logic only when it is actually needed, so it does not occupy a large amount of memory.
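The memory difference is easy to observe; a small sketch comparing the size of the two objects (exact numbers vary by interpreter):

import sys

squares_list = [x * x for x in range(1000000)]  # materializes every element now
squares_gen = (x * x for x in range(1000000))   # produces elements on demand

print(sys.getsizeof(squares_list))  # on the order of megabytes
print(sys.getsizeof(squares_gen))   # a small, constant-size generator object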





In the pool, the _worker_handler thread is responsible for monitoring and creating new worker processes, and while monitoring it must remove exited processes from the pool's process list. This resembles the problem of deleting from a list while iterating over it. Consider the following code:








>>> L = [1, 2, 3, 3, 4, 4, 4, 5]


>>> for I in L:


If I in [3, 4, 5]:


L.remove (i)








>>> L


[1, 2, 3, 4, 5]











We see that L did not have all of its 3s and 4s removed, because each remove changes the size of L while the iteration position keeps advancing. Now look at the following attempt:








>>> L = [1, 2, 3, 3, 4, 4, 4, 5]


>>> for I in Range (Len (l)):


If L[i] in [3, 4]:


Del L[i]











Traceback (most recent call last):


File "<pyshell#37>", line 2, in <module>


If L[i] in [3, 4]:


Indexerror:list index out of range


>>>











Again, because del L[i] changes the size of L, continued access eventually runs past the end of the list. The process pool in the standard library gives a correct example of removing elements while traversing:





for i in reversed(range(len(self._pool))):
    worker = self._pool[i]
    if worker.exitcode is not None:
        worker.join()
        cleaned = True
        del self._pool[i]





Deleting elements from the back of the list forward with reversed guarantees that every element that qualifies for deletion is removed:








>>> L = [1, 2, 3, 3, 4, 4, 4, 5]


>>> for I in Reversed (range (Len (l))):


If L[i] in [3, 4, 5]:


Del L[i]








>>> L


[1, 2]











As we can see, even a module as small as pool has many places worth learning from. Whether in Python or another language, reading standard library code is a good way to improve your skills. We use these libraries every day without knowing the secrets inside them; reading more source and understanding its technical implementation bears out what Hou Jie wrote in "The Annotated STL Sources": under the source code, there are no secrets. More important still is applying these elegant and efficient coding techniques in your own work, so that your own code can be as graceful as the code in the standard library. That is something every developer pursues.

