Introduction
The Python standard library provides the threading and multiprocessing modules for writing multi-threaded and multi-process code. However, once a project reaches a certain scale, frequently creating and destroying threads or processes consumes a lot of resources, and at that point we would have to write our own thread pool or process pool to trade space for time. Starting with Python 3.2, the standard library provides the concurrent.futures module, which offers two classes: ThreadPoolExecutor and ProcessPoolExecutor. They further abstract threading and multiprocessing and give us direct support for writing thread pools and process pools.
Executor and Future
The foundation of the concurrent.futures module is Executor. Executor is an abstract class and cannot be used directly, but it has two very useful subclasses, ThreadPoolExecutor and ProcessPoolExecutor. As the names suggest, they are used to create thread pools and process pools respectively. We can submit tasks directly to the pool without maintaining a queue ourselves or worrying about deadlocks; the pool schedules the work for us automatically.
Future should be a familiar concept to anyone with programming experience in Java or Node.js; you can think of it as an operation that will be completed in the future. This is the basis of asynchronous programming. In the traditional programming model, calling queue.get, for example, blocks until the result comes back, and the CPU cannot be freed to do other things. Introducing Future lets us get other work done while we wait. For more on asynchronous IO in Python, see the coroutine/asynchronous IO article in this Python concurrent programming series.
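As a rough illustration of this idea (the helper names here are made up for the example, not part of the series' code), a Future lets you register a callback with add_done_callback instead of blocking on the result:

from concurrent.futures import ThreadPoolExecutor
import time

def slow_square(n):
    time.sleep(1)
    return n * n

def on_done(future):
    # called automatically once the task has finished
    print("result ready:", future.result())

pool = ThreadPoolExecutor(max_workers=1)
future = pool.submit(slow_square, 3)
future.add_done_callback(on_done)      # do not block; register a callback instead
print("main thread keeps working...")  # printed before "result ready: 9"
pool.shutdown(wait=True)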
P.S.: If you are still stuck on Python 2.x, install the futures backport module first.
pip install futures
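A minimal availability check, assuming you want one code path for both interpreter versions (the error message below is only illustrative):

# works on Python 3.2+ out of the box, and on Python 2.x once the "futures" backport is installed
try:
    from concurrent import futures
except ImportError:
    raise SystemExit("run `pip install futures` first (Python 2.x only)")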
Use submit to operate the thread pool/process pool
We will first use the following code to understand the concept of a thread pool.
# example1.py
from concurrent.futures import ThreadPoolExecutor
import time

def return_future_result(message):
    time.sleep(2)
    return message

pool = ThreadPoolExecutor(max_workers=2)  # create a thread pool that can accommodate up to 2 tasks
future1 = pool.submit(return_future_result, ("hello"))  # add a task to the thread pool
future2 = pool.submit(return_future_result, ("world"))  # add a task to the thread pool
print(future1.done())  # check whether task 1 has finished
time.sleep(3)
print(future2.done())  # check whether task 2 has finished
print(future1.result())  # view the result returned by task 1
print(future2.result())  # view the result returned by task 2
Let's analyze this with the help of the output. We add a task to the thread pool with the submit method, and submit returns a Future object, which can be thought of simply as an operation that will be completed in the future. At the first print statement, time.sleep(2) has obviously not finished yet, so future1.done() is False; because we then suspend the main thread with time.sleep(3), by the second print statement every task in the thread pool has completed.
ziwenxie :: ~ » python example1.py
False
True
hello
world

# while the program above is running, we can see that three threads are running at the same time
ziwenxie :: ~ » ps -eLf | grep python
ziwenxie 8361 7557 8361 3 3 pts/0 00:00:00 python example1.py
ziwenxie 8361 7557 8362 0 3 pts/0 00:00:00 python example1.py
ziwenxie 8361 7557 8363 0 3 pts/0 00:00:00 python example1.py
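Worth adding (not shown in example1.py): result() also accepts a timeout, and raises concurrent.futures.TimeoutError if the task has not finished within that many seconds. A small sketch:

from concurrent.futures import ThreadPoolExecutor, TimeoutError
import time

pool = ThreadPoolExecutor(max_workers=1)
future = pool.submit(time.sleep, 5)   # a task that takes about 5 seconds
try:
    future.result(timeout=1)          # give up waiting after 1 second
except TimeoutError:
    print("still running, do something else for now")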
The above code can also be written in process-pool form; the API is exactly the same as the thread pool's, so I won't repeat the explanation.
# example2.py
from concurrent.futures import ProcessPoolExecutor
import time

def return_future_result(message):
    time.sleep(2)
    return message

pool = ProcessPoolExecutor(max_workers=2)
future1 = pool.submit(return_future_result, ("hello"))
future2 = pool.submit(return_future_result, ("world"))
print(future1.done())
time.sleep(3)
print(future2.done())
print(future1.result())
print(future2.result())
The running result is as follows:
ziwenxie :: ~ » python example2.py
False
True
hello
world
ziwenxie :: ~ » ps -eLf | grep python
ziwenxie 8560 7557 8560 3 3 19:53 pts/0 00:00:00 python example2.py
ziwenxie 8560 7557 8563 0 3 19:53 pts/0 00:00:00 python example2.py
ziwenxie 8560 7557 8564 0 3 19:53 pts/0 00:00:00 python example2.py
ziwenxie 8561 8560 8561 0 1 19:53 pts/0 00:00:00 python example2.py
ziwenxie 8562 8560 8562 0 1 19:53 pts/0 00:00:00 python example2.py
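One caveat with ProcessPoolExecutor: on platforms that spawn a fresh interpreter for each worker (Windows, for instance), the pool must be created under an if __name__ == '__main__' guard, or the child processes will re-import the module and fail. A hedged variant of example2.py (the file name example2_safe.py is only illustrative):

# example2_safe.py -- same logic as example2.py, guarded for spawn-based platforms
from concurrent.futures import ProcessPoolExecutor
import time

def return_future_result(message):
    time.sleep(2)
    return message

if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=2) as pool:  # the with-block calls shutdown() for us
        future1 = pool.submit(return_future_result, "hello")
        future2 = pool.submit(return_future_result, "world")
        print(future1.result())
        print(future2.result())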
Use map/wait to operate thread pool/process pool
In addition to submit, Executor also provides a map method, which works like the built-in map. The following two examples compare the two approaches.
Review of submit operations
# example3.py
import concurrent.futures
import urllib.request

URLS = ['http://httpbin.org', 'http://example.com/', 'https://api.github.com/']

def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
From the output we can see that as_completed does not return results in the order of the elements in the URLS list.
ziwenxie :: ~ » python example3.py
'http://example.com/' page is 1270 bytes
'https://api.github.com/' page is 2039 bytes
'http://httpbin.org' page is 12150 bytes
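as_completed also accepts an optional timeout and raises concurrent.futures.TimeoutError if some futures are still unfinished when it expires. A sketch built on the same URLS and load_url as example3.py (the 30-second limit is arbitrary):

import concurrent.futures
import urllib.request

URLS = ['http://httpbin.org', 'http://example.com/', 'https://api.github.com/']

def load_url(url):
    with urllib.request.urlopen(url, timeout=60) as conn:
        return conn.read()

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    future_to_url = {executor.submit(load_url, url): url for url in URLS}
    try:
        # raises TimeoutError if any future is still unfinished after 30 seconds
        for future in concurrent.futures.as_completed(future_to_url, timeout=30):
            print('%r page is %d bytes' % (future_to_url[future], len(future.result())))
    except concurrent.futures.TimeoutError:
        print('some downloads did not finish within 30 seconds')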
Use map
# example4.py
import concurrent.futures
import urllib.request

URLS = ['http://httpbin.org', 'http://example.com/', 'https://api.github.com/']

def load_url(url):
    with urllib.request.urlopen(url, timeout=60) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    for url, data in zip(URLS, executor.map(load_url, URLS)):
        print('%r page is %d bytes' % (url, len(data)))
From the output we can see that map returns results in the order of the elements in the URLS list, and the code is more concise and intuitive. You can choose between the two based on your specific needs.
ziwenxie :: ~ » python example4.py
'http://httpbin.org' page is 12150 bytes
'http://example.com/' page is 1270 bytes
'https://api.github.com/' page is 2039 bytes
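Two details about map worth keeping in mind, sketched below: it takes an optional timeout argument, and any exception raised inside a worker (or the timeout itself) surfaces only when you iterate over the results, not at the executor.map() call:

import concurrent.futures
import urllib.request

URLS = ['http://httpbin.org', 'http://example.com/', 'https://api.github.com/']

def load_url(url):
    with urllib.request.urlopen(url, timeout=60) as conn:
        return conn.read()

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    try:
        # results come back in URLS order; a worker's exception or the timeout
        # is raised here, while iterating, not at the executor.map() call itself
        for url, data in zip(URLS, executor.map(load_url, URLS, timeout=30)):
            print('%r page is %d bytes' % (url, len(data)))
    except concurrent.futures.TimeoutError:
        print('not all pages were fetched within 30 seconds')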
Use wait
The wait method returns a named tuple containing two sets, one of completed futures and one of uncompleted futures. One advantage of wait is the extra freedom it gives you: its return_when parameter accepts three values, FIRST_COMPLETED, FIRST_EXCEPTION, and ALL_COMPLETED, with ALL_COMPLETED as the default.
The following example shows the differences between the three parameters.
# example5.py
from concurrent.futures import ThreadPoolExecutor, wait, as_completed
from time import sleep
from random import randint

def return_after_random_secs(num):
    sleep(randint(1, 5))
    return "Return of {}".format(num)

pool = ThreadPoolExecutor(5)
futures = []
for x in range(5):
    futures.append(pool.submit(return_after_random_secs, x))

print(wait(futures))
# print(wait(futures, timeout=None, return_when='FIRST_COMPLETED'))
If the default ALL_COMPLETED is used, the program blocks until all tasks in the thread pool are completed.
ziwenxie :: ~ » python example5.py
DoneAndNotDoneFutures(done={
    <Future at 0x... state=finished returned str>,
    <Future at 0x... state=finished returned str>,
    <Future at 0x... state=finished returned str>,
    <Future at 0x... state=finished returned str>,
    <Future at 0x... state=finished returned str>
}, not_done=set())
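Since wait returns a named tuple, its two sets can be unpacked and used directly; a short sketch reusing the helpers from example5.py:

from concurrent.futures import ThreadPoolExecutor, wait
from time import sleep
from random import randint

def return_after_random_secs(num):
    sleep(randint(1, 5))
    return "Return of {}".format(num)

pool = ThreadPoolExecutor(5)
futures = [pool.submit(return_after_random_secs, x) for x in range(5)]

done, not_done = wait(futures)             # ALL_COMPLETED by default
for future in done:                        # every future has finished here
    print(future.result())
print("unfinished tasks:", len(not_done))  # 0 with ALL_COMPLETED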
If FIRST_COMPLETED is used instead, wait returns as soon as the first task finishes, without waiting for all the tasks in the thread pool to complete.
ziwenxie :: ~ » python example5.py
DoneAndNotDoneFutures(done={
    <Future at 0x... state=finished returned str>,
    <Future at 0x... state=finished returned str>,
    <Future at 0x... state=finished returned str>
}, not_done={
    <Future at 0x... state=running>,
    <Future at 0x... state=running>
})
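The remaining value, FIRST_EXCEPTION, makes wait return as soon as any task raises an exception (or once everything finishes if nothing raises). A hypothetical sketch, with maybe_fail made up for the example:

from concurrent.futures import ThreadPoolExecutor, wait, FIRST_EXCEPTION
from time import sleep

def maybe_fail(num):
    sleep(num)
    if num == 2:
        raise ValueError("task {} failed".format(num))
    return "Return of {}".format(num)

pool = ThreadPoolExecutor(5)
futures = [pool.submit(maybe_fail, x) for x in range(5)]

# returns once task 2 raises, without waiting for the slower tasks 3 and 4
done, not_done = wait(futures, return_when=FIRST_EXCEPTION)
for future in done:
    if future.exception() is not None:
        print("failed:", future.exception())
    else:
        print("ok:", future.result())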
The above is a detailed introduction to thread pools and process pools in Python concurrent programming. For more information, see the other articles in this series.