Concurrent Python programming: thread pools and process pools

Introduction

The Python standard library provides the threading and multiprocessing modules for writing multi-threaded and multi-process code. However, once a project reaches a certain scale, frequently creating and destroying threads or processes consumes a lot of resources, and at that point we would have to write our own thread pool or process pool to trade space for time. Fortunately, starting with Python 3.2, the standard library provides the concurrent.futures module. It offers two classes, ThreadPoolExecutor and ProcessPoolExecutor, which further abstract threading and multiprocessing and provide direct support for writing thread pools and process pools.

Executor and Future

The foundation of the concurrent.futures module is Executor. Executor is an abstract class and cannot be used directly, but it provides two very useful subclasses, ThreadPoolExecutor and ProcessPoolExecutor, which, as their names suggest, are used to create thread pools and process pools respectively. We can submit tasks directly to a thread pool or process pool without maintaining our own queue or worrying about deadlocks; the pool schedules everything for us automatically.

The concept of a Future will be familiar to anyone with programming experience in Java or Node.js: you can think of it as an operation that will complete in the future, and it is the basis of asynchronous programming. In the traditional programming model, an operation such as queue.get blocks until the result is returned, and the CPU cannot be freed to do other things. The introduction of Future lets us carry out other work while we wait. For more on asynchronous IO in Python, see the article on coroutines/asynchronous IO in Python concurrent programming.
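As a small illustration of this non-blocking style (slow_square and the delay are invented for the example), a Future can also notify us through a callback instead of being polled, so the main thread stays free while the task runs:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_square(n):
    time.sleep(0.1)
    return n * n

# Instead of blocking on result(), attach a callback that fires on
# completion, leaving the main thread free to do other work meanwhile.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_square, 6)
    future.add_done_callback(lambda f: print("callback got", f.result()))
    print("main thread keeps working while the task runs")
# leaving the with-block waits for the pool's tasks to finish
```

Exiting the with-block shuts the pool down, so by that point the future is guaranteed to be done.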

P.S. If you are still sticking to Python 2.x, install the futures backport module first:

pip install futures

Use submit to operate the thread pool/process pool

We will first use the following code to understand the concept of a thread pool.

# example1.py
from concurrent.futures import ThreadPoolExecutor
import time

def return_future_result(message):
    time.sleep(2)
    return message

pool = ThreadPoolExecutor(max_workers=2)  # create a thread pool that runs at most 2 tasks at once
future1 = pool.submit(return_future_result, "hello")  # add a task to the thread pool
future2 = pool.submit(return_future_result, "world")  # add a task to the thread pool
print(future1.done())  # check whether task 1 has finished
time.sleep(3)
print(future2.done())  # check whether task 2 has finished
print(future1.result())  # view the result returned by task 1
print(future2.result())  # view the result returned by task 2
Let's analyze this based on the output. We use the submit method to add a task to the thread pool, and submit returns a Future object, which we can simply think of as an operation to be completed in the future. At the first print statement, time.sleep(2) has obviously not finished yet; because we then suspend the main thread with time.sleep(3), by the second print statement all tasks in the thread pool have completed.

ziwenxie ::~ » python example1.py
False
True
hello
world
# while the program above runs, the ps command shows three threads running in the background
ziwenxie ::~ » ps -eLf | grep python
ziwenxie 8361 7557 8361 3 3 00:00:00 pts/0 python example1.py
ziwenxie 8361 7557 8362 0 3 00:00:00 pts/0 python example1.py
ziwenxie 8361 7557 8363 0 3 00:00:00 pts/0 python example1.py
The code above can also be rewritten in process-pool form; the API is exactly the same as the thread pool's, so we won't repeat the explanation.

# example2.py
from concurrent.futures import ProcessPoolExecutor
import time

def return_future_result(message):
    time.sleep(2)
    return message

pool = ProcessPoolExecutor(max_workers=2)
future1 = pool.submit(return_future_result, "hello")
future2 = pool.submit(return_future_result, "world")
print(future1.done())
time.sleep(3)
print(future2.done())
print(future1.result())
print(future2.result())
The running result is as follows:

ziwenxie ::~ » python example2.py
False
True
hello
world
ziwenxie ::~ » ps -eLf | grep python
ziwenxie 8560 7557 8560 3 3 00:00:00 pts/0 python example2.py
ziwenxie 8560 7557 8563 0 3 00:00:00 pts/0 python example2.py
ziwenxie 8560 7557 8564 0 3 00:00:00 pts/0 python example2.py
ziwenxie 8561 8560 8561 0 1 00:00:00 pts/0 python example2.py
ziwenxie 8562 8560 8562 0 1 00:00:00 pts/0 python example2.py

Use map/wait to operate thread pool/process pool

In addition to submit, Executor also provides a map method, which is similar to the built-in map. The following two examples compare the differences between the two.

Review of submit operations

# example3.py
import concurrent.futures
import urllib.request

URLS = ['http://httpbin.org', 'http://example.com/', 'https://api.github.com/']

def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
From the output, we can see that as_completed does not yield futures in the order of the URLS list; it yields them as they finish.

ziwenxie ::~ » python example3.py
'http://example.com/' page is 1270 bytes
'https://api.github.com/' page is 2039 bytes
'http://httpbin.org' page is 12150 bytes

Use map

# example4.py
import concurrent.futures
import urllib.request

URLS = ['http://httpbin.org', 'http://example.com/', 'https://api.github.com/']

def load_url(url):
    with urllib.request.urlopen(url, timeout=60) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    for url, data in zip(URLS, executor.map(load_url, URLS)):
        print('%r page is %d bytes' % (url, len(data)))
From the output, we can see that map returns results in the order of the elements in the URLS list, and the code is more concise and intuitive. You can choose between the two based on your specific needs.

ziwenxie ::~ » python example4.py
'http://httpbin.org' page is 12150 bytes
'http://example.com/' page is 1270 bytes
'https://api.github.com/' page is 2039 bytes

Use wait

The wait method returns a named tuple containing two sets: one of completed futures and one of uncompleted futures. One advantage of wait is that it gives you more freedom: its return_when parameter accepts three values, FIRST_COMPLETED, FIRST_EXCEPTION, and ALL_COMPLETED, with ALL_COMPLETED as the default.

The following example illustrates the difference between these parameters.

# example5.py
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED
from time import sleep
from random import randint

def return_after_random_secs(num):
    sleep(randint(1, 5))
    return "Return of {}".format(num)

pool = ThreadPoolExecutor(5)
futures = []
for x in range(5):
    futures.append(pool.submit(return_after_random_secs, x))

print(wait(futures))
# print(wait(futures, timeout=None, return_when=FIRST_COMPLETED))
With the default ALL_COMPLETED, the program blocks until all tasks in the thread pool have completed:

ziwenxie ::~ » python example5.py
DoneAndNotDoneFutures(done={
    <Future at 0x... state=finished returned str>,
    <Future at 0x... state=finished returned str>,
    <Future at 0x... state=finished returned str>,
    <Future at 0x... state=finished returned str>,
    <Future at 0x... state=finished returned str>
}, not_done=set())
With FIRST_COMPLETED, the program does not wait for all tasks in the thread pool to complete:

ziwenxie ::~ » python example5.py
DoneAndNotDoneFutures(done={
    <Future at 0x... state=finished returned str>,
    <Future at 0x... state=finished returned str>,
    <Future at 0x... state=finished returned str>
}, not_done={
    <Future at 0x... state=running>,
    <Future at 0x... state=running>
})
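The examples above cover ALL_COMPLETED and FIRST_COMPLETED; the third option, FIRST_EXCEPTION, makes wait return as soon as any future raises (or when all complete, if none raises). A minimal sketch, in which may_fail and its delays are assumptions for the example:

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_EXCEPTION
import time

def may_fail(n):
    time.sleep(0.1 * n)
    if n == 1:
        raise ValueError("task %d failed" % n)
    return n

# wait returns as soon as one future raises; the remaining futures
# keep running and land in the not_done set.
pool = ThreadPoolExecutor(2)
futures = [pool.submit(may_fail, n) for n in range(4)]
done, not_done = wait(futures, return_when=FIRST_EXCEPTION)
print(len(done), len(not_done))
pool.shutdown()  # wait for the leftover tasks before exiting
```

Note that the raised exception is not re-raised by wait itself; it stays stored in the future and surfaces when you call result() or exception() on it.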

Questions

Write a small program to compare the execution efficiency of multiprocessing.Pool (and its ThreadPool) against ProcessPoolExecutor (and ThreadPoolExecutor), and use the discussion of Future above to think about why you get the results you do.
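One possible starting point for the exercise is a simple timing harness; the workload, pool size, and task count below are arbitrary choices you should vary yourself:

```python
import time
from multiprocessing.pool import ThreadPool
from concurrent.futures import ThreadPoolExecutor

def work(n):
    # a small CPU-bound workload, chosen arbitrarily for the comparison
    return sum(i * i for i in range(n))

def timed(label, fn):
    start = time.time()
    fn()
    print('%s: %.3fs' % (label, time.time() - start))

tasks = [10000] * 100

# multiprocessing.pool.ThreadPool
with ThreadPool(4) as p:
    timed('ThreadPool.map', lambda: p.map(work, tasks))

# concurrent.futures.ThreadPoolExecutor
with ThreadPoolExecutor(4) as ex:
    timed('ThreadPoolExecutor.map', lambda: list(ex.map(work, tasks)))
```

The same harness can be pointed at Pool and ProcessPoolExecutor to compare the process-based variants.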

That covers the details of thread pools and process pools in Python concurrent programming.
