As we all know, Python's reputation for running tasks in parallel is not great. Setting aside the usual arguments about threads and the GIL (most of which are legitimate), I think the real culprit is not the technology but the way we use it. Most tutorials on Python threading and multiprocessing are excellent, but the content is tedious: they front-load a lot of useful information and rarely get to the parts that actually improve day-to-day work.
Classic examples
The top results on DDG for the keyword "python threading tutorial" show that almost every article presents the same class-plus-Queue example.
In fact, they are all variations of the following producer/consumer code for handling threads and processes:
# Example.py
'''Standard Producer/Consumer Threading Pattern'''

import time
import threading
import Queue

class Consumer(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self._queue = queue

    def run(self):
        while True:
            # queue.get() blocks the current thread until
            # an item is retrieved.
            msg = self._queue.get()
            # Checks if the current message is
            # the "poison pill"
            if isinstance(msg, str) and msg == 'quit':
                # if so, exit the loop
                break
            # "Processes" (or in our case, prints) the queue item
            print "I'm a thread, and I received %s!!" % msg
        # Always be friendly!
        print 'Bye byes!'

def Producer():
    # Queue is used to share items between
    # the threads.
    queue = Queue.Queue()

    # Create an instance of the worker
    worker = Consumer(queue)

    # start calls the internal run() method to
    # kick off the thread
    worker.start()

    # variable to keep track of when we started
    start_time = time.time()

    # While under 5 seconds..
    while time.time() - start_time < 5:
        # "Produce" a piece of work and stick it in
        # the queue for the Consumer to process
        queue.put('something at %s' % time.time())
        # Sleep a bit just to avoid an absurd number of messages
        time.sleep(1)

    # This is the "poison pill" method of killing a thread.
    queue.put('quit')

    # Wait for the thread to close down
    worker.join()

if __name__ == '__main__':
    Producer()
Well... it feels a bit like Java.
I'm not saying that using the producer/consumer pattern for threading or multiprocessing is wrong; it absolutely works, and in many cases it is the best approach. I just don't think it's the best choice for everyday code.
The problems (personal opinion)
First, you need a boilerplate worker class. Then you need a queue to pass objects through, plus code at both ends of the queue to supervise it. (And if you want to exchange or store results, that usually means involving another queue as well.)
The more workers you need, the more of this plumbing you have to manage.
Next, you would probably build a pool of those worker classes to squeeze some speed out of Python. Below is a better version, adapted from an IBM tutorial; it is the common pattern of fetching web pages with multiple threads.
# Example2.py
'''A more realistic thread pool example'''

import time
import threading
import Queue
import urllib2

class Consumer(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self._queue = queue

    def run(self):
        while True:
            content = self._queue.get()
            if isinstance(content, str) and content == 'quit':
                break
            response = urllib2.urlopen(content)
        print 'Bye byes!'

def Producer():
    urls = [
        'http://www.python.org', 'http://www.yahoo.com',
        'http://www.scala.org', 'http://www.google.com',
        # etc..
    ]
    queue = Queue.Queue()
    worker_threads = build_worker_pool(queue, 4)
    start_time = time.time()

    # Add the urls to process
    for url in urls:
        queue.put(url)
    # Add the poison pill
    for worker in worker_threads:
        queue.put('quit')
    for worker in worker_threads:
        worker.join()

    print 'Done! Time taken: {}'.format(time.time() - start_time)

def build_worker_pool(queue, size):
    workers = []
    for _ in range(size):
        worker = Consumer(queue)
        worker.start()
        workers.append(worker)
    return workers

if __name__ == '__main__':
    Producer()
It does work, but look how complicated the code is! It involves initialization methods, a list for tracking the threads, and (for those of us as prone to deadlock as I am) a nightmare of join statements. And that is just the tedious beginning!
What have we actually accomplished so far? Basically nothing. The code above is almost pure plumbing: it is a very basic approach, it is easy to get wrong (damn, I forgot to call task_done() on the queue object again, and I'm too lazy to fix it), and the payoff is low. Fortunately, there is a better way.
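As an aside, here is a minimal sketch (my own, not from any of the tutorials above, with made-up item values) of the task_done() bookkeeping that the Queue-based pattern expects: one task_done() call per get(), and a queue.join() that blocks until all of them have arrived.

import threading
import Queue

def worker(queue):
    while True:
        item = queue.get()
        if item is None:
            # poison pill: acknowledge it, then stop
            queue.task_done()
            break
        print 'processing %s' % item
        queue.task_done()   # must be called exactly once per get()

queue = Queue.Queue()
t = threading.Thread(target=worker, args=(queue,))
t.start()

for i in range(5):
    queue.put(i)
queue.put(None)

queue.join()   # blocks until every put() has been matched by a task_done()
t.join()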
Introduction: Map
Map is a wonderful little function, and it is the key to running Python code in parallel with ease. For those unfamiliar with it, map comes from functional languages like Lisp: it applies a function to each element of a sequence. For example:
urls = ['http://www.yahoo.com', 'http://www.reddit.com']
results = map(urllib2.urlopen, urls)
This calls urlopen on each URL in turn and collects the results in a list. It is roughly equivalent to:
results = []
for url in urls:
    results.append(urllib2.urlopen(url))
map walks through the sequence, applies the function to each element, and hands back a simple list with the results stored in the same order.
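If map is new to you, here is a tiny, purely sequential illustration (my own example, not part of the benchmarks below):

def square(x):
    return x * x

numbers = [1, 2, 3, 4]
print map(square, numbers)   # [1, 4, 9, 16], same order as the input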
Why is that so powerful? Because with the right library, map makes running things in parallel almost effortless!
Two libraries support parallelism through the map function: one is multiprocessing, and the other is its little-known but powerful submodule, multiprocessing.dummy.
Digression: what's that? You've never heard of the dummy module? Neither had I until recently. It gets barely a mention in the multiprocessing documentation, which is probably the only way you would ever learn it exists. I'd say it is drastically undersold!
Dummy is a clone of the multiprocessing module. The only difference is that multiprocessing works with processes, while dummy uses threads (and therefore comes with all the usual Python limitations). So anything that applies to one applies to the other, which makes it very easy to hop back and forth between the two, especially in exploratory code where you don't yet know whether a call will be IO-bound or CPU-bound.
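To make the "clone" point concrete, here is a small sketch of my own (the fetch helper and the URL list are just placeholders) showing that the two pools expose the same map interface, so switching between processes and threads is a one-line change:

from multiprocessing import Pool                       # process-based
from multiprocessing.dummy import Pool as ThreadPool   # thread-based
import urllib2

def fetch(url):
    return urllib2.urlopen(url).read()

urls = ['http://www.python.org', 'http://www.python.org/about/']

if __name__ == '__main__':
    # IO-bound work: a thread pool is usually enough.
    pool = ThreadPool(4)
    results = pool.map(fetch, urls)
    pool.close()
    pool.join()

    # CPU-bound work: swap in the process pool; the map call is identical.
    pool = Pool(4)
    results = pool.map(fetch, urls)
    pool.close()
    pool.join()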
Ready to start
To accomplish parallelism through map, first import the modules that provide it:
from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadPool
Then instantiate a Pool object:
pool = ThreadPool()
This single statement replaces all the work of the build_worker_pool function from example2.py. In other words, it creates a number of workers, starts them so they are ready to do work, and stores them all in one place for easy access.
The Pool object takes a handful of parameters, but the most important one is processes: it sets the number of workers in the pool. If you leave it out, it defaults to the number of cores on your machine.
For CPU-bound work with a multiprocessing pool, more cores generally means more speed (though many other factors play a part). For threads, or for network-bound work, things are harder to predict, so it is worth working out the right pool size experimentally.
pool = ThreadPool(4)  # Sets the pool size to 4
If you run too many threads, the time spent switching between them can outweigh the useful work, so it pays to experiment until you find the pool size best suited to your task.
Now that we have created a pool object, a simple parallel program is within reach, so let's rewrite the URL opener from example2.py!
import urllib2
from multiprocessing.dummy import Pool as ThreadPool

urls = [
    'http://www.python.org',
    'http://www.python.org/about/',
    'http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html',
    'http://www.python.org/doc/',
    'http://www.python.org/download/',
    'http://www.python.org/getit/',
    'http://www.python.org/community/',
    'https://wiki.python.org/moin/',
    'http://planet.python.org/',
    'https://wiki.python.org/moin/LocalUserGroups',
    'http://www.python.org/psf/',
    'http://docs.python.org/devguide/',
    'http://www.python.org/community/awards/'
    # etc...
    ]

# Make the pool of workers
pool = ThreadPool(4)

# Open the urls in their own threads
# and return the results
results = pool.map(urllib2.urlopen, urls)

# Close the pool and wait for the work to finish
pool.close()
pool.join()
Look at that! This time the code does all the work in just 4 lines, 3 of which are simple bookkeeping. The map call handles everything our previous 40-line example did! To show the difference between the two approaches more vividly, I also timed each of them.
results = []
for url in urls:
    result = urllib2.urlopen(url)
    results.append(result)

# ------- versus ------- #

# ------- 4 Pool ------- #
pool = ThreadPool(4)
results = pool.map(urllib2.urlopen, urls)

# ------- 8 Pool ------- #
pool = ThreadPool(8)
results = pool.map(urllib2.urlopen, urls)

# ------- Pool ------- #
pool = ThreadPool()
results = pool.map(urllib2.urlopen, urls)

Results:

# Single thread: 14.4 seconds
#        4 Pool:  3.1 seconds
#        8 Pool:  1.4 seconds
#          Pool:  1.3 seconds
Pretty good! It also shows why it pays to experiment with the pool size: here, pushing past about 9 workers only squeezed out a little more speed.
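If you want to reproduce this kind of comparison on your own machine, a small timing loop along these lines is usually enough (a sketch of my own; the URL list is just a placeholder):

import time
import urllib2
from multiprocessing.dummy import Pool as ThreadPool

urls = ['http://www.python.org', 'http://www.python.org/about/',
        'http://www.python.org/doc/', 'http://planet.python.org/']

def timed_run(size):
    # time a single pool.map pass with the given number of workers
    pool = ThreadPool(size)
    start = time.time()
    pool.map(urllib2.urlopen, urls)
    pool.close()
    pool.join()
    return time.time() - start

for size in [1, 4, 8, 16]:
    print '%2d workers: %.2f seconds' % (size, timed_run(size))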
Example 2: Generating thousands of thumbnails
Now for something CPU-bound! At work I often have to deal with huge folders of images, and one of those tasks is creating thumbnails. It is a well-established candidate for parallelism.
The basic single-threaded version:
import os
import PIL

from multiprocessing import Pool
from PIL import Image

SIZE = (75, 75)
SAVE_DIRECTORY = 'thumbs'

def get_image_paths(folder):
    return (os.path.join(folder, f)
            for f in os.listdir(folder)
            if 'jpeg' in f)

def create_thumbnail(filename):
    im = Image.open(filename)
    im.thumbnail(SIZE, Image.ANTIALIAS)
    base, fname = os.path.split(filename)
    save_path = os.path.join(base, SAVE_DIRECTORY, fname)
    im.save(save_path)

if __name__ == '__main__':
    folder = os.path.abspath(
        '11_18_2013_r000_iqm_big_sur_mon__e10d1958e7b766c3e840')
    os.mkdir(os.path.join(folder, SAVE_DIRECTORY))

    images = get_image_paths(folder)

    for image in images:
        create_thumbnail(image)
As an example this is a little more involved, but in essence the script is handed a folder, grabs every image inside it, and creates thumbnails, storing them in a thumbs subdirectory.
On my machine this takes 27.9 seconds to process about 6,000 images.
If we replace the for loop with a parallel map call:
import os
import PIL

from multiprocessing import Pool
from PIL import Image

SIZE = (75, 75)
SAVE_DIRECTORY = 'thumbs'

def get_image_paths(folder):
    return (os.path.join(folder, f)
            for f in os.listdir(folder)
            if 'jpeg' in f)

def create_thumbnail(filename):
    im = Image.open(filename)
    im.thumbnail(SIZE, Image.ANTIALIAS)
    base, fname = os.path.split(filename)
    save_path = os.path.join(base, SAVE_DIRECTORY, fname)
    im.save(save_path)

if __name__ == '__main__':
    folder = os.path.abspath(
        '11_18_2013_r000_iqm_big_sur_mon__e10d1958e7b766c3e840')
    os.mkdir(os.path.join(folder, SAVE_DIRECTORY))

    images = get_image_paths(folder)

    pool = Pool()
    pool.map(create_thumbnail, images)
    pool.close()
    pool.join()
5.6 seconds!
That is a big speedup for changing only a few lines of code. This approach can be even faster if you run the CPU-bound and IO-bound work in separate process and thread pools (a rough sketch follows), although doing so also makes deadlocks easier to introduce. All in all, given how practical map is and how little manual thread management it needs, I think it is an elegant, reliable, and easy-to-debug approach.
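For what it's worth, here is that rough sketch (my own; download and make_thumbnail are hypothetical stand-ins, not the functions used above) of splitting the IO-bound and CPU-bound halves across a thread pool and a process pool:

from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadPool
import urllib2

def download(url):
    # IO-bound: fetch the raw image bytes
    return urllib2.urlopen(url).read()

def make_thumbnail(data):
    # CPU-bound: stand-in for the real PIL resizing work
    return data[:100]

if __name__ == '__main__':
    image_urls = ['http://www.python.org/static/img/python-logo.png']  # placeholder

    io_pool = ThreadPool(8)          # threads for the downloads
    raw_images = io_pool.map(download, image_urls)
    io_pool.close()
    io_pool.join()

    cpu_pool = Pool()                # one process per core for the resizing
    thumbs = cpu_pool.map(make_thumbnail, raw_images)
    cpu_pool.close()
    cpu_pool.join()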
OK, that's the end of the article: a single line of map completes a parallel task.