Python multi-core parallel computing, with sample code


When I used to write small programs, I didn't care about parallelism at all: running on a single core was fine, and my computer only had a dual-core, four-thread CPU (hereafter simply "cores"), so parallelizing made little sense (unless the task was I/O-intensive). Since moving to a 32-core machine with plenty of memory, I keep seeing a pile of idle cores in htop, and naturally I started thinking that this work really should run in parallel. It turned out that parallel processing in Python is actually very simple.

Multiprocessing vs threading

Python comes with a complete and easy-to-use standard library, which is one of the reasons I particularly like Python. For parallel processing it provides the multiprocessing and threading modules. Using threads seems like the natural idea: intuitively they have low overhead and share memory, and threads are used very heavily in other languages. However, I can say with confidence that if you are on the CPython implementation, using threading means saying goodbye to parallel computing (in fact it can be even slower than a single thread), unless the task is I/O-intensive.

GIL

CPython is the Python implementation distributed by python.org. Yes, Python is a language with multiple implementations, such as PyPy, Jython, IronPython, and so on. CPython is by far the most widely used, to the point that it is almost synonymous with Python.

CPython uses the GIL (Global Interpreter Lock) to simplify the implementation of the interpreter: the interpreter executes bytecode in only one thread at a time. In other words, unless threads are waiting on I/O, CPython's multithreading is a complete lie!
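To see this concretely, here is a minimal timing sketch (my own addition, not from the original article) that runs the same CPU-bound function on a thread pool and on a process pool. multiprocessing.dummy.Pool is a thread-backed pool with the same API as multiprocessing.Pool, so switching between the two is a one-line change; under CPython's GIL, the thread version is typically no faster than a single core, while the process version scales with the number of cores:

import time
from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadPool  # same API, backed by threads

def burn(n):
  # pure-Python busy loop: CPU-bound and holds the GIL the whole time
  total = 0
  for i in range(n):
    total += i * i
  return total

if __name__ == '__main__':
  tasks = [2000000] * 4
  for name, make_pool in [('threads', ThreadPool), ('processes', Pool)]:
    start = time.time()
    pool = make_pool(4)
    pool.map(burn, tasks)
    pool.close()
    pool.join()
    print('%s: %.2f s' % (name, time.time() - start))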

The following two documents about GIL are well written:

  1. http://cenalulu.github.io/python/gil-in-python/
  2. http://www.dabeaz.com/python/UnderstandingGIL.pdf

multiprocessing.Pool

Since the GIL rules out threading, we had better get to know multiprocessing well. (Of course, if you are not on CPython and therefore have no GIL problem, that is great too.)

First, let's introduce a simple, crude, and very practical tool: multiprocessing.Pool. If your task can be written as ys = map(f, xs), then, as we probably all know, this form is inherently the easiest to parallelize, and in Python such parallel computation really is easy. For example, squaring each number:

import multiprocessing

def f(x):
  return x * x

cores = multiprocessing.cpu_count()
pool = multiprocessing.Pool(processes=cores)

xs = range(5)

# method 1: map
print(pool.map(f, xs))        # prints [0, 1, 4, 9, 16]

# method 2: imap
for y in pool.imap(f, xs):
  print(y)                    # 0, 1, 4, 9, 16, respectively

# method 3: imap_unordered
for y in pool.imap_unordered(f, xs):
  print(y)                    # may be in any order

map returns the whole list directly, while the two functions whose names start with "i" return iterators; imap_unordered additionally makes no guarantee about the order of the results.

When each computation takes a while, we may want a progress bar, and this is where the "i" variants pay off. There is also a small trick: printing '\r' moves the cursor back to the beginning of the line without a line break, which is enough for a simple progress bar.

import sys

# pool, f and xs are the same as in the previous example
cnt = 0
for _ in pool.imap_unordered(f, xs):
  cnt += 1
  sys.stdout.write('done %d/%d\r' % (cnt, len(xs)))

More complex operations

For more complex operations, you can use multiprocessing.Process objects directly. For inter-process communication, you can use:

  1. multiprocessing.Pipe (see the sketch after this list)
  2. multiprocessing.Queue
  3. Synchronization primitives (e.g. Lock, Semaphore, Event)
  4. Shared variables (Value, Array)
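As a quick taste of option 1, here is a minimal sketch (my own addition, not from the original article) that sends one value to a child process through a multiprocessing.Pipe and reads the result back:

import multiprocessing

def worker(conn):
  # receive one value from the parent, square it, and send it back
  x = conn.recv()
  conn.send(x * x)
  conn.close()

if __name__ == '__main__':
  parent_conn, child_conn = multiprocessing.Pipe()
  p = multiprocessing.Process(target=worker, args=(child_conn,))
  p.start()
  parent_conn.send(7)
  print(parent_conn.recv())   # 49
  p.join()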

Among these, I strongly recommend Queue, because many scenarios boil down to the producer-consumer model, and Queue solves exactly that problem. It is also very simple to use: the parent process creates a Queue and then passes it to the child Process via args or kwargs, as in the sketch below.
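Here is a minimal producer-consumer sketch along these lines (my own addition, not from the original article); the None value is just an arbitrary sentinel I use to tell the consumer to stop:

import multiprocessing

def consumer(taskq, resultq):
  # keep taking tasks until the parent sends the None sentinel
  while True:
    item = taskq.get()
    if item is None:
      break
    resultq.put(item * item)

if __name__ == '__main__':
  taskq = multiprocessing.Queue()
  resultq = multiprocessing.Queue()
  p = multiprocessing.Process(target=consumer, args=(taskq, resultq))
  p.start()
  for x in range(5):
    taskq.put(x)              # the parent acts as the producer
  taskq.put(None)             # tell the consumer to stop
  for _ in range(5):
    print(resultq.get())      # 0, 1, 4, 9, 16
  p.join()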

Precautions when using tools such as Theano or TensorFlow

Note that statements which initialize the CUDA toolkit, such as import theano or import tensorflow, have side effects. These side effects are copied as-is into child processes, which then fail with errors such as:

could not retrieve CUDA device count: CUDA_ERROR_NOT_INITIALIZED

The solution is to make sure the parent process never imports these tools; instead, each child process does the import separately after it has been created.

If you use Process, put the import inside the target function. For example:

import multiprocessing

def hello(taskq, resultq):
  import tensorflow as tf
  config = tf.ConfigProto()
  config.gpu_options.allow_growth = True
  sess = tf.Session(config=config)
  while True:
    name = taskq.get()
    res = sess.run(tf.constant('hello ' + name))
    resultq.put(res)

if __name__ == '__main__':
  taskq = multiprocessing.Queue()
  resultq = multiprocessing.Queue()
  p = multiprocessing.Process(target=hello, args=(taskq, resultq))
  p.start()
  taskq.put('world')
  taskq.put('abcdabcd987')
  taskq.close()
  print(resultq.get())
  print(resultq.get())
  p.terminate()
  p.join()

If you use a Pool, you can write a function that does the import and pass it to the Pool constructor as the initializer. For example:

import multiprocessing

def init():
  global tf
  global sess
  import tensorflow as tf
  config = tf.ConfigProto()
  config.gpu_options.allow_growth = True
  sess = tf.Session(config=config)

def hello(name):
  return sess.run(tf.constant('hello ' + name))

if __name__ == '__main__':
  pool = multiprocessing.Pool(processes=2, initializer=init)
  xs = ['world', 'abcdabcd987', 'Lequn Chen']
  print(pool.map(hello, xs))

That is all for this article. I hope it is helpful for your learning.
