Cython 0.15: using OpenMP multi-core parallelism to accelerate Python!

Lai Yonghao (http://laiyonghao.com)

Notes:
0. To read this article you need to understand the basic usage of OpenMP.
1. To read this article you need to understand the basic concept of the GIL.
2. This article is basically a translation of another document.
3. This article is not a guide (or manual) to cython.parallel; it is for information dissemination only.
4. I previously translated the article "OpenMP and C++: getting the benefits of multithreading with half the effort", which helps in understanding this one. See part 1 (http://blog.csdn.net/lanphaday/article/details/1503817) and part 2 (http://blog.csdn.net/lanphaday/article/details/1507834).

The cython.parallel module was added in Cython 0.15 to support native parallel programming. Currently only OpenMP is supported; more backends will be added later. Note that the parallel operations run with the GIL released.

cython.parallel.prange([start], stop[, step], nogil=False, schedule=None)

This function is used for parallel loops. OpenMP automatically creates a thread pool and distributes the work among the threads according to the specified schedule. The step parameter must not be 0. If the nogil parameter is True, the loop is wrapped in a nogil section. The schedule parameter accepts the scheduling policies defined by OpenMP: static, dynamic, guided, auto, and runtime.
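To give a feel for what a schedule does, here is a minimal plain-Python sketch of how a static schedule might divide the iterations of a loop among threads. It is only a model: the helper name static_chunks is hypothetical, and the exact split used by a real OpenMP runtime is implementation-defined. The sketch assumes contiguous, near-equal blocks, one per thread.

```python
def static_chunks(n_iterations, n_threads):
    """Split range(n_iterations) into contiguous, near-equal blocks, one per thread.

    Hypothetical helper: models a static schedule, not an OpenMP API.
    """
    base, extra = divmod(n_iterations, n_threads)
    chunks = []
    start = 0
    for tid in range(n_threads):
        # The first `extra` threads take one additional iteration each.
        size = base + (1 if tid < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


for tid, chunk in enumerate(static_chunks(10, 4)):
    print(tid, list(chunk))
```

With a dynamic or guided schedule the threads would instead fetch blocks on demand, so which thread runs which iteration is not fixed in advance.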
Thread-locality and reductions are inferred from the variables. A variable that is assigned to in a prange block is treated as lastprivate, meaning that it holds the value from the last iteration. If an in-place operator is used on a variable, it is treated as a reduction: each thread works on its own private copy, and after the loop ends the operator is applied to combine the copies and the result is assigned to the original variable. The index variable is always lastprivate. Variables assigned to in a parallel block are private and cannot be used after leaving the parallel block, because their final value cannot be determined. (If you cannot follow these two paragraphs, you need to read the OpenMP documentation.)
The following is an example of a reduction:

 
    from cython.parallel import prange, parallel, threadid

    cdef int i
    cdef int sum = 0

    # n is assumed to be defined elsewhere
    for i in prange(n, nogil=True):
        sum += i

    print(sum)
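The reduction semantics described above can be modeled in plain Python. The sketch below is not Cython and not how OpenMP is implemented; the names parallel_sum, worker, and partial are hypothetical, and Python threads under the GIL give no real speedup here. It only shows the idea: each thread accumulates into a private copy, and the copies are combined after the loop.

```python
import threading


def parallel_sum(n, n_threads=4):
    """Sum range(n) using one private accumulator per thread (a model only)."""
    partial = [0] * n_threads  # one result slot per thread

    def worker(tid):
        local = 0  # private copy, starting at the identity for +
        # This thread's share of the iterations: tid, tid + n_threads, ...
        for i in range(tid, n, n_threads):
            local += i
        partial[tid] = local

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Combine step: apply the operator to the private copies after the loop.
    total = 0
    for p in partial:
        total += p
    return total


print(parallel_sum(100))  # 4950
```

Because each thread writes only to its own slot, no locking is needed inside the loop; the only shared step is the final combination.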

Here is an example of sharing a NumPy array:

 
    from cython.parallel import *
    cimport numpy as np

    def func(np.ndarray[double] x, double alpha):
        cdef Py_ssize_t i

        # nogil=True releases the GIL for the duration of the loop
        for i in prange(x.shape[0], nogil=True):
            x[i] = alpha * x[i]

cython.parallel.parallel()

This directive can be used in a with statement to execute a sequence of code in parallel. This is useful for setting up thread-local buffers used by a prange. A prange contained in the block becomes a work-sharing loop that is not itself parallel, so every variable assigned to in the parallel section is also private to the prange. Variables that are private in the parallel block are unavailable after leaving the block.
Example with thread-local buffers:

 
    from cython.parallel import *
    from libc.stdlib cimport abort, malloc, free

    cdef Py_ssize_t idx, i, n = 100
    cdef int * local_buf
    cdef size_t size = 10

    with nogil, parallel():
        # each thread allocates its own buffer
        local_buf = <int *> malloc(sizeof(int) * size)
        if local_buf == NULL:
            abort()

        # populate our local buffer in a sequential loop
        for idx in range(size):
            local_buf[idx] = idx * 2

        # share the work using the thread-local buffer(s)
        # (func is assumed to be a nogil function defined elsewhere)
        for i in prange(n, schedule='guided'):
            func(local_buf)

        free(local_buf)

In the future, sections will be supported inside parallel blocks, so that sections of code can be distributed among multiple threads.

cython.parallel.threadid()

Returns the ID of the thread. For n threads, the IDs range from 0 to n-1, that is, the half-open range [0, n).

Compiling

To enable OpenMP support, you need to turn on the OpenMP option of your C or C++ compiler. A setup.py for GCC looks like this:

 
    from distutils.core import setup
    from distutils.extension import Extension
    from Cython.Distutils import build_ext

    ext_module = Extension(
        "hello",
        ["hello.pyx"],
        extra_compile_args=['-fopenmp'],
        extra_link_args=['-fopenmp'],
    )

    setup(
        name='Hello world app',
        cmdclass={'build_ext': build_ext},
        ext_modules=[ext_module],
    )
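The flags above are specific to GCC-style compilers. As a hedged sketch, a setup.py could pick the flags from the compiler type. The helper name openmp_flags and the way the compiler type is obtained are assumptions for illustration; the flag values themselves are the usual ones (-fopenmp for GCC-style compilers, /openmp for Microsoft Visual C++).

```python
def openmp_flags(compiler_type):
    """Return (extra_compile_args, extra_link_args) that enable OpenMP.

    Hypothetical helper. `compiler_type` is assumed to be a string such as
    "msvc" or "unix", in the style of distutils compiler type names.
    """
    if compiler_type == "msvc":
        # MSVC enables OpenMP with a compile-time switch; no linker flag needed.
        return ["/openmp"], []
    # GCC-style compilers take the same flag when compiling and linking.
    return ["-fopenmp"], ["-fopenmp"]


compile_args, link_args = openmp_flags("unix")
print(compile_args, link_args)
```

The two returned lists would then be passed as extra_compile_args and extra_link_args when constructing the Extension.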

Breaking out of loops

The parallel with and prange blocks support break, continue, and return in nogil mode. In addition, with gil blocks can be used inside these blocks, and exceptions can be raised. However, because the blocks use OpenMP, they cannot simply be left, so the exit procedure is best-effort. In the case of prange(), once the first return, break, or exception occurs, the loop body is skipped for all remaining iterations in every thread. Therefore, if several different values could be returned, which one is returned is undefined, because the iterations have no particular order:

    from cython.parallel import prange

    cdef int func(Py_ssize_t n):
        cdef Py_ssize_t i

        for i in prange(n, nogil=True):
            if i == 8:
                with gil:
                    raise Exception()
            elif i == 4:
                break
            elif i == 2:
                return i

In the example above, it is undefined whether an exception is raised, whether the loop simply breaks, or whether 2 is returned.
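Why the outcome depends on iteration order can be shown with a small serial model in plain Python. This is only a sketch: the function first_exit is hypothetical, and real threads interleave rather than following a single fixed order. It reports which exit condition is reached first for a given execution order of the iterations.

```python
def first_exit(order):
    """Return the outcome of the first exiting iteration for a given order.

    Hypothetical serial model of the prange example: the first iteration
    that hits an exit condition decides the result.
    """
    for i in order:
        if i == 8:
            return "raise Exception"
        elif i == 4:
            return "break"
        elif i == 2:
            return "return 2"
    return "loop completed"


print(first_exit(range(10)))            # return 2
print(first_exit(reversed(range(10))))  # raise Exception
print(first_exit([5, 4, 8, 2]))         # break
```

The same loop body gives three different outcomes for three different orders, which is why code using prange should not rely on which exit wins.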

Nested Parallelism

Nested parallelism is currently disabled because of a bug in GCC [2]. However, you can call functions that contain parallel sections from within a parallel section.

References

[1] http://www.openmp.org/mp-documents/spec30.pdf
[2] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49897
