Learning OpenMP Together (3) -- Basic Usage of for Loop Parallelization

Source: Internet
Author: User

I. Introduction

In "Learning OpenMP Together (1) -- Initial Experience", an example of for loop parallelization was given. This installment analyzes it further, but covers only the basic usage of for loop parallelization (that is, the #pragma omp parallel for directive), which requires that the loop's iterations be data-independent.

 

II. Data Dependence

When a loop is parallelized, multiple threads execute its iterations at the same time, so the iteration order is undefined. If the iterations are data-independent, the basic #pragma omp parallel for directive can be used.

If statement S2 depends on statement S1, one of the following two situations must hold:

1. S1 accesses a storage location L in one iteration, and S2 accesses the same location L in a later iteration; this is called loop-carried dependence;

2. S1 and S2 access the same storage location L in the same iteration, with S1 executing before S2; this is called loop-independent dependence.

 

III. Declaration Forms of for Loop Parallelization

#include <iostream>
#include <omp.h>  // header required for OpenMP programming

int main()
{
    // for loop parallelization, declaration form 1
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < 10; ++i)
        {
            std::cout << i << std::endl;
        }
    }

    // for loop parallelization, declaration form 2
    #pragma omp parallel for
    for (int j = 0; j < 10; ++j)
    {
        std::cout << j << std::endl;
    }

    return 0;
}

The two declaration forms in the code above are equivalent; clearly, the second form is more concise and compact.

However, the first form has one advantage: other parallel code can be placed inside the parallel region, outside the for loop.

For example:

#include <iostream>
#include <omp.h>  // header required for OpenMP programming

int main()
{
    // form 1: other statements may appear inside the parallel region
    #pragma omp parallel
    {
        std::cout << "OK" << std::endl;
        #pragma omp for
        for (int i = 0; i < 10; ++i)
        {
            std::cout << i << std::endl;
        }
    }

    // form 2: no statement may appear between the directive and the loop
    #pragma omp parallel for
    /* std::cout << "Error" << std::endl; */  // uncommenting this breaks the build
    for (int j = 0; j < 10; ++j)
    {
        std::cout << j << std::endl;
    }

    return 0;
}

 

IV. Constraints on for Loop Parallelization

Although OpenMP makes it easy to parallelize a for loop, not every for loop can be parallelized. Parallelization is not allowed in the following situations:

1. The loop variable must be a signed integer. For example, for (unsigned int i = 0; i < 10; ++i) {} fails to compile;

2. The comparison operator must be <, <=, >, or >=. For example, for (int i = 0; i != 10; ++i) {} fails to compile;

3. The third expression must add or subtract a loop-invariant integer amount to the loop variable. For example, for (int i = 0; i != 10; i = i + 1) {} failed to compile in the author's tests (note that its comparison also violates rule 2); it appears that only ++i, i++, --i, and i-- are accepted;

4. If the comparison is < or <=, the loop variable may only increase, and vice versa. For example, for (int i = 0; i < 10; --i) {} fails to compile;

5. The loop must have a single entry and a single exit. No jump from outside may land inside the loop, and no jump inside may leave it, with exit() as the only exception. Exceptions must also be handled within the loop body. For example, a break or goto in the loop body that jumps out of the loop causes compilation to fail.

 

V. Example of Basic for Loop Parallelization

#include <iostream>
#include <omp.h>  // header required for OpenMP programming

int main()
{
    int a[10] = {1};
    int b[10] = {2};
    int c[10] = {0};

    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < 10; ++i)
        {
            // c[i] depends only on a[i] and b[i]
            c[i] = a[i] + b[i];
        }
    }

    return 0;
}

 

 

VI. Nested for Loops

#include <iostream>
#include <omp.h>  // header required for OpenMP programming

int main()
{
    int a[10][5] = {1};
    int b[10][5] = {2};
    int c[10][5] = {0};

    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < 10; ++i)
        {
            for (int j = 0; j < 5; ++j)
            {
                // c[i][j] depends only on a[i][j] and b[i][j]
                c[i][j] = a[i][j] + b[i][j];
            }
        }
    }

    return 0;
}

Only the outer loop's iterations are divided among the threads; the inner loop runs in full on each thread. On a machine with two CPUs, the first CPU executes:

for (int i = 0; i < 5; ++i)
{
    for (int j = 0; j < 5; ++j)
    {
        // c[i][j] depends only on a[i][j] and b[i][j]
        c[i][j] = a[i][j] + b[i][j];
    }
}

and the second CPU executes:

for (int i = 5; i < 10; ++i)
{
    for (int j = 0; j < 5; ++j)
    {
        // c[i][j] depends only on a[i][j] and b[i][j]
        c[i][j] = a[i][j] + b[i][j];
    }
}

 

 

VII. Summary

This installment introduced the concepts of data dependence, described the basic method of for loop parallelization, and pointed out the situations in which the #pragma omp parallel for directive cannot be used. Future installments will cover other forms of for loop parallelization in the presence of data races.

 
