Detailed description of Data Attribute-related clauses in OpenMP (1): Comparison between private, firstprivate, lastprivate, and threadprivate

Source: Internet
Author: User
This article covers private, firstprivate, lastprivate, and threadprivate. They fall into two categories: private, firstprivate, and lastprivate are clauses, whereas threadprivate is a directive. (PS: threadprivate is sometimes loosely described as a clause, but it is actually a directive.)

(1) Private
The private clause declares one or more variables as private to each thread. Each thread has its own private copy of the variable, which other threads cannot access. Even if a shared variable with the same name exists outside the parallel region, that shared variable plays no role inside the region and is not operated on there.
Notes:

1. Private variables are undefined on entry to and exit from the parallel region.

2. The value of the original variable (defined before the parallel region) is undefined after the parallel region. This wording follows older versions of the OpenMP specification; as examples B and C below show, what matters in practice is that nothing computed in the private copies is written back to the original.

3. A private variable within the parallel region has no storage association with the variable of the same name outside the region.

Note: private is easy to understand. The following examples illustrate the three notes above.

A. A private variable is "undefined" on entry to and exit from the parallel region.

    #include <stdio.h>

    int main(void)
    {
        int A = 100;
    #pragma omp parallel for private(A)
        for (int i = 0; i < 10; i++)
        {
            printf("%d\n", A);  // reads the uninitialized private copy of A
        }
        return 0;
    }

Beginners may assume this code is correct. In fact, the private copy of A is uninitialized on entry to the parallel region, so reading it there before assigning it is undefined behavior and can cause a runtime error.

Compiling this code in Visual Studio produces a warning:

warning C4700: uninitialized local variable 'A' used

The warning points at the printf statement: A is an uninitialized variable there, which is why the program can crash at run time.

This code shows that a private variable is undefined on entry to the parallel region. What happens on exit is harder to show in isolation. The three notes overlap and express one idea, so the next examples help clarify it.

B. The value of the original variable defined before the parallel region is also "undefined" after the parallel region.

    #include <stdio.h>

    int main(void)
    {
        int B;
    #pragma omp parallel for private(B)
        for (int i = 0; i < 10; i++)
        {
            B = 100;  // assigns only this thread's private copy
        }
        printf("%d\n", B);  // the original B was never initialized
        return 0;
    }

Here B is assigned inside the parallel region, but only the private copies are assigned; after the region, the original B is still undefined. When reading the statement "the value of the original variable is undefined after the parallel region", note that it does not mean every original variable named in a private clause necessarily ends up undefined after the region. If the original variable was initialized before the region, it is not undefined afterwards; this is the third note, shown next.

C. A private variable in the parallel region has no storage association with the variable of the same name outside the region.

    #include <stdio.h>

    int main(void)
    {
        int C = 100;
    #pragma omp parallel for private(C)
        for (int i = 0; i < 10; i++)
        {
            C = 200;  // assigns the private copy before reading it
            printf("%d\n", C);
        }
        printf("%d\n", C);  // the original C
        return 0;
    }

Here, after the parallel region, the final printf prints 100 for C; the assignments inside the parallel region have no effect on it.

To sum up, the three notes overlap, and the third covers all the cases. The key to understanding private is therefore: a private variable within the parallel region has no storage association with the same variable outside of the region. Simply put, private variables inside the parallel region are unrelated to the variables outside it. If there is any connection at all, it is only that the variable must be declared before it can be named in a private clause; on entry to the parallel region, each thread gets its own uninitialized copy.

The following example summarizes the points above; see the comments:

    #include <stdio.h>

    int main(void)
    {
        int A = 100, B, C = 0;
    #pragma omp parallel for private(A) private(B)
        for (int i = 0; i < 10; i++)
        {
            B = A + i;  // A is undefined! Runtime error!
            printf("%d\n", i);
        }
        /*-- End of OpenMP parallel region. --*/
        C = B;  // B is undefined outside of the parallel region!
        printf("A:%d\n", A);
        printf("B:%d\n", B);
        return 0;
    }
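The safe pattern that follows from these notes is to assign a private variable inside the region before reading it, and to carry results out of the region through something other than the private variable. A minimal sketch, assuming a hypothetical helper named fill_squares and a caller-supplied output array (neither comes from the original article):

```c
/* Hypothetical helper: writes i*i into out[i] for 0 <= i < n. */
void fill_squares(int *out, int n)
{
    int tmp = -1;  /* original variable; the private copies never see -1 */
#pragma omp parallel for private(tmp)
    for (int i = 0; i < n; i++)
    {
        tmp = i * i;   /* assign the private copy before any read */
        out[i] = tmp;  /* results leave the region through shared memory */
    }
}
```

Called as fill_squares(out, 10) on a 10-element array, this leaves out[3] equal to 9 regardless of how many threads run the loop.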

(2) firstprivate

Private variables created by the private clause do not inherit the value of the original variable with the same name. firstprivate provides exactly this: each private copy is initialized, on entry to the parallel region, with the value of the variable outside the region.

firstprivate(list): all variables in the list are initialized with the value the original object had before entering the parallel construct.

Consider the following example:

    #include <stdio.h>

    int main(void)
    {
        int A;  // deliberately left uninitialized
    #pragma omp parallel for firstprivate(A)
        for (int i = 0; i < 10; i++)
        {
            printf("%d: %d\n", i, A);  // #1
        }
        printf("%d\n", A);  // #2
        return 0;
    }

When compiled with Visual Studio, this also reports "warning C4700: uninitialized local variable 'A' used". A is used in two places, and the warning actually refers to the second one (#2). VS gives no warning for the use at #1 inside the parallel region: because firstprivate is used, the private A there is initialized from the original variable of the same name. Strictly speaking, the original is itself uninitialized, so a warning would be justified there too, but this depends on the VS implementation. In a Debug build the program crashes; in a Release build it may print values. In short, the output is unpredictable.

Let's look at the following example, which is similar to the previous private example:

    #include <stdio.h>

    int main(void)
    {
        int A = 100;
    #pragma omp parallel for firstprivate(A)
        for (int i = 0; i < 10; i++)
        {
            printf("%d: %d\n", i, A);  // #1
        }
        printf("%d\n", A);  // #2
        return 0;
    }

If private were used here, reading A inside the parallel region would be a problem, because the private copy is uninitialized, leading to unpredictable output or a crash. With firstprivate, each thread's copy of A is initialized on entry to the parallel region with the value of the original variable A outside the region, so the value printed for A is 100.
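A typical use of this behavior is a read-only starting value that every thread needs. A minimal sketch, assuming a hypothetical helper named add_base and a caller-supplied array (both are illustrative, not from the original article):

```c
/* Hypothetical helper: out[i] = base + i for 0 <= i < n. */
void add_base(int *out, int n)
{
    int base = 100;
#pragma omp parallel for firstprivate(base)
    for (int i = 0; i < n; i++)
    {
        out[i] = base + i;  /* every thread's copy of base starts at 100 */
    }
}
```

With private(base) instead, the read of base would use an uninitialized copy; with firstprivate, out[0] is 100 and out[9] is 109 for n equal to 10.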

Note that each private copy is initialized only once. To understand what "once" means, consider the following example:

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        int A = 100;
    #pragma omp parallel for firstprivate(A)
        for (int i = 0; i < 10; i++)
        {
            printf("Thread ID: %d, %d: %d\n", omp_get_thread_num(), i, A);  // #1
            A = i;
        }
        printf("%d\n", A);  // #2
        return 0;
    }

Here the value of A is changed after each print. Note that "initialize once" means once for every thread in the team, not once per iteration. Run on a 4-core CPU, the parallel region typically has four threads, so there are four copies of A. The exact output depends on how the iterations are distributed, but the pattern is always the same: the first iteration a thread executes prints 100 for A, and each later iteration on that thread prints the value of i from that thread's previous iteration.

This result is easy to understand: a copy of the variable exists per thread, not per loop iteration, so it is exactly what we should expect.
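The "once per thread" behavior can also be observed without reading printed output. The sketch below counts how many iterations still see the initial value 100; the function name count_fresh_copies is hypothetical, and the reduction clause is used here only to add up the per-thread counters (it is not discussed in this article).

```c
/* Hypothetical helper: counts the iterations that see the initial value of A. */
int count_fresh_copies(int n)
{
    int A = 100;
    int fresh = 0;
#pragma omp parallel for firstprivate(A) reduction(+:fresh)
    for (int i = 0; i < n; i++)
    {
        if (A == 100)
            fresh++;  /* only the first iteration run by a thread gets here */
        A = -1;       /* later iterations on the same thread see -1 */
    }
    return fresh;
}
```

The result equals the number of threads that executed at least one iteration, so for n equal to 10 it lies between 1 and 10 and never counts every iteration unless each one ran on a different thread.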

The example above also helps in comparing private and firstprivate, and it leads on to lastprivate. private gives each thread in the parallel region its own copy, which is not associated with the variable outside the region. firstprivate solves the problem on entry: each thread's copy is initialized from the variable outside the region. The remaining problem is the exit: if you want a copy inside the parallel region to be assigned to the variable outside the region when the region ends, you need lastprivate.

(3) lastprivate

If the value of a private variable computed in the parallel region must be assigned to the original variable of the same name on exit from the region, use lastprivate.

lastprivate(list): the thread that executes the sequentially last iteration or section updates the value of the objects in the list.

In the last firstprivate example, the parallel region assigns to A, but after the region A still has its original value.

This raises a question: on exit from the parallel region, the value of a copy must be assigned to the original variable, but the region has multiple threads. Which thread's copy is used?

Is it the thread that finishes last? No! The OpenMP specification states that for a loop, the value from the sequentially last iteration is assigned to the original variable; for a sections construct, the value from the last section is assigned. Note that "last section" means the last section in the program text, not the section that happens to finish last at run time.

Before examining this rule further, here is a simple example that shows what lastprivate does:

    #include <stdio.h>

    int main(void)
    {
        int A = 100;
    #pragma omp parallel for lastprivate(A)
        for (int i = 0; i < 10; i++)
        {
            A = 10;
        }
        printf("%d\n", A);
        return 0;
    }

Here the result is 10, not 100. This is the effect of lastprivate: on exit from the region, a value is assigned back to the original variable.

With the basic meaning of lastprivate clear, we can return to the rule described above, that is, the question of which thread's copy is assigned to the variable outside the parallel region. The following example is similar to the earlier firstprivate example:

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        int A = 100;
    #pragma omp parallel for lastprivate(A)
        for (int i = 0; i < 10; i++)
        {
            printf("Thread ID: %d, %d\n", omp_get_thread_num(), i);  // #1
            A = i;
        }
        printf("%d\n", A);  // #2
        return 0;
    }

The value of the original variable after the parallel region is not taken from whichever thread exits last. Across repeated runs, the lines printed inside the parallel region may appear in a different order, but the final output is always 9. This is exactly the rule from the OpenMP specification described above: on exit from the region, the assignment uses the logically last piece of work, not the thread that finishes last. For a for loop, that is the copy belonging to the thread that executed the last loop iteration.
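The same rule can be checked as a return value instead of printed output. A minimal sketch, assuming a hypothetical function named last_iteration_value that is called with n of at least 1:

```c
/* Hypothetical helper: returns the value written by the last loop iteration. */
int last_iteration_value(int n)
{
    int A = 100;
#pragma omp parallel for lastprivate(A)
    for (int i = 0; i < n; i++)
    {
        A = 2 * i;  /* each iteration writes a distinct value */
    }
    return A;  /* for n >= 1: the value from iteration n - 1 */
}
```

For n equal to 10 this returns 18, whichever thread happens to finish last, because only the copy from iteration 9 is written back.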

To recap, firstprivate initializes each thread's copy from the original variable on entry, and lastprivate assigns a thread's copy back to the original variable on exit, while private leaves the copies and the original variable completely unrelated. What if you want both the initialization on entry and the assignment on exit? You can specify both firstprivate and lastprivate for the same variable, as the following example shows:

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        int A = 100;
    #pragma omp parallel for firstprivate(A) lastprivate(A)
        for (int i = 0; i < 10; i++)
        {
            printf("Thread ID: %d, %d: %d\n", omp_get_thread_num(), i, A);  // #1
            A = i;
        }
        printf("%d\n", A);  // #2
        return 0;
    }
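The combination can also be sketched in a form whose result does not depend on how iterations are distributed. In the sketch below, every iteration reads the initial value of A and only the final iteration changes it. The function name first_and_last is hypothetical, and schedule(static) is an added assumption so that the last iteration is also the last one its thread executes.

```c
/* Hypothetical helper: every iteration reads A; only the last one updates it. */
int first_and_last(int *out, int n)
{
    int A = 100;
#pragma omp parallel for firstprivate(A) lastprivate(A) schedule(static)
    for (int i = 0; i < n; i++)
    {
        out[i] = A + i;  /* firstprivate: each copy starts at 100 */
        if (i == n - 1)
            A = out[i];  /* lastprivate: this copy is written back on exit */
    }
    return A;
}
```

For n equal to 10, out[0] is 100 and the function returns 109: the copy was initialized on entry and the last iteration's value was assigned back on exit.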

Note: a variable cannot appear in more than one private clause, nor in both private and firstprivate or lastprivate. The only permitted combination is firstprivate together with lastprivate.

Also note that when a variable of class type appears in a lastprivate clause, some restrictions apply: the class needs an accessible, unambiguous default constructor (unless the variable also appears in a firstprivate clause) and an accessible, unambiguous copy assignment operator. The order in which copy assignment operators are called for different objects is unspecified and depends on the compiler.

In addition, private and firstprivate can be used on all parallel constructs, but lastprivate can only be used on the for and sections constructs. For more information, see the reference table at http://blog.csdn.net/gengshenghong/article/details/69.
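For the sections case, the rule described earlier says that the last section in the program text supplies the value. A minimal sketch, using a hypothetical function named last_section_value:

```c
/* Hypothetical helper: two sections assign different values to x. */
int last_section_value(void)
{
    int x = 0;
#pragma omp parallel sections lastprivate(x)
    {
#pragma omp section
        {
            x = 1;
        }
#pragma omp section
        {
            x = 2;  /* last section in program text: its value is written back */
        }
    }
    return x;
}
```

The function returns 2 even if the first section happens to finish later at run time.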

(4) threadprivate

First, threadprivate differs from the preceding clauses in that it is a directive, not a clause. threadprivate specifies that each OpenMP thread gets its own copy of a global variable, that is, each thread has its own private global variable. The key difference is that threadprivate does not apply to one parallel region but to the whole program. The copies it creates are therefore global: within the same thread, the copy is shared across different parallel regions.

threadprivate can only be applied to global or static variables, which is easy to understand given what it does.
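To make the restriction concrete, the sketch below applies the directive to a file-scope variable and to a static local variable. The names request_total and calls_on_this_thread are hypothetical; an ordinary (automatic) local variable could not be listed in the directive.

```c
/* File-scope variable: one copy per thread. */
int request_total = 0;
#pragma omp threadprivate(request_total)

/* Hypothetical helper: counts how often the calling thread has called it. */
int calls_on_this_thread(void)
{
    /* Static local variable: the directive appears in the same scope,
       before the first reference to the variable. */
    static int calls = 0;
#pragma omp threadprivate(calls)
    return ++calls;
}
```

Two consecutive calls from the same thread return 1 and then 2, because the counter lives as long as the thread's copy does rather than being re-created per call.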

Based on the following example, we will further understand the use of threadprivate:

    #include <stdio.h>
    #include <omp.h>

    int A = 100;
    #pragma omp threadprivate(A)

    int main(void)
    {
    #pragma omp parallel for
        for (int i = 0; i < 10; i++)
        {
            A++;
            printf("Thread ID: %d, %d: %d\n", omp_get_thread_num(), i, A);  // #1
        }
        printf("Global A: %d\n", A);  // #2
    #pragma omp parallel for
        for (int i = 0; i < 10; i++)
        {
            A++;
            printf("Thread ID: %d, %d: %d\n", omp_get_thread_num(), i, A);  // #1
        }
        printf("Global A: %d\n", A);  // #2
        return 0;
    }

The output shows that the second parallel region continues counting from where the first one left off: each thread has its own private global variable. (The specification guarantees that the copies persist between regions only under certain conditions, for example when both regions use the same number of threads and dynamic adjustment of threads is disabled.) Also observe the value of "Global A" printed outside the parallel regions: it is always thread 0's most recent value. This is expected, because after a parallel region ends only the master thread keeps running.
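The observation about the master thread can be reduced to a small check. A minimal sketch, assuming a hypothetical counter variable and a function named master_count_after_two_regions that is called once:

```c
static int counter = 0;
#pragma omp threadprivate(counter)

/* Hypothetical helper: runs two parallel regions, then reads the master's copy. */
int master_count_after_two_regions(void)
{
#pragma omp parallel
    {
        counter++;  /* each thread increments its own copy */
    }
#pragma omp parallel
    {
        counter++;  /* the master thread's copy goes from 1 to 2 */
    }
    return counter;  /* outside any region: the master thread's copy */
}
```

On its first call the function returns 2 no matter how many threads take part, since the increments made by the other threads go to their own copies.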

There are also clauses that work together with threadprivate variables; we will not analyze them here. In addition, for a variable of C++ class type, restrictions on the class's constructors apply, similar to those for lastprivate.
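One clause that operates on threadprivate variables is copyin, which copies the master thread's value into every thread's copy at the start of a parallel region. It is not analyzed in this article, so the following is only a brief sketch; the variable setting and the function threads_missing_setting are hypothetical, and reduction is used just to add up the per-thread counters.

```c
static int setting = 0;
#pragma omp threadprivate(setting)

/* Hypothetical helper: counts threads whose copy did not receive the value. */
int threads_missing_setting(void)
{
    int missing = 0;
    setting = 42;  /* written by the master thread only */
#pragma omp parallel copyin(setting) reduction(+:missing)
    {
        if (setting != 42)
            missing++;  /* not expected to happen with copyin */
    }
    return missing;
}
```

With copyin the function returns 0; without it, the other threads' copies would still hold their own earlier values.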

Summary:

private, firstprivate, and lastprivate are clauses that specify the data-sharing attributes of variables in a parallel region. private gives each thread in the team its own copy of a variable that has the same name as a variable outside the parallel region, with no association between the copy and the original. firstprivate builds on private: on entry to the parallel region (that is, when each thread's copy is created), the copy is initialized from the original variable. lastprivate also builds on private: on exit from the parallel region, a copy is assigned back to the original variable, and because there are multiple copies, OpenMP specifies which one is used. private cannot be combined with firstprivate or lastprivate for the same variable, whereas firstprivate and lastprivate can be used together on the same variable, with the combined effect of both.

threadprivate is a directive. The difference from private is that private applies to variables for one parallel region, while threadprivate applies to global (or static) variables for the whole program.
