(1) shared
The shared clause declares one or more variables as shared. A shared variable has a single memory location within the parallel region, and every thread in the team accesses that same location. Access to shared variables in a parallel region is therefore subject to data races. Preventing them requires explicit protection (for example, synchronization), and that is the programmer's responsibility.
The following example is a parallel summation. Because sum is shared and updated without any protection, the code contains a data race:
    #include <stdio.h>

    #define COUNT 10000

    int main(void)
    {
        int sum = 0;
        /* sum is shared and updated without protection: data race */
        #pragma omp parallel for shared(sum)
        for (int i = 0; i < COUNT; i++)
        {
            sum = sum + i;
        }
        printf("%d\n", sum);
        return 0;
    }
Run it several times and the results may differ.
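As an illustration of the protection mentioned above, here is a minimal sketch of two common options: a reduction clause and an atomic update. The function names sum_with_reduction and sum_with_atomic are hypothetical and do not come from the original program.

```c
/* Hypothetical helper: each thread accumulates into its own private copy
   of sum, and OpenMP combines the copies when the loop ends. */
int sum_with_reduction(int count)
{
    int sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < count; i++)
    {
        sum += i;
    }
    return sum;
}

/* Hypothetical helper: sum stays shared, but every update is atomic. */
int sum_with_atomic(int count)
{
    int sum = 0;
    #pragma omp parallel for shared(sum)
    for (int i = 0; i < count; i++)
    {
        #pragma omp atomic
        sum += i;
    }
    return sum;
}
```

Called from main with a count of 10000, either function should return the same value on every run. The reduction version is usually the cheaper of the two, since threads only synchronize once at the end of the loop.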
Note that the loop iteration variable is private within a loop construct, and automatic variables declared inside the loop construct are private as well. This is easy to understand: it is hard to see how OpenMP could divide the iterations among threads if they all shared one iteration variable, so it can only be private. Even if the iteration variable is listed in a shared clause, it is still private inside the loop construct. (The OpenMP specification does not actually permit listing it in shared, and some compilers, GCC for example, reject such code; the compiler used here accepts it.)
    #include <stdio.h>

    #define COUNT 10

    int main(void)
    {
        int sum = 0;
        int i = 0;
        /* i is listed as shared, but is still private inside the loop */
        #pragma omp parallel for shared(sum, i)
        for (i = 0; i < COUNT; i++)
        {
            sum = sum + i;
        }
        printf("%d\n", i);
        printf("%d\n", sum);
        return 0;
    }
This example demonstrates the rule. The value of i printed at the end is 0, its value before the loop, rather than anything left behind by the loop, even though i was listed as shared. Note that this rule applies only to loop constructs (parallel for); a plain parallel region imposes no such requirement:
    #include <stdio.h>

    #define COUNT 10

    int main(void)
    {
        int sum = 0;
        int i = 0;
        /* plain parallel region: every thread runs the whole loop */
        #pragma omp parallel shared(sum) private(i)
        for (i = 0; i < COUNT; i++)
        {
            sum = sum + i;
        }
        printf("%d\n", i);
        printf("%d\n", sum);
        return 0;
    }
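Here the printed value of i is 0. If i is changed to shared, it is 10 (although all threads then update the same i without protection). Of course, this code does not compute the same sum as the earlier examples: without the for directive, every thread executes the entire loop.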
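To make the difference between the two forms concrete, the following sketch counts how many loop bodies are executed in each case. The function names are hypothetical, and a reduction is used for the counter so that the count itself is free of data races.

```c
/* Hypothetical helper: with "parallel for", the iterations are divided
   among the threads, so the body runs count times in total. */
int iterations_with_parallel_for(int count)
{
    int executed = 0;
    #pragma omp parallel for reduction(+:executed)
    for (int i = 0; i < count; i++)
    {
        executed++;
    }
    return executed;
}

/* Hypothetical helper: with a plain "parallel" region, every thread runs
   the whole loop, so the body runs count times per thread. */
int iterations_with_parallel_only(int count)
{
    int executed = 0;
    int i = 0;
    #pragma omp parallel private(i) reduction(+:executed)
    for (i = 0; i < count; i++)
    {
        executed++;
    }
    return executed;
}
```

With a count of 10, the first function should return 10 regardless of the team size, while the second should return 10 multiplied by the number of threads in the team.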
Incidentally, the loop iteration variable must not be modified inside the body of a loop construct. Taking the example above again, why not write the following?
    int i = 0;
    #pragma omp parallel for shared(sum) shared(i)
    for (i = 0; i < COUNT; i++)
    {
        i++;
        sum = sum + i;
    }
Here the compiler reports an error for i++, because the iteration variable may be read but not written inside the body of a loop construct.
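If the intent of the rejected loop is to visit every other value, one option is to move the extra step into the loop header so that the body no longer writes to i. The sketch below assumes that intent; the function name sum_every_other is hypothetical, and a reduction clause (not present in the fragment above) is added to keep the sum free of data races.

```c
/* Hypothetical helper: steps i by 2 in the loop header instead of
   incrementing it inside the body. */
int sum_every_other(int count)
{
    int sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < count; i += 2)
    {
        sum += i + 1;   /* the value the original body would see after i++ */
    }
    return sum;
}
```

Because the increment is now part of the loop header, the loop stays in the form OpenMP requires and the iterations can be divided among the threads.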
(2) default
The default clause specifies the data-sharing attribute of variables in the parallel region that have no explicit attribute. In the C/C++ OpenMP versions discussed here, its argument can only be shared or none; Fortran also accepts private (OpenMP 5.1 later extended C/C++ with private and firstprivate). For details, refer to the specification.
default(shared): variables in the parallel region that are not given an explicit attribute are shared.
default(none): the data-sharing attribute of every variable used in the region must be specified explicitly, otherwise the compiler reports an error. The exception is variables whose attribute is already predetermined (for example, the iteration variable of a loop construct can only be private).
What happens if a parallel region has no default clause? In my own tests, the behavior is the same as default(shared), which matches what the specification prescribes for the parallel construct.
    #include <stdio.h>

    #define COUNT 10

    int main(void)
    {
        int sum = 0;
        int i = 0;
        /* no default clause: sum is shared, i is private */
        #pragma omp parallel for
        for (i = 0; i < COUNT; i++)
        {
            sum = sum + i;
        }
        printf("%d\n", i);
        printf("%d\n", sum);
        return 0;
    }
Here sum is shared, while i is unaffected by the default and can only be private. The effect is the same as with default(shared). If default(none) is used instead, compilation fails with "no data sharing attribute is specified for sum"; no error is reported for i, because its attribute is predetermined: it can only be private.
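For comparison, here is a minimal sketch of a loop that compiles under default(none) because every variable it uses has an explicit attribute. The function name sum_below is hypothetical, and sum is given a reduction attribute rather than shared so that the result is also free of data races.

```c
/* Hypothetical helper: with default(none), count and sum must be listed
   explicitly; the iteration variable i is predetermined private. */
int sum_below(int count)
{
    int sum = 0;
    #pragma omp parallel for default(none) shared(count) reduction(+:sum)
    for (int i = 0; i < count; i++)
    {
        sum += i;
    }
    return sum;
}
```

Removing shared(count) or the reduction clause from the directive should bring back a compile-time error of the kind quoted above, which is the main reason to use default(none): forgotten variables are caught by the compiler.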
(3) copyin
The copyin clause copies the value of a threadprivate variable in the master thread to the threadprivate copy of every thread in the team executing the parallel region, so that all threads in the team start with the same value as the master thread.
Consider the following example, which adds a copyin clause to the second parallel region:
    #include <stdio.h>
    #include <omp.h>

    int A = 100;
    #pragma omp threadprivate(A)

    int main(void)
    {
        #pragma omp parallel for
        for (int i = 0; i < 10; i++)
        {
            A++;
            printf("Thread ID: %d, %d: %d\n", omp_get_thread_num(), i, A);   // #1
        }
        printf("Global A: %d\n", A);   // #2

        #pragma omp parallel for copyin(A)
        for (int i = 0; i < 10; i++)
        {
            A++;
            printf("Thread ID: %d, %d: %d\n", omp_get_thread_num(), i, A);   // #1
        }
        printf("Global A: %d\n", A);   // #2
        return 0;
    }
Running this program produces different results than the version without copyin. Without copyin, on entry to the second parallel region each thread's private copy of A still holds whatever value that thread left in it, so the initial values differ from thread to thread. With copyin, every thread's copy is first initialized from the master thread's value, and the computation continues from there.
To better understand copyin, analyze the following example:
    #include <stdio.h>
    #include <omp.h>

    int A = 100;
    #pragma omp threadprivate(A)

    int main(void)
    {
        #pragma omp parallel
        {
            printf("Initial A = %d\n", A);
            A = omp_get_thread_num();
        }
        printf("Global A: %d\n", A);

        #pragma omp parallel copyin(A)   // copyin
        {
            printf("Initial A = %d\n", A);
            A = omp_get_thread_num();
        }
        printf("Global A: %d\n", A);

        #pragma omp parallel   // Will not copy, to check the result.
        {
            printf("Initial A = %d\n", A);
            A = omp_get_thread_num();
        }
        printf("Global A: %d\n", A);
        return 0;
    }
After copyin is applied, the threadprivate copies of all threads are synchronized with the master thread's copy.
In addition, variables listed in copyin must be declared threadprivate, and a variable of class type must have an accessible, unambiguous copy assignment operator. Moreover, the first parallel region behaves as if copyin had been applied, because each thread's copy starts from the variable's initializer: in the preceding example, the first group of "Initial A" outputs (four of them when run with four threads) are all 100. One possible use of copyin: a program has several parallel regions, each thread keeps its own private global variable, and before a particular parallel region every thread's copy should be reset to the master thread's value.
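The synchronization described above can also be checked programmatically rather than by reading printed output. The following is a minimal sketch under that assumption; the variable counter and the function threads_missing_master_value are hypothetical names, not part of the original examples.

```c
/* Hypothetical threadprivate variable: one copy per thread. */
static int counter = 100;
#pragma omp threadprivate(counter)

/* Hypothetical helper: sets the master thread's copy to value, enters a
   parallel region with copyin, and counts the threads whose copy does not
   start out equal to value. */
int threads_missing_master_value(int value)
{
    int mismatches = 0;
    counter = value;   /* executed by the master thread only */
    #pragma omp parallel copyin(counter) reduction(+:mismatches)
    {
        if (counter != value)
        {
            mismatches++;
        }
        counter += 1;   /* changes only the calling thread's own copy */
    }
    return mismatches;
}
```

With copyin in place the function should return 0 for any team size. Dropping the copyin clause would let threads other than the master start from whatever their own copy last held, so the count could then be nonzero.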
(4) copyprivate
The copyprivate clause broadcasts the value of a private variable from one thread to the corresponding variable of the other threads executing the same parallel region.
Note: copyprivate can only be used on the single directive; the broadcast takes place at the end of the single block. It can only be applied to variables that are private, firstprivate, or threadprivate.
The following program illustrates the use of copyprivate:
    #include <stdio.h>
    #include <omp.h>

    int A = 100;
    #pragma omp threadprivate(A)

    int main(void)
    {
        int B = 100;
        int C = 1000;
        #pragma omp parallel firstprivate(B) copyin(A)   // copyin(A) can be omitted here
        {
            // copyprivate(C) is not allowed: C is shared
            #pragma omp single copyprivate(A) copyprivate(B)
            {
                A = 10;
                B = 20;
            }
            printf("Initial A = %d\n", A);   // 10 for all threads
            printf("Initial B = %d\n", B);   // 20 for all threads
        }
        printf("Global A: %d\n", A);   // 10
        printf("Global B: %d\n", B);   // 100. B is still 100! Not affected by the region.
        return 0;
    }
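The same broadcast can be verified without relying on printed output. This is a minimal sketch, assuming a firstprivate variable and a value different from its initial -1; the function name broadcast_mismatches and the variable token are hypothetical.

```c
/* Hypothetical helper: one thread sets its private token inside a single
   block, copyprivate broadcasts it, and every thread then checks its own
   copy. Returns the number of threads that did not receive the value. */
int broadcast_mismatches(int value)
{
    int mismatches = 0;
    int token = -1;
    #pragma omp parallel firstprivate(token) reduction(+:mismatches)
    {
        #pragma omp single copyprivate(token)
        {
            token = value;   /* executed by exactly one thread */
        }
        /* the broadcast has completed once the single block ends */
        if (token != value)
        {
            mismatches++;
        }
    }
    return mismatches;
}
```

The function should return 0 for any team size. Without the copyprivate clause, only the thread that executed the single block would see the new value and the others would still hold -1.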