Note: This material is fairly basic. Its main purpose is to analyze several easily confused OpenMP functions.
(1) Determining the number of threads in a parallel region:
First, a review of how OpenMP determines the number of threads in a parallel region. A parallel region is executed by a team of threads. How many threads should be allocated to that team?
The size of the thread team created when OpenMP encounters a parallel directive is determined by the following factors:
1. The result of the if clause
2. The num_threads clause
3. The omp_set_num_threads() library function
4. The OMP_NUM_THREADS environment variable
5. The implementation default (generally, the number of threads equals the number of processor cores)
(See http://blog.csdn.net/gengshenghong/article/details/6956878 for more information.)
Items 2, 3, and 4 are listed in decreasing order of priority: an earlier setting overrides a later one. The num_threads clause affects only the parallel region it is attached to, whereas omp_set_num_threads overrides the OMP_NUM_THREADS environment variable for all later parallel regions in the program run.
(2) Several easily confused OpenMP functions
1. omp_get_thread_num
Returns the thread number, that is, the thread's ID within its OpenMP team. Within a team, thread IDs are numbered in ascending order: 0, 1, 2, ...
Note: this function can be called either outside or inside a parallel region. Outside a parallel region, it returns the ID of the master thread, that is, 0. Inside a parallel region, each call returns the ID of the thread that executes the call.
This function is easy to understand; just do not confuse it with omp_get_num_threads below.
2. omp_get_num_threads/omp_set_num_threads
Set/get the number of threads. The set function is one of the factors that determine the size of the team created when a parallel directive is encountered. It overrides the OMP_NUM_THREADS environment variable.
Note: although the names suggest a matched set/get pair, their meanings are different. Calling get immediately after set does not necessarily return the value that was set; in most cases the two values are not equal!
First, omp_set_num_threads():
Functionally, it overrides the OMP_NUM_THREADS environment variable. In use, note that omp_set_num_threads should be called only outside a parallel region. If it is called inside a parallel region, a debug build prints "user error 1001: omp_set_num_threads should only be called in serial regions" to the console; in a release build the call should, in theory, be ignored. In short, call omp_set_num_threads in serial code to set the number of threads.
Next, omp_get_num_threads():
Returns the number of threads in the current team. If it is called outside a parallel region, it returns 1.
This statement describes the role of get clearly: it returns the number of threads in the current team. It is therefore generally called inside a parallel region, where the result is the number of threads actually running in that region as determined by the factors listed earlier, not the value that was set. It also follows that a call in a serial region returns 1 (so it is generally not called there).
Conclusion: omp_set_num_threads takes effect when called in a serial region. omp_get_num_threads returns the number of threads in the current team; it is generally called in a parallel region and returns 1 in a serial region. There is no fixed numerical relationship between the two functions!
3. omp_get_max_threads:
From its name, this function appears to "get the maximum number of threads". That is right, but what does this "maximum number of threads" mean? The OpenMP specification says:
The omp_get_max_threads routine returns an upper bound on the number of threads that could be used to form a new team if a parallel region without a num_threads clause were encountered after execution returns from this routine.
So this "maximum number" is the largest number of threads that OpenMP could use to form a new team. It follows that the value is deterministic and does not depend on whether the call is made in a parallel or a serial region, because it is the largest possible size of a "new" team in the current OpenMP environment. Put simply, the value is determined by three factors: omp_set_num_threads, the OMP_NUM_THREADS environment variable, and the implementation default.
Note: omp_get_max_threads can be called in a serial or a parallel region and returns the same result, provided omp_set_num_threads is not called in between. An effective call to omp_set_num_threads (that is, one made in a serial region) changes the value returned by omp_get_max_threads. In addition, the value returned by omp_get_max_threads may be smaller than the value passed to omp_set_num_threads.
4. Example:
The following example uses the functions above. Its output can be used to check the points made earlier:
#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* Serial region, default settings */
    printf("ID: %d, Max threads: %d, Num threads: %d \n",
           omp_get_thread_num(), omp_get_max_threads(), omp_get_num_threads());

    /* Serial region, after setting the thread count to 5 */
    omp_set_num_threads(5);
    printf("ID: %d, Max threads: %d, Num threads: %d \n",
           omp_get_thread_num(), omp_get_max_threads(), omp_get_num_threads());

    #pragma omp parallel num_threads(5)
    {
        // omp_set_num_threads(6); // Do not call it in parallel region
        printf("ID: %d, Max threads: %d, Num threads: %d \n",
               omp_get_thread_num(), omp_get_max_threads(), omp_get_num_threads());
    }

    /* Back in the serial region */
    printf("ID: %d, Max threads: %d, Num threads: %d \n",
           omp_get_thread_num(), omp_get_max_threads(), omp_get_num_threads());

    /* Serial region, after setting the thread count to 6 */
    omp_set_num_threads(6);
    printf("ID: %d, Max threads: %d, Num threads: %d \n",
           omp_get_thread_num(), omp_get_max_threads(), omp_get_num_threads());

    return 0;
}