A brief description of thread task scheduling in OpenMP

Source: Internet
Author: User
Tags: printf, thread

In OpenMP, task scheduling mainly concerns parallel for loops. When the iterations of a loop do unequal amounts of work, simply assigning the same number of iterations to each thread can leave the threads unevenly loaded, hurting the overall performance of the program.

In the following code, if the iterations are divided evenly among the threads, some threads finish early and others finish late:

#include <stdio.h>
#include <omp.h>

int main() {
    int a[100][100] = {0};
    /* Iteration i runs the inner loop 100 - i times, so early
     * iterations do much more work than late ones. */
#pragma omp parallel for
    for (int i = 0; i < 100; i++) {
        for (int j = i; j < 100; j++)
            a[i][j] = (i % 7) * (j % 13) % 23;
    }
    return 0;
}

For this reason, OpenMP provides the schedule clause to control how loop iterations are assigned to threads.

The schedule clause:

schedule(type[, size])

The parameter type specifies the scheduling kind and can take one of four values: static, dynamic, guided, or runtime. runtime defers the choice of kind to run time; the actual scheduling strategies are only the first three.


The parameter size specifies the number of iterations handed out per scheduling decision (the chunk size) and must be a positive integer. It is optional, and it cannot be used when type is runtime.

1. Static scheduling (static)

Most compilers use static scheduling by default when no schedule clause is given. With static scheduling, which iterations are executed by which thread is determined at compile time.

When size is not specified, each of the t threads is assigned a block of ⌈n/t⌉ consecutive iterations. When size is specified, chunks of size iterations are dealt to the threads in round-robin order.

For example:

#include <stdio.h>
#include <omp.h>

int main() {
    // #pragma omp parallel for schedule(static, 5)
#pragma omp parallel for schedule(static)
    for (int i = 0; i < 100; i++) {
        printf("id=%d i=%d\n", omp_get_thread_num(), i);
    }
    return 0;
}

Running on a four-core machine:

(1) Without size, each thread gets 100/4 = 25 iterations: thread 0 executes iterations 0-24, thread 1 executes 25-49, thread 2 executes 50-74, and thread 3 executes 75-99.

(2) With size = 5, chunks of 5 iterations are dealt out round-robin, so iteration n (n = 0-99) is executed by thread (n/5) % 4.

2. Dynamic scheduling (dynamic)

Dynamic scheduling decides at run time which iterations each thread executes: whenever a thread finishes its assigned chunk, it requests the next one. Because thread start-up and execution times are unpredictable, the mapping of iterations to threads is not known in advance.

When size is not specified, iterations are handed to threads one at a time. When size is specified, size iterations are handed out at a time.

For example:

#include <stdio.h>
#include <omp.h>

int main() {
    // #pragma omp parallel for schedule(dynamic, 5)
#pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < 100; i++) {
        printf("id=%d i=%d\n", omp_get_thread_num(), i);
    }
    return 0;
}

3. Guided scheduling (guided)

Guided scheduling uses a heuristic: the number of iterations assigned to a thread varies from one request to the next, starting large and gradually decreasing.

The size parameter is the minimum chunk size: the chunks shrink with each assignment but never drop below size. If size is not specified, it defaults to 1, so chunks shrink all the way down to 1. Exactly which heuristic is used depends on the compiler; consult its documentation.

Summary of the three scheduling kinds:

Static scheduling (static): which thread executes which iterations is fixed at compile time. Because each thread's share is fixed, some threads may finish their iterations quickly while others lag behind, so the load balance may not be optimal.

Dynamic scheduling (dynamic): depending on how fast each thread runs, a thread that has finished its work automatically requests the next task or chunk; the chunk size is the same for every request.

Guided scheduling (guided): the chunks handed out start large and shrink, roughly exponentially. When there are many iterations left, large chunks are assigned; near the end, when little work remains, each thread receives only a few iterations, which balances the load across threads.
