In image-processing programs you will often see a statement such as #pragma unroll 4. (In DirectX shaders you may instead see [unroll(3)] for (int i = 0; i < total; i++).) Both are explained in detail below.
Syntax:
#pragma unroll (n)
#pragma unroll (n) tells the compiler to unroll the loop that follows n times (in effect, I think it tells the compiler that unrolling the loop n times is safe). This makes software pipelining more likely for loops that the compiler cannot easily unroll on its own.
In many cases the compiler can work this out automatically by analyzing the code, but that can introduce redundant overhead; it is better for us, as optimization engineers, to tell the compiler directly what we already know.
Example:
int jackerytest[160];
#pragma unroll (4)
for (int i = 0; i < 160; i++)
{
    jackerytest[i] = i;
}
Keep in mind that on a GPU, operations on pixels are executed in parallel, which is why you see this technique used in shaders to improve execution efficiency. The code above tells the compiler that unrolling the loop 4 times and executing the copies in parallel is safe. If the compiler's software pipelining kicks in successfully, and if we ignore the filling and draining of the software pipeline, the code above is equivalent to the following code, with the four statements in the loop body executing in parallel:
for (int i = 0; i < 160; i += 4)
{
    jackerytest[i]     = i;      // parallel
    jackerytest[i + 1] = i + 1;  // parallel
    jackerytest[i + 2] = i + 2;  // parallel
    jackerytest[i + 3] = i + 3;  // parallel
}
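To check that the unrolled form really computes the same result as the original loop, the two versions can be written as ordinary C functions and compared. This is only a minimal sketch: the function names fill_rolled, fill_unrolled4, and unroll_matches are hypothetical, the unrolling is done by hand rather than by a pragma, and the "parallel" execution described above is up to the compiler and hardware, not visible in the C source.

```c
#include <assert.h>
#include <string.h>

#define N 160  /* trip count from the example above; a multiple of 4 */

/* Rolled loop: the form the programmer writes. */
static void fill_rolled(int *dst)
{
    for (int i = 0; i < N; i++) {
        dst[i] = i;
    }
}

/* Hand-unrolled by 4: roughly what an unroll factor of 4 asks the
   compiler to produce (assumption: no remainder loop is needed). */
static void fill_unrolled4(int *dst)
{
    assert(N % 4 == 0);  /* the sketch relies on N being a multiple of 4 */
    for (int i = 0; i < N; i += 4) {
        dst[i]     = i;
        dst[i + 1] = i + 1;
        dst[i + 2] = i + 2;
        dst[i + 3] = i + 3;
    }
}

/* Returns 1 when both versions fill the array identically. */
static int unroll_matches(void)
{
    int a[N], b[N];
    fill_rolled(a);
    fill_unrolled4(b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling unroll_matches() returns 1, since the four statements in the unrolled body write exactly the elements that four consecutive iterations of the rolled loop would write.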
Notes:
(1) The loop's iteration count should be an integer multiple of n.
(2) In practice, #pragma unroll is generally used together with #pragma MUST_ITERATE, which tells the compiler more fully what we know so that it can enable software pipelining effectively.
(3) #pragma unroll (1) tells the compiler not to unroll the loop.
(4) Do not apply multiple #pragma MUST_ITERATE statements to the same loop; if you do, the compiler is not guaranteed to honor them.
(5) #pragma unroll (n) takes effect only when the optimizer is enabled with a compile option such as -O1, -O2, or -O3; without optimization the setting is ignored.
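Note (1) matters because an unrolled body processes n elements per pass. When the iteration count is not a multiple of n, the leftover iterations have to be handled separately. The following is a minimal hand-written sketch of that idea for n = 4, not the code any particular compiler emits; the function name sum_unrolled4 is hypothetical.

```c
#include <assert.h>

/* Sums the first n elements of src. The main loop is unrolled by 4;
   a cleanup loop handles the remaining n % 4 elements. */
static long sum_unrolled4(const int *src, int n)
{
    assert(n >= 0);

    long total = 0;
    int i = 0;
    int limit = n - (n % 4);  /* largest multiple of 4 that is <= n */

    /* Main loop: four elements per pass. */
    for (; i < limit; i += 4) {
        total += src[i];
        total += src[i + 1];
        total += src[i + 2];
        total += src[i + 3];
    }

    /* Cleanup loop: at most three leftover elements. */
    for (; i < n; i++) {
        total += src[i];
    }

    return total;
}
```

When n is already a multiple of 4, the cleanup loop runs zero times, which is the situation note (1) asks for: the compiler needs no extra remainder code and the unrolled loop covers every iteration.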