First, look at the following piece of code:
#include <stdlib.h>
#include <stdio.h>
#include <math.h>   /* fabs */
#define LEN 5
int main(int argc, char **argv) {
    int i;
    float x = 2;
    float arr[LEN];
    #pragma offload target(mic) out(arr)
    for (i = 0; i < LEN; i++) {
        arr[i] = i * 3.0f / x;
    }
    if (fabs(arr[2] - 2 * 3.0f / x) < 1e-6)
        printf("Demo is right\n");
    else
        printf("Demo is wrong, arr[2] is %f\n", arr[2]);
    return 0;
}
out is the keyword that appears here. It tells the compiler that the variables or arrays inside the parentheses are outputs: when the code leaves the MIC card, the driver automatically copies their contents to the corresponding location in host memory. The in, inout, and nocopy keywords work in a similar way.
in: Input. Allocates space on the device side and copies the host-side data to the device side.
out: Output. Allocates space on the device side and copies the data from the device side to the host side when leaving the device side.
inout: Input and output. Allocates space on the device side, copies the host-side data to the device side when entering the device side, and copies the data from the device side back to the host side when leaving it.
nocopy: No copy. Only allocates space; no data is copied.
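As a minimal sketch of how the transfer directions divide the work, the hypothetical function below (scale_sum is an illustrative name, not from the original) marks one array as in, one as out, and a scalar as inout. The pragma is guarded by __INTEL_OFFLOAD, which is assumed here to be defined only by an offload-enabled Intel compiler, so any other compiler simply runs the loop on the host.

```c
#include <assert.h>
#include <stddef.h>

#define N 4

/* Hypothetical helper, not from the original article.
 * Multiplies 1..N by factor, stores the products in result,
 * and returns their sum. */
float scale_sum(float factor, float result[N])
{
    float src[N];      /* only read on the device:    in    */
    float dst[N];      /* only written on the device: out   */
    float sum = 0.0f;  /* read and updated:           inout */
    int i;

    assert(result != NULL);
    for (i = 0; i < N; i++)
        src[i] = (float)(i + 1);

#ifdef __INTEL_OFFLOAD   /* assumption: set only when offload is enabled */
    #pragma offload target(mic) in(src) out(dst) inout(sum)
#endif
    for (i = 0; i < N; i++) {
        dst[i] = src[i] * factor;
        sum += dst[i];
    }

    /* Hand the products back to the caller. */
    for (i = 0; i < N; i++)
        result[i] = dst[i];
    return sum;
}
```

Because the loop body is identical with or without the pragma, the same source can be checked on a machine that has no MIC card.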
(1) There can be zero or more transfer keywords. When there are several, they can be written one after another or separated by commas or spaces. The same transfer keyword can be used more than once in one offload statement, but the same variable name cannot appear more than once in one offload statement (not even as a parameter of different keywords).
(2) Each transfer keyword is followed by parentheses, and the parameters inside the parentheses are C/C++ variable names.
(3) A variable must be an array name, a pointer (specifically, a pointer to a dynamic array), or an ordinary scalar variable. Multiple variables are separated by commas.
(4) When the variable is a pointer, it can only point to non-pointer data; that is, pointers to pointers are not supported.
(5) When the variable is an array or a pointer to an array, you can specify the start position and the length of the part to transfer.
(6) When the variable is a pointer, append :length(len) after the variable name, where len is the number of elements in the dynamic array. If several dynamic arrays have the same number of elements, the length can be written once, for example: in(a, b, c : length(20)). The number of elements can be a variable.
(7) Besides length, there are five more modifiers: alloc_if, free_if, align, alloc, and into. Modifiers are separated from the variable by a colon (a transfer keyword uses only one colon).
(8) alloc_if and free_if take a conditional expression whose result must be Boolean. If the alloc_if parameter is true, space for the preceding variables is allocated when entering the device side; if the free_if parameter is true, that space is freed when leaving the device side (example below).
(9) The parameter of align is a positive integer that must be a power of 2. It means that the space allocated for the preceding variables on the device side is aligned to the boundary given by the parameter.
(10) The parameter of into is a variable or array name, and the mapping must be one-to-one: one array is copied from the host side into a different array on the device side, or vice versa. into can be combined with alloc, alloc_if, and free_if, but it cannot be used with inout or nocopy.
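Several of these rules can be seen together in a short sketch. The hypothetical vec_add below (the name and signature are assumptions for illustration) lets two pointers share one length modifier, uses a variable as the element count, and names each variable only once. The align(64) value is only an example of a power of 2, and the __INTEL_OFFLOAD guard is assumed to be defined only by an offload-enabled Intel compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: c[i] = a[i] + b[i] for n elements. */
void vec_add(float *a, float *b, float *c, int n)
{
    int i;

    assert(a != NULL && b != NULL && c != NULL && n >= 0);

#ifdef __INTEL_OFFLOAD   /* assumption: set only when offload is enabled */
    /* a and b share one length modifier; the count n is a variable.
     * Each variable name appears in exactly one clause. */
    #pragma offload target(mic) \
        in(a, b : length(n) align(64)) \
        out(c : length(n) alloc_if(1) free_if(1))
#endif
    for (i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```

Writing alloc_if(1) free_if(1) explicitly here just makes the allocate-on-entry and free-on-exit behavior visible in the clause.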
Here are some code snippets:
// The following two lines only allocate memory on the device side
#pragma offload target(mic) nocopy(p : length(SZ) alloc_if(1) free_if(0))
{}
// nocopy: no data is copied from the host side on entry, and none is copied back to the host side when the snippet exits
// p : length(SZ): the array named p has SZ elements (note that length is the number of elements, not the size of the array)
// p: must be declared on the host side beforehand; declaring only a pointer without allocating space is enough. If it is not declared, the MIC side does not know the type of p
// alloc_if(1): allocate memory
// free_if(0): do not free the memory when the offload snippet exits

// The following two lines copy data from the host side to the device side
#pragma offload target(mic) in(p : length(SZ) alloc_if(0) free_if(0))
{ /* use the array p here */ }
// in: copy data from the host side to the device side
// p was already referenced in the previous offload snippet and its memory has not been freed, so it can be used directly
// alloc_if(0): do not allocate memory, because the previous snippet did not free it

// The following two lines perform no action that changes the memory allocated for p
#pragma offload target(mic) nocopy(p)
{ /* use p here */ }
// With no modifiers given, nothing is copied in or out and no space is allocated or freed; no array length needs to be specified for this kind of use

// The following two lines copy the data out and free the memory
#pragma offload target(mic) out(p : length(SZ) alloc_if(0) free_if(1))
{ /* use p here */ }
// out: copy data from the device side to the host side on exit
// alloc_if(0): do not allocate memory (it has not been freed since it was allocated)
// free_if(1): free the memory on exit
There is one special case: alloc_if and free_if are ignored if the pointer being transferred points to a static variable on the CPU and that variable is declared with __declspec(target(mic)).
For the in/out/inout clauses there is also a very practical syntax that transfers only part of an array. For example, consider the following code:
1  typedef int ARRAY[10][10];
2  int a[1000][500];
3  int *p;
4  ARRAY *q;
5  int *r[10][10];
6  int i, j;
7  struct { int y; } x;
8  #pragma offload ... in(a)
9  #pragma offload ... out(a[i:j][:])
10 #pragma offload ... in(p[0:100])
11 #pragma offload ... in((*q)[5][:])
12 #pragma offload ... out(x.y)
An in/out clause can refer to just one part of an array, with each dimension written in "[]". Line 8 is the most common form: it transfers all the data of array a. Line 9 transfers part of array a. The [i:j] describes the first dimension, where i is the start position in that dimension and j is the number of elements. The brackets for the second dimension contain only a colon, with the start and length omitted, which means the whole second dimension is transferred. The content transferred is therefore a[i][0]~a[i+j-1][499]. As this line shows, the section parameters (i, j) can be variables. Line 10 transfers 100 elements of the array that p points to, starting from element 0; it shows that the "[]" array form can be used even when the variable is a pointer to a dynamic array. In line 11 the first dimension has only one parameter, 5, meaning that only one element of the first dimension is selected, so the line transfers (*q)[5][0]~(*q)[5][9] (ARRAY is a synonym for int[10][10], and q is a pointer to int[10][10]). Line 12 shows that a single member of a struct can be transferred.
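To make the a[i:j][:] arithmetic concrete, here is a small host-only sketch. The helper section_span is hypothetical; it assumes row-major storage and a second dimension of 500 elements, which matches the last column index 499 given in the text.

```c
#include <assert.h>
#include <stddef.h>

#define COLS 500   /* second dimension of a; the text gives the last column as 499 */

/* Hypothetical helper: flat (row-major) index of the first and last
 * element covered by the section a[i:j][:]. */
void section_span(size_t i, size_t j, size_t *first, size_t *last)
{
    assert(j > 0 && first != NULL && last != NULL);
    *first = i * COLS;                          /* index of a[i][0]          */
    *last  = (i + j - 1) * COLS + (COLS - 1);   /* index of a[i+j-1][499]    */
}
```

For i = 2 and j = 3 the section covers rows 2 through 4, which is 3 * 500 = 1500 consecutive elements.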
This syntax makes it easy to transfer part of an array, which saves transfer time and keeps code changes small. Note, however, that although only part of the array is transferred, the memory allocated on the MIC card covers everything from the first element of the array onward. On the one hand, this means the syntax does not reduce memory consumption. On the other hand, the device code must index the array as if it were the whole array. In other words, write the code exactly as you would if the offload statement were ignored and the program ran on the CPU; this is by design, so that you do not have to maintain two versions of the code. For example, given an array p[100] and the clause in(p[2:10]), the MIC side allocates space for 12 elements (p[0]-p[11]) and copies p[2]-p[11] from host memory to MIC memory, so the first valid element is still written p[2], not p[0].
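The p[2:10] example can be modeled on the host alone. The sketch below is a hypothetical model of the behavior described above, not the actual offload runtime: model_section_in allocates a buffer that starts at element 0, copies only the requested section, and leaves the same indices valid.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical host-only model of in(p[start:count]) as described in
 * the text: the "device" buffer spans p[0] .. p[start+count-1], but
 * only the requested section is copied. The untransferred prefix is
 * zeroed here purely to keep the model deterministic.
 * Returns NULL if allocation fails; the caller frees the result. */
int *model_section_in(const int *p, size_t start, size_t count)
{
    size_t total = start + count;
    int *dev;

    assert(p != NULL);
    dev = calloc(total ? total : 1, sizeof *dev);
    if (dev == NULL)
        return NULL;
    memcpy(dev + start, p + start, count * sizeof *dev);
    return dev;
}
```

With start = 2 and count = 10 the model allocates 12 elements, and the first element that carries host data is dev[2], mirroring the p[2] indexing rule.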
There are also two keywords for this usage, namely alloc and into.
As mentioned earlier, the syntax for transferring part of an array allocates the whole memory range (at least from the first element onward). Sometimes there is no need to allocate that much space, so you can use the alloc modifier to limit the range that is allocated, for example:
#pragma offload ... in(p[10:100] : alloc(p[5:1000]))
This offload statement first allocates an array of 1000 elements for p on the device side, whose valid subscripts start at 5, that is, 5~1004. Then 100 elements starting at host-side p[10], i.e. p[10]-p[109], are copied to device-side p[10]-p[109]. Checking that subscripts stay in bounds is the programmer's responsibility.
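Since bounds checking is left to the programmer, a small host-side check can help. The helper below is a hypothetical sketch (section_fits is not part of any offload API) that tests whether a transferred section lies inside the range named in alloc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the section [start, start+count)
 * lies inside the allocation [alloc_start, alloc_start+alloc_count),
 * and 0 otherwise. */
int section_fits(size_t start, size_t count,
                 size_t alloc_start, size_t alloc_count)
{
    assert(count > 0 && alloc_count > 0);
    return start >= alloc_start &&
           start + count <= alloc_start + alloc_count;
}
```

For the statement above, section_fits(10, 100, 5, 1000) holds because p[10]-p[109] falls inside the valid subscripts 5~1004, whereas a section starting at p[0] would not fit.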
The into modifier copies part of a host array into a different array on the device, and vice versa. For example:
#pragma offload ... in (P[0:500]:into (p1[500:500]))
This offload statement copies the 500 elements starting at host-side p[0] to device-side p1[500]-p1[999].
With this method the programmer is responsible for correctness, especially when destinations overlap, as in:
#pragma offload ... in(p[0:600] : into(p1[0:600])) in(p2[0:400] : into(p1[100:400]))
Here the destinations of the two transfers into array p1 overlap: one is 0~599 and the other is 100~499, so both write to positions 100~499. The result is undefined, because the order of multiple transfers within the same offload statement is not guaranteed.
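A quick way to catch this kind of mistake before writing the pragma is to test the destination ranges on the host. The helper below is a hypothetical sketch, not part of the offload runtime; it treats each destination as a half-open range given by start index and element count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if [s1, s1+n1) and [s2, s2+n2)
 * share at least one index, and 0 otherwise. */
int sections_overlap(size_t s1, size_t n1, size_t s2, size_t n2)
{
    assert(n1 > 0 && n2 > 0);
    return s1 < s2 + n2 && s2 < s1 + n1;
}
```

The destinations 0~599 and 100~499 from the text overlap, while two destinations such as p1[0:500] and p1[500:500] do not.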
Note that into is not just a plain memory copy, so data cannot be passed between arrays with different numbers of dimensions. For example:
// Error
int rank1[1000], rank2[10][100];
#pragma offload ... out(rank1 : into(rank2))
Because rank1 and rank2 have different numbers of dimensions, data cannot be passed directly between them.
Data transmission in the mic