After CUDA is installed, you can run deviceQuery to inspect the properties of your GPU. Knowing these properties will help with CUDA programming later on.
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);
    for (int dev = 0; dev < deviceCount; dev++) {
        cudaDeviceProp deviceProp;
        cudaGetDeviceProperties(&deviceProp, dev);
        /* Compute capability 9999.9999 marks the emulation device on old toolkits. */
        if (dev == 0 && deviceProp.major == 9999 && deviceProp.minor == 9999)
            printf("\n");
        printf("\nDevice %d: \"%s\"\n", dev, deviceProp.name);
        /* Memory sizes are size_t, so they are printed with %zu rather than %u. */
        printf("Total amount of global memory: %zu bytes\n", deviceProp.totalGlobalMem);
        printf("Number of multiprocessors: %d\n", deviceProp.multiProcessorCount);
        printf("Total amount of constant memory: %zu bytes\n", deviceProp.totalConstMem);
        printf("Total amount of shared memory per block: %zu bytes\n", deviceProp.sharedMemPerBlock);
        printf("Total number of registers available per block: %d\n", deviceProp.regsPerBlock);
        printf("Warp size: %d\n", deviceProp.warpSize);
        printf("Maximum number of threads per block: %d\n", deviceProp.maxThreadsPerBlock);
        printf("Maximum sizes of each dimension of a block: %d x %d x %d\n",
               deviceProp.maxThreadsDim[0], deviceProp.maxThreadsDim[1], deviceProp.maxThreadsDim[2]);
        printf("Maximum sizes of each dimension of a grid: %d x %d x %d\n",
               deviceProp.maxGridSize[0], deviceProp.maxGridSize[1], deviceProp.maxGridSize[2]);
        printf("Maximum memory pitch: %zu bytes\n", deviceProp.memPitch);
        printf("Texture pitch alignment: %zu bytes\n", deviceProp.texturePitchAlignment);
        /* clockRate is reported in kHz, so scaling by 1e-6 gives GHz. */
        printf("Clock rate: %.2f GHz\n", deviceProp.clockRate * 1e-6f);
    }
    printf("\nTest passed\n");
    getchar();
    return 0;
}
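The listing prints memory sizes as raw byte counts and converts the clock from kHz to GHz inline. As a minimal sketch, the two conversions could be pulled into small host-side helpers. The names bytes_to_mib and khz_to_ghz are hypothetical and are not part of the CUDA API.

```c
#include <stddef.h>

/* Hypothetical helper: convert a byte count (e.g. totalGlobalMem) to MiB. */
static double bytes_to_mib(size_t bytes)
{
    return (double)bytes / (1024.0 * 1024.0);
}

/* Hypothetical helper: convert a clock in kHz (the unit of clockRate) to GHz. */
static double khz_to_ghz(int khz)
{
    return (double)khz / 1e6;
}
```

For instance, printing bytes_to_mib(deviceProp.totalGlobalMem) with "%.0f MiB" would report a card with 2 GiB of global memory as 2048 MiB.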
The program first obtains the number of NVIDIA GPUs in the system with cudaGetDeviceCount, and then retrieves the properties of each GPU with cudaGetDeviceProperties.
The most direct way to view the properties is to set a breakpoint and inspect them in the debugger;
alternatively, print them to the console, as shown in the run result.
deviceProp.name is the GPU name; if there is no GPU, it reads "Device Emulation";
deviceProp.totalGlobalMem returns the size of global memory in bytes. When you process large data sets or large models, the global memory must be larger than the data, for example the 2 GB reported in the figure;
deviceProp.multiProcessorCount returns the number of streaming multiprocessors (SMs) in the device. The total number of streaming processors (SPs) is the number of SMs multiplied by the number of SPs per SM, which depends on the architecture: 64 on Pascal (GP100), 128 on Maxwell, 192 on Kepler, and 32 on Fermi (compute capability 2.0);
deviceProp.totalConstMem returns the size of constant memory, for example 64 KB;
deviceProp.sharedMemPerBlock returns the amount of shared memory available to a block; shared memory is faster than global memory;
deviceProp.regsPerBlock returns the number of registers available to a block;
deviceProp.warpSize returns the number of threads in a warp;
deviceProp.maxThreadsPerBlock returns the maximum number of threads in a block;
deviceProp.maxThreadsDim[] returns the maximum size of each of the three dimensions of a block;
deviceProp.maxGridSize[] returns the maximum size of each of the three dimensions of a grid;
deviceProp.memPitch returns the maximum pitch, in bytes, allowed by memory copy functions for memory allocated with cudaMallocPitch;
deviceProp.texturePitchAlignment returns the pitch alignment requirement, in bytes, for textures bound to pitched memory;
deviceProp.clockRate returns the GPU core clock frequency in kHz (the memory clock is reported separately in memoryClockRate).
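The properties above are mostly useful as limits to check against before launching work. The following is a minimal host-side sketch of three such checks: total SP count, whether a data set fits in global memory, and whether a block shape is legal. It assumes a hypothetical gpu_limits struct that mirrors a few cudaDeviceProp fields so that it compiles without the CUDA toolkit; the struct and function names are illustrative and not part of the CUDA API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a few cudaDeviceProp fields. */
typedef struct {
    size_t total_global_mem;    /* bytes, like totalGlobalMem */
    int multiprocessor_count;   /* SM count, like multiProcessorCount */
    int max_threads_per_block;  /* like maxThreadsPerBlock */
    int max_threads_dim[3];     /* like maxThreadsDim[] */
} gpu_limits;

/* Total SPs = SMs x SPs per SM. The per-SM figure depends on the
   architecture and is passed in by the caller. */
static int total_cores(const gpu_limits *g, int cores_per_sm)
{
    return g->multiprocessor_count * cores_per_sm;
}

/* True if a buffer of bytes_needed fits in global memory. This ignores
   memory already in use, so it is only a necessary condition. */
static bool fits_in_global_memory(const gpu_limits *g, size_t bytes_needed)
{
    return bytes_needed <= g->total_global_mem;
}

/* True if an x * y * z block respects both the per-dimension limits
   and the limit on the total number of threads per block. */
static bool block_is_valid(const gpu_limits *g, int x, int y, int z)
{
    if (x < 1 || y < 1 || z < 1)
        return false;
    if (x > g->max_threads_dim[0] ||
        y > g->max_threads_dim[1] ||
        z > g->max_threads_dim[2])
        return false;
    /* Widen before multiplying to avoid int overflow. */
    return (long long)x * y * z <= g->max_threads_per_block;
}
```

In a real program the struct would be filled from the cudaDeviceProp value returned by cudaGetDeviceProperties. Note that a block can satisfy every per-dimension limit and still be rejected because the product of its dimensions exceeds maxThreadsPerBlock.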