CUDA 4: Device Management

NVIDIA provides centralized ways to query and manage GPU devices. Knowing how to query GPU information is important because it helps you choose an appropriate kernel execution configuration.

This blog will mainly introduce the following two approaches:

  • the CUDA runtime API functions
  • the NVIDIA system management command line (nvidia-smi)
Use the runtime API to query GPU Information

You can use the following function to query all GPU device information:

cudaError_t cudaGetDeviceProperties(cudaDeviceProp *prop, int device);

GPU information is stored in the cudaDeviceProp struct.

Code
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char **argv) {
    printf("%s Starting...\n", argv[0]);

    int deviceCount = 0;
    cudaError_t error_id = cudaGetDeviceCount(&deviceCount);
    if (error_id != cudaSuccess) {
        printf("cudaGetDeviceCount returned %d\n-> %s\n",
               (int)error_id, cudaGetErrorString(error_id));
        printf("Result = FAIL\n");
        exit(EXIT_FAILURE);
    }
    if (deviceCount == 0) {
        printf("There are no available device(s) that support CUDA\n");
    } else {
        printf("Detected %d CUDA Capable device(s)\n", deviceCount);
    }

    int dev = 0, driverVersion = 0, runtimeVersion = 0;
    cudaSetDevice(dev);
    cudaDeviceProp deviceProp;
    cudaGetDeviceProperties(&deviceProp, dev);
    printf("Device %d: \"%s\"\n", dev, deviceProp.name);

    cudaDriverGetVersion(&driverVersion);
    cudaRuntimeGetVersion(&runtimeVersion);
    printf("  CUDA Driver Version / Runtime Version          %d.%d / %d.%d\n",
           driverVersion / 1000, (driverVersion % 100) / 10,
           runtimeVersion / 1000, (runtimeVersion % 100) / 10);
    printf("  CUDA Capability Major/Minor version number:    %d.%d\n",
           deviceProp.major, deviceProp.minor);
    printf("  Total amount of global memory:                 %.2f GBytes (%llu bytes)\n",
           (float)deviceProp.totalGlobalMem / pow(1024.0, 3),
           (unsigned long long)deviceProp.totalGlobalMem);
    printf("  GPU Clock rate:                                %.0f MHz (%0.2f GHz)\n",
           deviceProp.clockRate * 1e-3f, deviceProp.clockRate * 1e-6f);
    printf("  Memory Clock rate:                             %.0f MHz\n",
           deviceProp.memoryClockRate * 1e-3f);
    printf("  Memory Bus Width:                              %d-bit\n",
           deviceProp.memoryBusWidth);
    if (deviceProp.l2CacheSize) {
        printf("  L2 Cache Size:                                 %d bytes\n",
               deviceProp.l2CacheSize);
    }
    printf("  Max Texture Dimension Size (x,y,z)             1D=(%d), 2D=(%d,%d), 3D=(%d,%d,%d)\n",
           deviceProp.maxTexture1D,
           deviceProp.maxTexture2D[0], deviceProp.maxTexture2D[1],
           deviceProp.maxTexture3D[0], deviceProp.maxTexture3D[1],
           deviceProp.maxTexture3D[2]);
    printf("  Max Layered Texture Size (dim) x layers        1D=(%d) x %d, 2D=(%d,%d) x %d\n",
           deviceProp.maxTexture1DLayered[0], deviceProp.maxTexture1DLayered[1],
           deviceProp.maxTexture2DLayered[0], deviceProp.maxTexture2DLayered[1],
           deviceProp.maxTexture2DLayered[2]);
    printf("  Total amount of constant memory:               %lu bytes\n",
           deviceProp.totalConstMem);
    printf("  Total amount of shared memory per block:       %lu bytes\n",
           deviceProp.sharedMemPerBlock);
    printf("  Total number of registers available per block: %d\n",
           deviceProp.regsPerBlock);
    printf("  Warp size:                                     %d\n",
           deviceProp.warpSize);
    printf("  Maximum number of threads per multiprocessor:  %d\n",
           deviceProp.maxThreadsPerMultiProcessor);
    printf("  Maximum number of threads per block:           %d\n",
           deviceProp.maxThreadsPerBlock);
    printf("  Maximum sizes of each dimension of a block:    %d x %d x %d\n",
           deviceProp.maxThreadsDim[0], deviceProp.maxThreadsDim[1],
           deviceProp.maxThreadsDim[2]);
    printf("  Maximum sizes of each dimension of a grid:     %d x %d x %d\n",
           deviceProp.maxGridSize[0], deviceProp.maxGridSize[1],
           deviceProp.maxGridSize[2]);
    printf("  Maximum memory pitch:                          %lu bytes\n",
           deviceProp.memPitch);

    exit(EXIT_SUCCESS);
}

 

Compile and run:

$ nvcc checkDeviceInfor.cu -o checkDeviceInfor
$ ./checkDeviceInfor

Output:

./checkDeviceInfor Starting...
Detected 2 CUDA Capable device(s)
Device 0: "Tesla M2070"
  CUDA Driver Version / Runtime Version          5.5 / 5.5
  CUDA Capability Major/Minor version number:    2.0
  Total amount of global memory:                 5.25 GBytes (5636554752 bytes)
  GPU Clock rate:                                1147 MHz (1.15 GHz)
  Memory Clock rate:                             1566 MHz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                 786432 bytes
  Max Texture Dimension Size (x,y,z)             1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
  Max Layered Texture Size (dim) x layers        1D=(16384) x 2048, 2D=(16384,16384) x 2048
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
Determine the best GPU

On systems with multiple GPUs, we need to choose one of them as our device. One way to pick the GPU with the best compute performance is by its number of multiprocessors; you can use the following code to select such a GPU.

int numDevices = 0;
cudaGetDeviceCount(&numDevices);
if (numDevices > 1) {
    int maxMultiprocessors = 0, maxDevice = 0;
    for (int device = 0; device < numDevices; device++) {
        cudaDeviceProp props;
        cudaGetDeviceProperties(&props, device);
        if (maxMultiprocessors < props.multiProcessorCount) {
            maxMultiprocessors = props.multiProcessorCount;
            maxDevice = device;
        }
    }
    cudaSetDevice(maxDevice);
}
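Note that the snippet above calls the runtime API without checking return codes. A common pattern (a minimal sketch, not the only way) is to wrap every runtime call in a CHECK macro that reports the failing file and line:

```cuda
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Abort with a descriptive message if a runtime call fails.
#define CHECK(call)                                                \
do {                                                               \
    cudaError_t err = (call);                                      \
    if (err != cudaSuccess) {                                      \
        fprintf(stderr, "CUDA error %s:%d: %s\n",                  \
                __FILE__, __LINE__, cudaGetErrorString(err));      \
        exit(EXIT_FAILURE);                                        \
    }                                                              \
} while (0)

int main(void) {
    int numDevices = 0;
    CHECK(cudaGetDeviceCount(&numDevices));
    printf("Detected %d device(s)\n", numDevices);
    return 0;
}
```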

Use nvidia-smi to query GPU Information

nvidia-smi is a command-line tool for managing GPU devices; it lets you query and change device state.

nvidia-smi is very useful. For example, the following command lists all GPUs:

$ nvidia-smi -L
GPU 0: Tesla M2070 (UUID: GPU-68df8aec-e85c-9934-2b81-0c9e689a43a7)
GPU 1: Tesla M2070 (UUID: GPU-382f23c1-5160-01e2-3291-ff9628930b70)

Then you can use the following command to query detailed information about GPU 0:

$ nvidia-smi -q -i 0

The -d option narrows the detailed query to one of the following sections, streamlining the information nvidia-smi displays:

  • MEMORY
  • UTILIZATION
  • ECC
  • TEMPERATURE
  • POWER
  • CLOCK
  • COMPUTE
  • PIDS
  • PERFORMANCE
  • SUPPORTED_CLOCKS
  • PAGE_RETIREMENT
  • ACCOUNTING

For example, display only device memory information:

$ nvidia-smi -q -i 0 -d MEMORY | tail -n 5
    Memory Usage
        Total : 5375 MB
        Used  : 9 MB
        Free  : 5366 MB
Set device

For multi-GPU systems, you can use nvidia-smi to view the attributes of each GPU. GPUs are numbered starting from 0, and the environment variable CUDA_VISIBLE_DEVICES lets you control which GPUs an application sees without modifying the application.

You can set CUDA_VISIBLE_DEVICES=2 to hide the other GPUs so that only GPU 2 can be used. You can also expose multiple GPUs, e.g. CUDA_VISIBLE_DEVICES=2,3; inside the application their device IDs become 0 and 1 respectively.
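For example, to restrict a run to physical GPUs 2 and 3 (assuming the checkDeviceInfor binary built earlier; on a 2-GPU machine you would use valid IDs such as 0,1):

```shell
# Only physical GPUs 2 and 3 are visible; the program enumerates them as 0 and 1.
export CUDA_VISIBLE_DEVICES=2,3
echo "$CUDA_VISIBLE_DEVICES"
# ./checkDeviceInfor   # the program would now detect only these two devices
```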

 

Download Code: CodeSamples.zip
