1. Programs such as TensorFlow run on NVIDIA GPUs, so you need to monitor how the GPU is behaving while they run
Using the nvidia-smi command, the following is displayed:
Interpreting the nvidia-smi output:
GPU: GPU number in this machine, 0,1,2, etc.
Name: GPU model, e.g. GTX 1080, Tesla K80, etc.
Persistence-M: Persistence mode state. Persistence mode consumes more power, but new GPU applications start faster when it is on. In the output here it is Off
Fan: Fan speed, from 0 to 100%. This is the speed the computer expects the fan to run at; if the fan is physically blocked, it may not reach the displayed speed. Some devices do not report a fan speed because they do not rely on fan cooling and are kept cool by other external equipment
Temp: Temperature, in degrees Celsius
Perf: Performance state, from P0 to P12. P0 is maximum performance and P12 is minimum performance
Pwr:Usage/Cap: Power consumption, shown as current usage against the power cap
Bus-Id: The GPU's PCI bus ID
Disp.A: Display Active, which indicates whether a display is initialized on the GPU
Memory-Usage: Video memory usage (used / total)
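GPU-Util: GPU utilization. The word "Volatile" printed above it in the header belongs to the Uncorr. ECC column
Volatile Uncorr. ECC: Count of uncorrected ECC (error-correcting code) memory errors since the driver was last loaded
Compute M.: Compute mode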
Processes shows the memory usage per process on each GPU.
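The fields above can also be read programmatically. The following is a minimal sketch, assuming nvidia-smi is on the PATH and supports its `--query-gpu` CSV output; the helper names `parse_gpu_csv` and `query_gpus` and the sample line are made up for illustration and are not part of any NVIDIA tool.

```python
import csv
import io
import subprocess

# Field names accepted by `nvidia-smi --query-gpu`; they correspond to the
# columns of the summary table described above.
FIELDS = ["index", "name", "fan.speed", "temperature.gpu", "pstate",
          "power.draw", "power.limit", "memory.used", "memory.total",
          "utilization.gpu"]


def parse_gpu_csv(text):
    """Turn `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(FIELDS, (v.strip() for v in row))) for row in rows if row]


def query_gpus():
    """Run nvidia-smi and return the parsed rows (requires an NVIDIA driver)."""
    cmd = ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
           "--format=csv,noheader,nounits"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


# Hypothetical sample line, invented for illustration (not a real measurement).
SAMPLE = "0, Tesla K80, [N/A], 35, P0, 60.12, 149.00, 11439, 11441, 97\n"
```

On a machine with a GPU, `query_gpus()` returns one dictionary per device; on a machine without one, `parse_gpu_csv(SAMPLE)` shows the same structure. Values are kept as strings because unsupported fields are reported as text rather than numbers.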
Note: Video memory usage and GPU utilization are two different things. A graphics card consists of a GPU and video memory, and the relationship between the two is similar to the relationship between a CPU and RAM.
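To make the distinction concrete, here is a small sketch that reports the two figures separately. The function name `summarize` and the readings are hypothetical. The readings describe a case that can occur in practice: a framework such as TensorFlow may reserve most of the video memory up front while the GPU itself is doing no work.

```python
def summarize(mem_used_mib, mem_total_mib, gpu_util_pct):
    """Report memory occupancy and compute utilization as separate figures."""
    mem_pct = 100.0 * mem_used_mib / mem_total_mib
    return "memory {:.0f}% used, GPU {:.0f}% busy".format(mem_pct, gpu_util_pct)


# Hypothetical readings: memory almost full, GPU idle.
print(summarize(11000, 11441, 0))  # prints "memory 96% used, GPU 0% busy"
```

High memory usage therefore does not by itself mean the GPU is busy, and a busy GPU does not imply that memory is full.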
nvidia-smi -L command: Lists all available NVIDIA devices
Shown below:
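The device list printed by `nvidia-smi -L` normally has one line per GPU of the form `GPU <index>: <name> (UUID: <uuid>)`. A minimal sketch for parsing it follows; the helper name `parse_gpu_list` and the sample text (including its placeholder UUIDs) are assumptions made for illustration.

```python
import re

# One device per line, e.g. "GPU 0: Tesla K80 (UUID: GPU-...)".
LINE_RE = re.compile(r"^GPU (\d+): (.+?) \(UUID: ([^)]+)\)\s*$")


def parse_gpu_list(text):
    """Parse `nvidia-smi -L` output into a list of device dicts."""
    devices = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # skip anything that is not a device line
            devices.append({"index": int(match.group(1)),
                            "name": match.group(2),
                            "uuid": match.group(3)})
    return devices


# Hypothetical sample with placeholder UUIDs, not output from a real machine.
SAMPLE_LIST = (
    "GPU 0: Tesla K80 (UUID: GPU-00000000-0000-0000-0000-000000000000)\n"
    "GPU 1: Tesla K80 (UUID: GPU-11111111-1111-1111-1111-111111111111)\n"
)
```

`len(parse_gpu_list(...))` then gives the number of GPUs visible to the driver.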
watch -n 10 nvidia-smi command: Displays the GPU status periodically; 10 means refresh every 10 seconds
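The same periodic sampling can be done from a script when the readings need to be logged rather than watched. This is a sketch only: `poll` is a hypothetical helper, and the `sample` argument stands for whatever function actually reads the GPU (for example, one that runs nvidia-smi). nvidia-smi also has its own loop option, `-l <seconds>`, which may be enough when no scripting is needed.

```python
import time


def poll(sample, interval_s=10, count=3, sleep=time.sleep):
    """Call sample() `count` times, waiting interval_s seconds between calls."""
    readings = []
    for i in range(count):
        readings.append(sample())
        if i < count - 1:  # no need to wait after the last reading
            sleep(interval_s)
    return readings
```

The `sleep` parameter is injectable so that the loop can be exercised without actually waiting, which is a design choice made for this sketch rather than anything required by nvidia-smi.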
2. Commands for viewing CPU usage
See: Ubuntu View System resource consumption (memory, CPU and process)
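As a rough illustration of what CPU usage tools compute, the sketch below derives a CPU utilization percentage from two snapshots of the aggregate `cpu` line in `/proc/stat` on Linux. The function names and the sample numbers are hypothetical; the referenced article may well use different commands.

```python
def cpu_times(stat_text):
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            # user nice system idle iowait irq softirq steal
            # (guest fields are left out because user/nice already include them)
            values = [int(v) for v in parts[1:9]]
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            return idle, sum(values)
    raise ValueError("no aggregate cpu line found")


def cpu_percent(before, after):
    """Percentage of non-idle time between two (idle, total) snapshots."""
    idle = after[0] - before[0]
    total = after[1] - before[1]
    return 100.0 * (total - idle) / total if total else 0.0


# Hypothetical snapshots, invented for illustration.
BEFORE = cpu_times("cpu  100 0 50 800 50 0 0 0 0 0\n")
AFTER = cpu_times("cpu  160 0 90 900 50 0 0 0 0 0\n")
```

In real use, the two snapshots would come from reading `/proc/stat` twice with a short pause in between and passing the results to `cpu_percent`.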
Monitor GPU and CPU usage under Linux