A CUDA program that keeps a CPU core at 100% is a bit of a headache. In my experiment I launch a kernel function and then call cudaMemcpyAsync, but this so-called async API blocks until the kernel has finished. I traced the process with strace and found that 99.999% of the system calls were:
clock_gettime(CLOCK_MONOTONIC_RAW, {2461, 485666623}) = 0
That gave me an idea: the runtime is busy-polling the clock while it waits, so why not write a similar poll function myself, one that checks only once per minute? That way the CPU usage drops.
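kernel<<<dimGrid, dimBlock>>>(d_result_next_idx);
_err = cudaGetLastError();          // catches launch errors only; does not wait for the kernel
if (cudaSuccess == _err) {
    low_cpu_usage_poll(qihao);      // sleep-and-poll instead of blocking in the runtime
}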
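// Needs <unistd.h> for sleep() and <stdio.h> for printf().
void low_cpu_usage_poll(int qihao)
{
    int min = 0;
    bool ready = false;
    while (1) {
        sleep(60);  // seconds
        // cudaStreamQuery does not block; stream 0 is the default stream.
        ready = cudaSuccess == cudaStreamQuery(0);
        printf("low_cpu_usage_poll:%4d min, cudaStreamQuery:%s\n", ++min, ready ? "cudaSuccess" : "cudaErrorNotReady???");
        if (ready) {
            callback(qihao);  // all post-kernel processing happens here
            return;
        }
    }
}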
After launching the kernel, do not call any other cudaXxx function. The kernel launch itself is asynchronous, but the cudaXxx calls that follow it will block until the kernel has finished. Instead, call low_cpu_usage_poll directly after the kernel launch, and place all subsequent processing in the callback that low_cpu_usage_poll invokes.
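One weakness of the loop above is that it treats every result other than cudaSuccess as "not ready", so a real error would keep it sleeping forever, and the one-minute interval is hard-coded. Below is a minimal sketch of the same sleep-and-poll idea as a reusable host-side helper. The names `PollStatus` and `poll_until_ready` are hypothetical and not part of CUDA. The helper is plain C++ so that it can live in a .cu file, and it assumes the caller wraps cudaStreamQuery in a small lambda, as shown in the comment.

```cpp
#include <chrono>
#include <thread>

// Hypothetical tri-state result (not a CUDA type). In a CUDA program it would
// be derived from cudaStreamQuery: cudaSuccess -> Ready,
// cudaErrorNotReady -> NotReady, any other value -> Failed.
enum class PollStatus { NotReady, Ready, Failed };

// Sleeps for `interval` between checks instead of spinning on the CPU.
// Calls `on_ready` once when the query reports Ready.
// Returns the number of checks made, or -1 if the query reported Failed.
template <typename Query, typename Callback>
int poll_until_ready(Query query, std::chrono::milliseconds interval,
                     Callback on_ready)
{
    int checks = 0;
    for (;;) {
        std::this_thread::sleep_for(interval);  // the thread is idle here
        ++checks;
        const PollStatus status = query();
        if (status == PollStatus::Ready) {
            on_ready();
            return checks;
        }
        if (status == PollStatus::Failed) {
            return -1;  // stop polling; the caller should inspect the error
        }
    }
}

// Intended use after a kernel launch on the default stream (assumption:
// `callback` and `qihao` are the ones from the code above):
//
//   poll_until_ready(
//       [] {
//           cudaError_t e = cudaStreamQuery(0);
//           if (e == cudaSuccess)       return PollStatus::Ready;
//           if (e == cudaErrorNotReady) return PollStatus::NotReady;
//           return PollStatus::Failed;
//       },
//       std::chrono::seconds(60),
//       [&] { callback(qihao); });
```

The interval is a trade-off: a long sleep keeps the CPU idle but delays the callback by up to one interval after the kernel finishes, so for kernels that run for seconds rather than minutes a shorter interval is probably the better choice.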
A workaround for CUDA programs that run at 100% CPU