When optimizing a program, especially a scientific computing program, collecting timing statistics is routine: further analysis and optimization is only possible once you know how much time each part takes.
But the conventional timing functions are not very precise. Measuring a single execution of a function may give zero, even though the function takes up a lot of time once it is called many times in a loop. You might say that you could strip out everything else and simply time the function over many executions, but the compiler may apply optimizations that directly or indirectly affect the accuracy of the measurement.
On Linux, POSIX provides a timing function with nanosecond (1 ns = 10^-9 s) resolution, so a function is no longer too short to be measured.
#include <stdio.h>
#include <time.h>

/* gcc xx.c -lrt   (-lrt is only needed with glibc older than 2.17) */

struct timespec diff(struct timespec tic, struct timespec toc);
struct timespec accu(struct timespec total, struct timespec cur);

int main()
{
    struct timespec tic, toc, dur;

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &tic);
    /* the code to be measured goes here */
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &toc);
    dur = diff(tic, toc);
    /* tv_sec is a time_t and tv_nsec is a long, so print both as long */
    printf("%ld:%ld\n", (long)dur.tv_sec, dur.tv_nsec);
    return 0;
}

/* Return toc - tic, borrowing one second when the ns part would go negative. */
struct timespec diff(struct timespec tic, struct timespec toc)
{
    struct timespec temp;
    if ((toc.tv_nsec - tic.tv_nsec) < 0) {
        temp.tv_sec = toc.tv_sec - tic.tv_sec - 1;
        temp.tv_nsec = 1000000000 + toc.tv_nsec - tic.tv_nsec;
    } else {
        temp.tv_sec = toc.tv_sec - tic.tv_sec;
        temp.tv_nsec = toc.tv_nsec - tic.tv_nsec;
    }
    return temp;
}

/* Return total + cur, carrying one second when the ns part reaches 1e9. */
struct timespec accu(struct timespec total, struct timespec cur)
{
    struct timespec ret;
    if (total.tv_nsec + cur.tv_nsec >= 1000000000) {
        ret.tv_sec = total.tv_sec + cur.tv_sec + 1;
        ret.tv_nsec = total.tv_nsec + cur.tv_nsec - 1000000000;
    } else {
        ret.tv_sec = total.tv_sec + cur.tv_sec;
        ret.tv_nsec = total.tv_nsec + cur.tv_nsec;
    }
    return ret;
}
With clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &tic), the execution time of a piece of code can be measured very accurately (of course, the code has to be synchronous, that is, blocking, so that its work is finished by the time the second call is made).
The diff function computes the time difference between two measurement points. tv_sec holds the whole seconds and tv_nsec holds the fractional part in ns, so the total time is tv_sec + tv_nsec/1e9 seconds.
FFTW's own timing on Linux, by the way, uses rdtsc, whose cycle count has to be divided by the CPU frequency to get a value in seconds. By comparison, clock_gettime is the easier and quicker option under Linux.
The results of some of my experiments are attached; with this measurement I can see which part of the program takes the most time.
Reference:
http://www.guyrutenberg.com/2007/09/22/profiling-code-using-clock_gettime/
http://man7.org/linux/man-pages/man2/clock_gettime.2.html
https://aufather.wordpress.com/2010/09/08/high-performance-time-measuremen-in-linux/