Original article: CUDA timing statistics
Author:Handsomefriend
Reprinted: http://blog.csdn.net/jdhanhua/article/details/4843653
<1> Use the functions in cutil.h
unsigned int timer = 0;
// Create a timer
cutCreateTimer(&timer);
// Start timing
cutStartTimer(timer);
{
// Code segment to be timed
............
}
// Stop timing
cutStopTimer(timer);
// Get the elapsed time between start and stop
cutGetTimerValue(timer);
// Delete the timer
cutDeleteTimer(timer);
I do not know how precise this timer is.
<2> The clock() function in time.h
clock_t start, finish;
float costtime;
start = clock();
{
// Code segment to be timed
............
}
finish = clock();
// Compute the time difference between the two readings, in seconds
costtime = (float)(finish - start) / CLOCKS_PER_SEC;
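The timer unit is 1/CLOCKS_PER_SEC seconds. Where CLOCKS_PER_SEC is 1000 (as in the Microsoft C runtime), one tick is 1 millisecond, so the precision is also 1 ms. On POSIX systems CLOCKS_PER_SEC is 1000000, although the actual resolution may be coarser than one tick.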
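The conversion from clock() ticks to a time value can be sketched in plain C as follows. This is only an illustration: ticks_to_ms and time_call_ms are hypothetical helper names that do not appear in the original article. The sketch divides by CLOCKS_PER_SEC rather than assuming a fixed tick length. Keep in mind that the C standard defines clock() as processor time, which is not necessarily wall-clock time.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: convert two clock() readings to milliseconds
   using CLOCKS_PER_SEC instead of assuming a 1 ms tick. */
double ticks_to_ms(clock_t start, clock_t finish)
{
    return 1000.0 * (double)(finish - start) / (double)CLOCKS_PER_SEC;
}

/* Hypothetical helper: time one call of work() and return milliseconds.
   clock() is read on the CPU, so work() must not return before the
   work being measured has actually finished. */
double time_call_ms(void (*work)(void))
{
    assert(work != NULL);
    clock_t start = clock();
    work();
    clock_t finish = clock();
    return ticks_to_ms(start, finish);
}
```

Calling time_call_ms with a function that wraps the code segment to be timed returns the elapsed time in milliseconds. If that function only launches asynchronous GPU work, the result covers the launch alone.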
<3> CUDA events
cudaEvent_t start, stop;
cudaEventCreate(&start);
cudaEventCreate(&stop);
cudaEventRecord(start, 0);
{
// Code segment to be timed
............
}
cudaEventRecord(stop, 0);
// Wait until the stop event has actually been recorded
cudaEventSynchronize(stop);
float costtime;
cudaEventElapsedTime(&costtime, start, stop);
cudaError_t cudaEventCreate(cudaEvent_t *event) --- creates an event object;
cudaError_t cudaEventRecord(cudaEvent_t event, cudaStream_t stream) --- records an event;
cudaError_t cudaEventElapsedTime(float *time, cudaEvent_t start, cudaEvent_t end) --- computes the time difference between two events;
cudaError_t cudaEventDestroy(cudaEvent_t event) --- destroys an event object.
cudaEventElapsedTime computes the time difference between two events in milliseconds, with a resolution of about 0.5 microseconds. If either event has not been recorded, the function returns cudaErrorInvalidValue. If either event was recorded on a non-zero stream, the result is undefined.
The following content is not part of the reprinted article:
This is an example from the CUDA C Best Practices Guide:
cudaEvent_t start, stop;
float time;
cudaEventCreate(&start);
cudaEventCreate(&stop);
cudaEventRecord(start, 0);
kernel<<<grid, threads>>>(d_odata, d_idata, size_x, size_y, NUM_REPS);
cudaEventRecord(stop, 0);
cudaEventSynchronize(stop);
cudaEventElapsedTime(&time, start, stop);
cudaEventDestroy(start);
cudaEventDestroy(stop);
--------------------------------------------------------------------------------
The following describes how to calculate FPS during rendering.
When rendering with OpenGL and GLUT, an essential function is glutDisplayFunc(display), which registers the display callback that controls what is rendered. The timing code must therefore also go in display():
In general, OpenGL rendering works by having the CPU send commands to the GPU, which then performs the rendering. The timing, however, is done on the CPU: once the CPU has passed a command to the GPU, it immediately goes on to the next line of code. A naive measurement therefore counts only the time the CPU takes to issue commands, not the time the GPU takes to render a frame.
The correct approach is to make the CPU wait until the GPU has finished rendering before control returns to the CPU code. glFinish() does exactly this: http://www.opengl.org/sdk/docs/man/xhtml/glFinish.xml
Once display() has returned and no other events are pending, GLUT calls the idle callback registered with glutIdleFunc(idle): http://www.opengl.org/documentation/specs/glut/spec3/node63.html
To measure FPS accurately, record a timestamp at the start of display(). After display() finishes, GLUT calls idle(); place a call to glutPostRedisplay() in idle() so that the program returns to display(). The timer at the start of display() then records the current time again. The difference between the two timestamps is the time taken to render one frame.
When the FPS is very high, a single frame time is too small to measure reliably, so accumulate the time over many frames to obtain an accurate FPS. For example, a single frame may take only 0.000001 ms, but the total time for 100000 frames is large enough to measure.
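The accumulation idea can be sketched in plain C as below. FpsAccumulator, fps_add_frame and fps_average are hypothetical names chosen for this illustration and are not part of GLUT or the CUDA SDK. The sketch assumes the per-frame times are already available in milliseconds from one of the timers discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accumulator: sums frame times and reports the average FPS. */
typedef struct {
    double total_ms;  /* accumulated time of all frames so far, in ms */
    int    frames;    /* number of frames accumulated */
} FpsAccumulator;

/* Add the measured time of one frame (in milliseconds). */
void fps_add_frame(FpsAccumulator *acc, double frame_ms)
{
    assert(acc != NULL && frame_ms >= 0.0);
    acc->total_ms += frame_ms;
    acc->frames += 1;
}

/* Average FPS = frames / total seconds.
   Returns 0 if nothing measurable has been accumulated yet. */
double fps_average(const FpsAccumulator *acc)
{
    assert(acc != NULL);
    if (acc->frames == 0 || acc->total_ms <= 0.0)
        return 0.0;
    return 1000.0 * (double)acc->frames / acc->total_ms;
}
```

For instance, adding 100 frames of 2.0 ms each gives a total of 200 ms, so fps_average reports 500 FPS. Dividing once at the end means the timer resolution affects only the total, not each individual frame.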
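In addition, glutSwapBuffers() implicitly flushes the pipeline before it returns. The GLUT specification describes this as an implicit glFlush() rather than glFinish(), so call glFinish() explicitly if the CPU must wait for the GPU. The CUDA SDK relies on glutSwapBuffers() in this way. Below is the example from the CUDA SDK: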
void computeFPS()
{
    frameCount++;
    fpsCount++;
    if (fpsCount == fpsLimit - 1) {
        g_Verify = true;
    }
    if (fpsCount == fpsLimit) {
        char fps[256];
        // The average timer value is in milliseconds; convert it to frames per second
        float ifps = 1.f / (cutGetAverageTimerValue(timer) / 1000.f);
        sprintf(fps, "%sVolume Render: %3.1f fps",
                ((g_CheckRender && g_CheckRender->IsQAReadback()) ? "AutoTest: " : ""), ifps);
        glutSetWindowTitle(fps);
        fpsCount = 0;
        if (g_CheckRender && !g_CheckRender->IsQAReadback())
            fpsLimit = (int)MAX(ifps, 1.f);
        cutilCheckError(cutResetTimer(timer));
        AutoQATest();
    }
}
void display()
{
    cutilCheckError(cutStopTimer(timer));
    computeFPS();
    cutilCheckError(cutStartTimer(timer));
    render();
    glutSwapBuffers();
}
void idle()
{
    glutPostRedisplay();
}