Measuring execution time in CUDA

Original article: Cuda statistical time. Author: Handsomefriend

Reprinted: http://blog.csdn.net/jdhanhua/article/details/4843653

<1> Use the functions in cutil.h
unsigned int timer = 0;
// Create a timer
cutCreateTimer(&timer);
// Start timing
cutStartTimer(timer);
{
// Code segment to be timed
............
}
// Stop timing
cutStopTimer(timer);
// Get the elapsed time between start and stop
cutGetTimerValue(timer);
// Delete the timer
cutDeleteTimer(timer);
 

I do not know how precise this timer is.

 

<2> Use the clock function in time.h
clock_t start, finish;
float costtime;
start = clock();
{
// Code segment to be timed
............
}
finish = clock();
// Get the time difference between the two readings
costtime = (float)(finish - start) / CLOCKS_PER_SEC;
The timer unit is 1 ms when CLOCKS_PER_SEC is 1000 (as in Microsoft's C runtime), so the resolution is also 1 ms. Note that costtime as computed above is in seconds.
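Because CLOCKS_PER_SEC is implementation-defined, a conversion that divides by it is safer than assuming a fixed tick length. The following is a minimal sketch in C; the helper names ticks_to_ms and tick_length_ms are hypothetical and not part of time.h.

```c
#include <time.h>

/* Hypothetical helper: convert the difference of two clock() readings
   to milliseconds. CLOCKS_PER_SEC is implementation-defined, so divide
   by it instead of assuming that one tick equals 1 ms. */
double ticks_to_ms(clock_t start, clock_t finish)
{
    return 1000.0 * (double)(finish - start) / (double)CLOCKS_PER_SEC;
}

/* Hypothetical helper: length of one clock() tick in milliseconds
   on the current platform. */
double tick_length_ms(void)
{
    return 1000.0 / (double)CLOCKS_PER_SEC;
}
```

With start and finish taken as in the snippet above, ticks_to_ms(start, finish) gives the elapsed time in milliseconds, and tick_length_ms() shows how coarse the measurement can be at best.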

 

<3> Use CUDA events
cudaEvent_t start, stop;
cudaEventCreate(&start);
cudaEventCreate(&stop);
cudaEventRecord(start, 0);
{
// Code segment to be timed
............
}
cudaEventRecord(stop, 0);
// Wait until the stop event has actually been recorded
cudaEventSynchronize(stop);
float costtime;
cudaEventElapsedTime(&costtime, start, stop);
 
cudaError_t cudaEventCreate(cudaEvent_t *event) --- creates an event object;
cudaError_t cudaEventRecord(cudaEvent_t event, cudaStream_t stream) --- records an event;
cudaError_t cudaEventElapsedTime(float *time, cudaEvent_t start, cudaEvent_t end) --- computes the time difference between two events;
cudaError_t cudaEventDestroy(cudaEvent_t event) --- destroys the event object.
cudaEventElapsedTime computes the time difference between two events (in milliseconds, with a resolution of about 0.5 microseconds). If either event has not been recorded, the function returns cudaErrorInvalidValue. If either event was recorded on a non-zero stream, the result is undefined.

 

The following content is not part of the reprinted article:

This is an example from the CUDA C Best Practices Guide:

cudaEvent_t start, stop;
float time;
cudaEventCreate(&start);
cudaEventCreate(&stop);
cudaEventRecord(start, 0);
kernel<<<grid, threads>>>(d_odata, d_idata, size_x, size_y, NUM_REPS);
cudaEventRecord(stop, 0);
cudaEventSynchronize(stop);
cudaEventElapsedTime(&time, start, stop);
cudaEventDestroy(start);
cudaEventDestroy(stop);

--------------------------------------------------------------------------------

The following describes how to calculate FPS during rendering.

When rendering with OpenGL and GLUT, an essential function is glutDisplayFunc(display), which registers the display callback that controls what is rendered. The timing code must therefore also go inside display().

Generally, in OpenGL rendering the CPU sends commands to the GPU, and the GPU then performs the rendering. The timing, however, is done on the CPU: once the CPU has handed a command to the GPU, it immediately moves on to the next statement. So we only measure the time the CPU takes to issue the commands, not the time the GPU takes to render a frame.

The correct approach is to make the CPU wait until the GPU has finished rendering before the program continues. glFinish() does exactly this: http://www.opengl.org/sdk/docs/man/xhtml/glFinish.xml

Once display() has returned and no other events are pending, GLUT invokes the callback registered with glutIdleFunc(idle): http://www.opengl.org/documentation/specs/glut/spec3/node63.html

To measure FPS accurately, record a timestamp at the start of display(). After display() finishes, call glutPostRedisplay() in idle(), which makes GLUT call display() again. The timer at the start of display() then records the current time again. The difference between the two timestamps is the time taken to render one frame.
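The frame-to-frame measurement can be sketched in plain C as follows. This is only an illustration under stated assumptions: FrameTimer and frame_timer_tick are hypothetical names, and the current time in milliseconds is passed in by the caller from whatever clock the program uses.

```c
/* Hypothetical helper: remembers when the previous frame started. */
typedef struct {
    double last_ms;   /* timestamp of the previous call, in ms */
    int    has_last;  /* 0 until the first timestamp is stored */
} FrameTimer;

/* Call at the start of every display(). now_ms is the current time in
   milliseconds. Returns the time elapsed since the previous call, or
   0.0 on the first call, when there is no previous frame yet. */
double frame_timer_tick(FrameTimer *t, double now_ms)
{
    double delta = t->has_last ? now_ms - t->last_ms : 0.0;
    t->last_ms = now_ms;
    t->has_last = 1;
    return delta;
}
```

A FrameTimer initialized to {0.0, 0} and ticked once per display() call returns the duration of each frame from the second call onward.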

When the FPS is very high, accumulate the time over many frames to obtain an accurate value. For example, a single frame may take only 0.000001 ms, which is too short to measure on its own, but the total time of 100000 frames is large enough to measure.
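The accumulation idea amounts to dividing the number of frames by the total time. A minimal sketch in C, where average_fps is a hypothetical helper and the total time is assumed to be in milliseconds:

```c
/* Hypothetical helper: average FPS over a batch of frames.
   total_ms is the accumulated time of all frames in milliseconds. */
double average_fps(double total_ms, unsigned int frames)
{
    if (frames == 0 || total_ms <= 0.0)
        return 0.0;   /* nothing measurable yet */
    return (double)frames / (total_ms / 1000.0);
}
```

For instance, 100 frames that together took 1000 ms give an average of 100 FPS, even if no single frame could be timed reliably.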

In addition, glutSwapBuffers(), which the CUDA SDK samples use, performs an implicit glFlush() before it returns according to the GLUT specification (a flush, not a glFinish()). Below is the example from the CUDA SDK:

void computeFPS()
{
    frameCount++;
    fpsCount++;
    if (fpsCount == fpsLimit - 1) {
        g_Verify = true;
    }
    if (fpsCount == fpsLimit) {
        char fps[256];
        // Average frame time is in ms; convert to frames per second
        float ifps = 1.f / (cutGetAverageTimerValue(timer) / 1000.f);
        sprintf(fps, "%sVolume Render: %3.1f fps",
                ((g_CheckRender && g_CheckRender->IsQAReadback()) ? "AutoTest: " : ""), ifps);

        glutSetWindowTitle(fps);
        fpsCount = 0;
        if (g_CheckRender && !g_CheckRender->IsQAReadback())
            fpsLimit = (int)MAX(ifps, 1.f);

        cutilCheckError(cutResetTimer(timer));

        AutoQATest();
    }
}

void display()
{
    // Stop the timer started in the previous call, so that it covers one full frame
    cutilCheckError(cutStopTimer(timer));
    computeFPS();
    cutilCheckError(cutStartTimer(timer));
    render();
    glutSwapBuffers();
}

void idle()
{
    glutPostRedisplay();
}

 

 

 

 

 

 
