Microsecond-level delays in Windows

Last Update:2018-12-05 Source: Internet

Author: User

Tags intel pentium


1. A microsecond-level delay must not be based on messages (the SetTimer function), because message-queue congestion can affect precision.
In addition, the unit of SetTimer is the millisecond, and the actual response time may be about 55 milliseconds.

2. A microsecond-level delay cannot be based on interrupts either. The fastest clock service in a VxD is the Set_Global_Time_Out function,
which can only guarantee a precision of 1 ms. Other handlers hooked onto the INT 8h interrupt can only guarantee a precision of 55 ms.

3. One might therefore think of a delay based on executing a loop of statements in assembly. However, such assembly code is not universal,
because the delay it produces depends on the CPU frequency.

Therefore, you can use several Windows functions to write universal code: GetTickCount, timeGetTime, and QueryPerformanceCounter.
1) GetTickCount can only guarantee an accuracy of 55 ms.
2) timeGetTime can only guarantee an accuracy of 1 ms.
3) QueryPerformanceCounter does not rely on counting interrupts; it reads a separate hardware clock instead.
It is not supported on systems earlier than Windows 95, but hardly anyone should still be using those.
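The granularity figures above can be checked empirically. The following is a minimal sketch and not part of the original article: it uses the standard std::chrono clocks as portable stand-ins (an assumption; on Windows one could wrap GetTickCount, timeGetTime, or QueryPerformanceCounter behind the same interface), and the name smallest_step_ns is hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Spin until the clock reports a new value and measure the size of that step
// in nanoseconds. Keeping the smallest step over several trials approximates
// the granularity the clock actually delivers on this machine.
// Clock is any type with a std::chrono-style static now() (assumption).
template <class Clock>
std::int64_t smallest_step_ns(int trials = 100) {
    assert(trials > 0);
    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    for (int i = 0; i < trials; ++i) {
        const auto t0 = Clock::now();
        auto t1 = Clock::now();
        while (t1 == t0) t1 = Clock::now();  // busy-wait for the next tick
        const std::int64_t step =
            std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
        if (step < best) best = step;
    }
    return best;
}
```

Calling smallest_step_ns<std::chrono::steady_clock>() returns the smallest observed step of that clock; a timer with 55 ms granularity would report a value near 55,000,000 here.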

The following is the sample code:
// LARGE_INTEGER is similar to a 64-bit integer. It is a union that contains a LONGLONG member (QuadPart)
// and a struct made of two 32-bit halves.
// QueryPerformanceFrequency returns, through its LARGE_INTEGER parameter, the number of ticks per second
// of the computer's high-resolution timer. If it returns FALSE, the hardware does not support such a timer.
// QueryPerformanceCounter returns the current tick count of the timer, similar to GetTickCount.
#include <windows.h>
#include <iostream>
using namespace std;

int main()
{
    int delayTime = 20; // delay in microseconds

    LARGE_INTEGER m_liPerfFreq = {0};

    if (!QueryPerformanceFrequency(&m_liPerfFreq))
    {
        cout << "Your current computer hardware does not support high-precision timers" << endl;
        return 1;
    }

    LARGE_INTEGER m_liPerfStart = {0};
    QueryPerformanceCounter(&m_liPerfStart);

    LARGE_INTEGER liPerfNow = {0};
    for (;;)
    {
        // Spin until the elapsed time, converted to microseconds, reaches the requested delay.
        QueryPerformanceCounter(&liPerfNow);
        double time = ((liPerfNow.QuadPart - m_liPerfStart.QuadPart) * 1000000)
                      / (double)m_liPerfFreq.QuadPart;
        if (time >= delayTime)
            break;
    }
    cout.precision(40);
    cout << "start " << (double)m_liPerfStart.QuadPart << endl;
    cout << "end " << (double)liPerfNow.QuadPart << endl;
    cout << "Time Precision " << (1 / (double)m_liPerfFreq.QuadPart) * 1000000 << " microsecond" << endl;
    cout << "latency " << ((liPerfNow.QuadPart - m_liPerfStart.QuadPart) * 1000000)
                          / (double)m_liPerfFreq.QuadPart << " microsecond" << endl;
    return 0;
}
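The same spin-wait can be written portably. The following is a minimal sketch under the assumption that std::chrono::steady_clock is an acceptable stand-in for QueryPerformanceCounter (Microsoft's standard library typically builds it on that counter, but this is not guaranteed everywhere); the function name busy_wait_us is hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Busy-wait for at least delay_us microseconds and return the time that
// actually elapsed, in whole microseconds. The return value shows how far the
// real delay overshot the requested one.
inline std::int64_t busy_wait_us(std::int64_t delay_us) {
    assert(delay_us >= 0);
    using clock = std::chrono::steady_clock;
    const auto target = std::chrono::microseconds(delay_us);
    const auto start = clock::now();
    auto elapsed = clock::now() - start;
    while (elapsed < target) {
        elapsed = clock::now() - start;  // keep polling, never yield the CPU
    }
    return std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
}
```

For example, busy_wait_us(20) requests the same 20-microsecond delay as the listing above. Because the loop never sleeps, it occupies one CPU core for the whole delay, which is the price of avoiding the scheduler's millisecond granularity.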

Because Windows is a multitasking system, the microsecond delay succeeds only as long as this code is not interrupted by other processes while it runs.
The probability of interruption is very small, especially when the code runs for less than one time slice,
although preemption by a higher-priority thread or a hardware interrupt can never be ruled out completely.

In the SDK, you can use the DWORD timeGetTime(void) function to obtain the system time; the return value is in milliseconds. It can be used to implement a delay function:
void Delay(DWORD delayTime)
{
    DWORD delayTimeBegin;
    DWORD delayTimeEnd;
    delayTimeBegin = timeGetTime();
    do
    {
        delayTimeEnd = timeGetTime();
    } while (delayTimeEnd - delayTimeBegin < delayTime);
}
Note: include the header with #include <mmsystem.h> or #include <windows.h>, and add winmm.lib under Project > Settings > Link > Object/library modules.
You can also add #pragma comment(lib, "winmm.lib") at the top of the file.
#pragma comment(lib, "xxx.lib") is a preprocessor directive that makes VC add the named library (here winmm.lib) to the project when linking.
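One detail of the delay loop above is worth spelling out: the millisecond counter is a 32-bit unsigned value, so it wraps back to zero after 2^32 ms (about 49.7 days). The comparison still works because unsigned subtraction is performed modulo 2^32. The following minimal sketch illustrates this with plain integers; the helper names elapsed_ms and delay_expired are hypothetical and std::uint32_t stands in for DWORD.

```cpp
#include <cassert>
#include <cstdint>

// Milliseconds elapsed between two readings of a 32-bit millisecond counter.
// The subtraction wraps modulo 2^32, so the result is still correct when the
// counter rolled over between begin_ms and end_ms.
inline std::uint32_t elapsed_ms(std::uint32_t begin_ms, std::uint32_t end_ms) {
    return static_cast<std::uint32_t>(end_ms - begin_ms);
}

// True once at least delay_ms milliseconds have passed since begin_ms.
inline bool delay_expired(std::uint32_t begin_ms, std::uint32_t now_ms,
                          std::uint32_t delay_ms) {
    return elapsed_ms(begin_ms, now_ms) >= delay_ms;
}
```

Comparing the two readings directly (for example now_ms >= begin_ms + delay_ms) would not have this property, which is why the loop subtracts first and compares the difference.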

On Windows, there are two commonly used timers. One is the timeGetTime multimedia timer, which provides millisecond-level timing; however, this is still too coarse for many applications. The other is the QueryPerformanceCounter counter, which provides microsecond-level counting, with the exact resolution depending on the system. For programmers working on real-time graphics processing, multimedia stream processing, or real-time systems, using QueryPerformanceCounter/QueryPerformanceFrequency is a basic skill.

This article introduces another high-precision timing method that uses the internal time stamp of the Pentium CPU directly. The following discussion draws mainly on the book Windows Graphics Programming, page 1-page 17; interested readers can refer to the book directly. For more information about the RDTSC instruction, see the Intel product manuals. This article is only meant to start the discussion.
Intel CPUs of the Pentium class and later contain a component called the "Time Stamp", a 64-bit unsigned integer that records the number of clock cycles elapsed since the CPU was powered on. Because current CPU clock speeds are very high, this component can achieve nanosecond-level time precision, which neither of the two methods above can match.

On Pentium and later CPUs, the machine instruction RDTSC (Read Time Stamp Counter) reads the time stamp value and stores it in the EDX:EAX register pair. Because EDX:EAX is also the register pair that holds a 64-bit function return value in C++ on the Win32 platform, we can treat this instruction as an ordinary function call, like this:

inline unsigned __int64 GetCycleCount()
{
    __asm RDTSC
}

But this does not work: because RDTSC is not directly supported by the C++ inline assembler, we need to use the _emit pseudo-instruction to embed the instruction's machine code, 0x0F 0x31, directly, as shown below:

inline unsigned __int64 GetCycleCount()
{
    __asm _emit 0x0F
    __asm _emit 0x31
}
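The _emit workaround reflects the compilers of that time. Later versions of Visual C++, GCC, and Clang expose the instruction as the __rdtsc() intrinsic, so no inline assembly is needed. The following is a minimal sketch, not taken from the original article: the macro name HAVE_RDTSC and the fallback to std::chrono::steady_clock on non-x86 targets are assumptions added so that the code builds everywhere.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
  #include <intrin.h>      // __rdtsc on Visual C++
  #define HAVE_RDTSC 1
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__i386__) || defined(__x86_64__))
  #include <x86intrin.h>   // __rdtsc on GCC and Clang
  #define HAVE_RDTSC 1
#else
  #define HAVE_RDTSC 0     // not an x86 target: no RDTSC instruction
#endif

// Read the time stamp counter through the compiler intrinsic where available.
// On other targets, return steady_clock ticks instead (these are not cycles).
inline std::uint64_t get_cycle_count() {
#if HAVE_RDTSC
    return __rdtsc();
#else
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}
```

Usage is the same as for GetCycleCount: read the counter before and after the code of interest and subtract the first value from the second.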

From then on, whenever a counter is needed, you can call GetCycleCount twice, just like an ordinary Win32 API, and take the difference between the two return values, as in the following code:

unsigned long t;
t = (unsigned long)GetCycleCount();
// Do something time-intensive...
t = (unsigned long)GetCycleCount() - t; // elapsed cycles: end count minus start count

On page 15 of Windows Graphics Programming, a class is written to encapsulate this counter; interested readers can refer to the code of that class. For more precise timing, the book's author makes a small improvement: the time taken to execute the RDTSC instruction itself is measured by calling GetCycleCount twice in a row and saved, so that it can be subtracted to obtain a more accurate figure. Personally, I think this improvement is of little significance. In tests on my machine, the instruction took from a few dozen to about 100 cycles, which is only about a tenth of a microsecond on my Celeron machine. For most applications this time is completely negligible, and for applications that really do need nanosecond accuracy, this compensation is too coarse.
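The compensation idea described above can be sketched as follows. This is an illustration, not the book's class: the names min_back_to_back and compensate are hypothetical, and the counter is passed in as a callable so that any source (RDTSC, QueryPerformanceCounter, or a test stub) can be used.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Estimate the cost of reading a counter: call read() twice in a row and keep
// the smallest difference observed over the given number of trials.
template <class ReadFn>
std::uint64_t min_back_to_back(ReadFn read, int trials) {
    assert(trials > 0);
    std::uint64_t best = std::numeric_limits<std::uint64_t>::max();
    for (int i = 0; i < trials; ++i) {
        const std::uint64_t first = read();
        const std::uint64_t second = read();
        const std::uint64_t diff = second - first;
        if (diff < best) best = diff;
    }
    return best;
}

// Subtract the estimated read overhead from a raw measurement, clamping at
// zero so that a noisy small measurement never underflows.
inline std::uint64_t compensate(std::uint64_t raw, std::uint64_t overhead) {
    return raw > overhead ? raw - overhead : 0;
}
```

Taking the minimum rather than the average is a design choice in this sketch: interference from the rest of the system can only make a back-to-back pair slower, so the smallest value is the closest to the cost of the read itself.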

The advantages of this method are:

1. High precision. Timing can reach nanosecond resolution directly (on a 1 GHz CPU each clock cycle is one nanosecond), which is hard to achieve with other timing methods.

2. Low cost. The timeGetTime function requires linking the multimedia library winmm.lib, and the QueryPerformance* functions, according to MSDN, require hardware support (although I have never seen a machine without it) and the kernel library, so both can only be used on the Windows platform (for precise timing on DOS, refer to the Graphic Program Developer's Guide, which describes in detail how to program the 8253 timer). The RDTSC instruction, however, is a CPU instruction supported by any machine with a Pentium or later on the i386 platform, with no operating-system restriction (I believe the method also applies to UNIX and Linux on i386, but I have not been able to test this), and its call overhead is the smallest.

3. A direct proportional relationship with the CPU clock speed. One count corresponds to 1/(CPU clock frequency in Hz) seconds, so as long as you know the CPU clock frequency, you can calculate the time directly. This differs from QueryPerformanceCounter, which requires a call to QueryPerformanceFrequency to obtain the number of counts per second before a count can be converted to time.
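The proportionality in point 3 is simple enough to write down. The following is a minimal sketch with hypothetical helper names (cycles_to_seconds, cycle_ns); it assumes, as the text does, that the counter advances once per CPU clock cycle and that the clock frequency is known and constant.

```cpp
#include <cassert>
#include <cstdint>

// Convert a cycle count to seconds, given the CPU clock frequency in Hz.
inline double cycles_to_seconds(std::uint64_t cycles, double cpu_hz) {
    assert(cpu_hz > 0.0);
    return static_cast<double>(cycles) / cpu_hz;
}

// Duration of one clock cycle in nanoseconds, for example 1.0 ns at 1 GHz.
inline double cycle_ns(double cpu_hz) {
    assert(cpu_hz > 0.0);
    return 1e9 / cpu_hz;
}
```

For instance, cycles_to_seconds(804586872, 800e6) converts the Timer1 cycle count reported later in this article, using the nominal 800 MHz of the test machine, and gives roughly 1.006 seconds.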

The disadvantages of this method are:

1. Most existing C/C++ compilers do not directly support the RDTSC instruction, so it has to be programmed by embedding the machine code directly, which is cumbersome.

2. High data jitter. In fact, for any measurement method, precision and stability are always in tension. If the low-precision timeGetTime is used for timing, the results are basically the same each time; with RDTSC the results differ every time, often by hundreds or even thousands of counts. This is a contradiction inherent in such a high-precision method.
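One common way to live with this jitter is to repeat the measurement and look at the spread instead of a single value. The following is a minimal sketch of such a summary; the Spread struct and the summarize function are hypothetical names, and the sample values in the usage note are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Smallest, middle, and largest value of a set of repeated measurements.
struct Spread {
    std::uint64_t min;
    std::uint64_t median;  // upper median when the sample count is even
    std::uint64_t max;
};

// Sort a copy of the samples and pick out the three summary values.
inline Spread summarize(std::vector<std::uint64_t> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    return Spread{samples.front(), samples[samples.size() / 2], samples.back()};
}
```

For the illustrative samples {1200, 1000, 5000, 1100, 1050}, the summary is min 1000, median 1100, and max 5000. The minimum or the median is usually a steadier figure to report than any single run, since outliers such as the 5000 here tend to come from interference by the rest of the system.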

We can use the following formula to calculate the maximum length of time this method can measure:

Number of seconds since CPU power-on = number of cycles read by RDTSC / CPU clock frequency (Hz)

The largest number that a 64-bit unsigned integer can express is about 1.8 × 10^19. On my Celeron 800, it can keep time for around ... (the book says that on a ... MHz Pentium it can keep time for ...; I do not know how that number was obtained, but it differs from my calculation). In any case, we do not have to worry about overflow.
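The formula above can be evaluated directly. The following is a minimal sketch with hypothetical function names; the result it produces comes from this arithmetic alone and is not a figure quoted from the original text or from the book. A year is taken as 365.25 days (an assumption).

```cpp
#include <cassert>

// Seconds until a 64-bit cycle counter overflows at the given clock frequency.
inline double seconds_until_wrap(double cpu_hz) {
    assert(cpu_hz > 0.0);
    const double two_pow_64 = 18446744073709551616.0;  // 2^64, about 1.8e19
    return two_pow_64 / cpu_hz;
}

// The same span expressed in years of 365.25 days.
inline double years_until_wrap(double cpu_hz) {
    const double seconds_per_year = 365.25 * 24.0 * 3600.0;
    return seconds_until_wrap(cpu_hz) / seconds_per_year;
}
```

At 800 MHz, seconds_until_wrap(800e6) is about 2.3 × 10^10 seconds, and years_until_wrap(800e6) is a little over 730 years, which supports the conclusion that overflow is not a practical concern.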

The following are a few small examples that briefly compare the usage and accuracy of the three timing methods.

// Timer1.cpp: uses the KTimer class, which is based on the RDTSC instruction
// The KTimer class definition can be found in Windows Graphics Programming, p. 15
// Compile with: cl timer1.cpp /link user32.lib
#include <windows.h> // for Sleep
#include <stdio.h>
#include "KTimer.h"

int main()
{
    unsigned t;
    KTimer timer;
    timer.Start();
    Sleep(1000);
    t = timer.Stop();
    printf("Lasting Time: %u\n", t);
    return 0;
}

// Timer2.cpp: uses the timeGetTime function
// <mmsystem.h> must be included, but the Windows header files are intricate,
// so simply including <windows.h> is the lazy way :)
// Compile with: cl timer2.cpp /link winmm.lib
#include <windows.h>
#include <stdio.h>

int main()
{
    DWORD t1, t2;
    t1 = timeGetTime();
    Sleep(1000);
    t2 = timeGetTime();
    printf("Begin Time: %u\n", t1);
    printf("End Time: %u\n", t2);
    printf("Lasting Time: %u\n", (t2 - t1));
    return 0;
}

// Timer3.cpp: uses the QueryPerformanceCounter function
// Compile with: cl timer3.cpp /link kernel32.lib
#include <windows.h>
#include <stdio.h>

int main()
{
    LARGE_INTEGER t1, t2, tc;
    QueryPerformanceFrequency(&tc);
    // QuadPart is a 64-bit integer, so it needs the %I64d format (Visual C++), not %u
    printf("Frequency: %I64d\n", tc.QuadPart);
    QueryPerformanceCounter(&t1);
    Sleep(1000);
    QueryPerformanceCounter(&t2);
    printf("Begin Time: %I64d\n", t1.QuadPart);
    printf("End Time: %I64d\n", t2.QuadPart);
    printf("Lasting Time: %I64d\n", (t2.QuadPart - t1.QuadPart));
    return 0;
}

////////////////////////////////////////////////
// The three examples above all measure the time taken by a 1-second Sleep.
// Test environment: Celeron 800 MHz / 256 MB SDRAM
// Windows 2000 Professional SP2
// Microsoft Visual C++ 6.0 SP5
////////////////////////////////////////////////

The following is the result of running Timer1, which uses the high-precision RDTSC instruction:
Lasting Time: 804586872

The following is the result of running Timer2, which uses the coarse timeGetTime API:
Begin Time: 20254254
End Time: 20255255
Lasting Time: 1001

The following is the result of running Timer3, which uses the QueryPerformanceCounter API:
Frequency: 3579545
Begin Time: 3804729124
End Time: 3808298836
Lasting Time: 3569712
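The Timer3 figures are raw counts, so they only become comparable with the other two results after dividing by the reported frequency. The following is a minimal sketch of that conversion; the helper name ticks_to_us is hypothetical. It splits the count into whole seconds and a remainder before scaling, so the intermediate product cannot overflow for long intervals (for any frequency below roughly 9 × 10^12 Hz).

```cpp
#include <cassert>
#include <cstdint>

// Convert a tick difference to whole microseconds, given ticks per second.
// Whole seconds and the sub-second remainder are scaled separately so that
// multiplying by 1,000,000 does not overflow a 64-bit integer.
inline std::int64_t ticks_to_us(std::int64_t ticks, std::int64_t freq) {
    assert(freq > 0 && ticks >= 0);
    const std::int64_t whole_seconds = ticks / freq;
    const std::int64_t remainder = ticks % freq;
    return whole_seconds * 1000000 + (remainder * 1000000) / freq;
}
```

With the numbers above, ticks_to_us(3569712, 3579545) yields 997253, that is, the 1-second Sleep lasted about 0.997 seconds according to this counter, against 1001 ms according to timeGetTime.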

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
