High-performance Programming Technology on Linux and Windows 2000


Welcome to this new Linux column, which demonstrates and compares the performance of the Linux and Windows 2000 operating systems. Columnist Ed Bradford compares operating-system-level features rather than applications, so that readers can understand the performance characteristics of each operating system. The article includes source code that represents the "best programming practice" for each platform, in an environment that is as fair as possible.

This series of articles discusses high-performance programming techniques for the Linux and Windows 2000 operating systems. I will demonstrate practical, effective programs that solve the same problem on Linux and on Windows 2000. With the problem solved on both, the performance of each platform can at least be measured. Each test script and program exercises the speed of one operating system feature. My goal is to show how to get the best possible performance from each operating system; along the way, I will compare the performance of the two platforms.

Performance Testing Overview

The tests will measure memory speed, system call speed, input/output throughput, context switching speed, and many other programming facilities common to both platforms. Access to the Windows registry, however, is not measured. The source code is published in this article and can also be downloaded for free. Constructive comments are welcome. My goal is first to show the best programming practice and then to compare performance. You are welcome to share your views on this article in the forum.

We treat Linux as two operating systems: one with the 2.2.16 kernel and one with the 2.4.2 kernel. Windows 2000 means Windows 2000 build 2195; its successor is to be called the Windows XP kernel when it is released. All tests run on identical hardware: a dual-boot IBM ThinkPad 600X with 576 MB of memory and two 12 GB disks.

Although it makes little use of object-oriented features, the code is written in C++. The reason for using C++ is its strong type checking. On Linux, I use the GCC that ships with the Red Hat 7.0 distribution. On Windows 2000, I use Microsoft C++ version 12.00.8168 from Visual Studio 6.0.




Measurement Utility

This first article defines the utilities needed to take and report measurements on Windows 2000 and Linux. There are only a few: an interface for measuring time, a routine that returns a string describing the operating system, a simple wrapper around malloc for allocating memory, and an input routine for parsing large numbers. (The timing routines are discussed in detail below.)

The memory allocator is called Malloc(int). All it does is call the malloc(int) routine; if malloc(int) fails, Malloc() prints an error message and exits. This routine is not used to test memory allocation performance; it simply keeps the calling code compact. We use malloc() because it is available on both Windows 2000 and Linux, so it is fair and equivalent on both systems.
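The wrapper's source is not listed in this article, so the following is only a minimal sketch of what such a checked allocator might look like. The capitalized name Malloc (to distinguish it from the library routine), the wording of the error message, and the decision to treat non-positive sizes as failures are all assumptions.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical sketch of a checked allocator: it returns a usable pointer
// or terminates the program, so callers never have to test for NULL.
void *Malloc(int nbytes)
{
    void *p = NULL;

    // Assumption: a zero or negative request is treated as an error.
    if (nbytes > 0)
        p = std::malloc(static_cast<std::size_t>(nbytes));

    if (p == NULL) {
        std::fprintf(stderr, "Malloc: cannot allocate %d bytes\n", nbytes);
        std::exit(1);
    }
    return p;
}
```

A caller would write something like `char *buf = static_cast<char *>(Malloc(4096));` and release the memory with `free()` as usual.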

The next routine is atoik(char *). It works like atoi() but accepts a suffix of "K" or "M": for "K" the parsed number is multiplied by 1024, and for "M" it is multiplied by 1024*1024. The suffix may be upper or lower case and may follow any number. atoik() returns only 32 bits, so it cannot handle values of 2 GB or more; when that becomes a problem, use the atoik64() function we have written instead. The routine is identical on both operating systems.
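The article does not list the body of atoik(), so here is one plausible sketch of the behavior just described. It uses strtol() rather than atoi() so that the position of the suffix is known; that choice, the const-qualified parameter, and the int return type are assumptions, and no overflow checking is attempted.

```cpp
#include <cctype>
#include <cstdlib>

// Sketch of atoik(): parse a decimal number with an optional K or M suffix.
// "4k" -> 4096, "1M" -> 1048576, "100" -> 100.
// Assumption: results must fit in 32 bits; larger values are not handled.
int atoik(const char *s)
{
    char *end;
    long n = std::strtol(s, &end, 10);   // end points at the first non-digit

    switch (std::tolower(static_cast<unsigned char>(*end))) {
    case 'k':
        n *= 1024;
        break;
    case 'm':
        n *= 1024 * 1024;
        break;
    default:                             // no suffix: plain atoi() behavior
        break;
    }
    return static_cast<int>(n);
}
```

With this sketch, an argument of `1m` means 1,048,576 rather than exactly one million.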




Measuring Time

How do we measure time on these two operating systems? Let's look at the options. Windows 2000 has two APIs that can measure a time interval. The first is GetTickCount(), which reports the number of milliseconds elapsed since the system was started. GetTickCount() has the granularity of the clock tick: the value is updated only when the system receives a clock tick. On Windows the update interval is 10 ms, so its resolution can be no finer than 10 ms, or 10,000 microseconds.

Windows 2000 also has the QueryPerformanceCounter() API, which retrieves the current value of a 64-bit high-resolution performance counter. The duration of each count in the value returned by QueryPerformanceCounter() depends on the value returned by QueryPerformanceFrequency(). The frequency is the number of counts per second, so a counter value can be converted to seconds as follows:

Using QueryPerformanceCounter() to compute seconds

    LARGE_INTEGER tim, freq;
    double seconds;

    QueryPerformanceCounter(&tim);
    QueryPerformanceFrequency(&freq);
    // LARGE_INTEGER is a union; the 64-bit value is in QuadPart.
    seconds = (double)tim.QuadPart / (double)freq.QuadPart;

Because the resolution of GetTickCount() is too low, we will use only QueryPerformanceCounter(). Note that if the interval being timed is about as short as the cost of the QueryPerformanceCounter() call itself, the results may be unreliable. We measure the overhead of the timing routines below.

On Linux, we use the gettimeofday() API. It is the only API that meets our resolution needs; it reports time in seconds and microseconds.

Having chosen the APIs, we define our own interface so that programs can use them without knowing the host operating system. We use the following interface:

Timer routine interface

    void tstart();
    void tend();
    double tval();

When tstart() is called, it records the current time in static memory. When tend() is called, it records the current time in a second static variable. tval() converts the tstart() and tend() values to doubles, subtracts them, and returns the difference in seconds as a double. This interface is easy to implement on both Linux and Windows, and it provides the timing function we need.

The implementations of the timing routines for Linux and Windows 2000 follow. Since some system dependency is unavoidable, the goal is to write the best code with as few conditional definitions as possible. Here is the listing of the timing routines.

Timing routines

    #ifdef _WIN32
    #include <windows.h>

    static LARGE_INTEGER _tstart, _tend;
    static LARGE_INTEGER freq;

    void tstart(void)
    {
        static int first = 1;

        // Query the counter frequency once, on the first call.
        if(first) {
            QueryPerformanceFrequency(&freq);
            first = 0;
        }
        QueryPerformanceCounter(&_tstart);
    }

    void tend(void)
    {
        QueryPerformanceCounter(&_tend);
    }

    // Elapsed seconds between the last tstart() and the last tend().
    double tval()
    {
        return ((double)_tend.QuadPart -
                    (double)_tstart.QuadPart)/((double)freq.QuadPart);
    }
    #else
    #include <sys/time.h>

    static struct timeval _tstart, _tend;
    static struct timezone tz;

    void tstart(void)
    {
        gettimeofday(&_tstart, &tz);
    }

    void tend(void)
    {
        gettimeofday(&_tend,&tz);
    }

    // Elapsed seconds between the last tstart() and the last tend().
    double tval()
    {
        double t1, t2;

        t1 =  (double)_tstart.tv_sec + (double)_tstart.tv_usec/(1000*1000);
        t2 =  (double)_tend.tv_sec + (double)_tend.tv_usec/(1000*1000);
        return t2-t1;
    }
    #endif
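To show how a caller might use this interface, here is a small self-contained sketch. So that it compiles without either platform's headers, it backs tstart(), tend(), and tval() with std::chrono::steady_clock from C++11. That backing is a stand-in chosen for illustration, not the implementation used in this article, and time_busy_loop() is a hypothetical helper.

```cpp
#include <chrono>

// Portable stand-in for the timer interface (an assumption for illustration;
// the article's own routines use QueryPerformanceCounter and gettimeofday).
static std::chrono::steady_clock::time_point t_begin, t_end;

void tstart() { t_begin = std::chrono::steady_clock::now(); }
void tend()   { t_end   = std::chrono::steady_clock::now(); }

// Elapsed seconds between the last tstart() and the last tend().
double tval()
{
    return std::chrono::duration<double>(t_end - t_begin).count();
}

// Hypothetical helper: time `count` iterations of a trivial loop.
// The volatile variable keeps the compiler from removing the loop.
double time_busy_loop(long count)
{
    volatile long sink = 0;

    tstart();
    for (long i = 0; i < count; i++)
        sink = sink + i;
    tend();
    return tval();
}
```

The calling pattern is the same whichever implementation sits behind the interface: call tstart(), run the code under test, call tend(), then read the elapsed seconds from tval().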

The last routine is char *ver(). This simple function returns a string describing the current operating system environment. As the source code shows, its body is completely different on each platform. At the end of the file is a conditionally compiled main() routine for testing. Compile it as follows:

Compiling ver.cpp into a program

    gcc -DMAIN -O2 ver.cpp -o ver

or

    cl -DMAIN -O2 ver.cpp -o ver.exe

The ver program (ver.exe on Windows) writes the operating system version information to the output file.

ver.cpp: print the operating system version

    #ifdef _WIN32
    #include <windows.h>
    #else
    #include <sys/utsname.h>
    #endif
    #include <stdio.h>
    #include <string.h>

    int ver_underbars = 0;

    char *ver()
    {
        char *q;
    #ifdef _WIN32
        static char verbuf[256];
    #else
        static char verbuf[4*SYS_NMLN + 4];
    #endif

    #ifdef _WIN32
        OSVERSIONINFO VersionInfo;

        VersionInfo.dwOSVersionInfoSize = sizeof(VersionInfo);
        if(GetVersionEx(&VersionInfo)) {
            if(strlen(VersionInfo.szCSDVersion) > 200)
                VersionInfo.szCSDVersion[100] = 0;
            // The version fields are DWORDs (unsigned long), hence %lu.
            sprintf(verbuf, "Windows %lu.%lu build%lu PlatformId %lu SP=\"%s\"",
                VersionInfo.dwMajorVersion,
                VersionInfo.dwMinorVersion,
                VersionInfo.dwBuildNumber,
                VersionInfo.dwPlatformId,
                VersionInfo.szCSDVersion);
        }
        else {
            strcpy(verbuf, "WINDOWS UNKNOWN");
        }
    #else
        struct utsname ubuf;

        if(uname(&ubuf)) {
            strcpy(verbuf, "LINUX UNKNOWN");
        }
        else {
            sprintf(verbuf,"%s %s %s %s",
                ubuf.sysname,
                ubuf.release,
                ubuf.version,
                ubuf.machine);
        }
    #endif
        // Substitute an underbar for white space. Makes output
        // easier to parse.
        if(ver_underbars) {
            for(q = verbuf; *q; q++)
                if(*q == ' '  || *q == '\t' || *q == '\n' ||
                   *q == '\r' || *q == '\b' || *q == '\f')
                    *q = '_';
        }
        return verbuf;
    }

    // gcc -DMAIN ver.cpp -o ver -- produces a simple test program.
    #ifdef MAIN
    int main(int ac, char *av[])
    {
        // Any command-line argument turns on underbar substitution.
        if(ac > 1) ver_underbars = 1;
        printf("%s\n", ver());
        return 0;
    }
    #endif

The timer functions defined above meet our needs, but before using them we should know how long they take to execute. In fact, we only need to know how long tstart() and tend() take, and because the two functions have the same form, it is enough to time one of them. On both Windows 2000 and Linux, the time-timers.cpp program times the timing functions. Only the main() routine is listed here; the actual program also includes all of the timer functions and the atoik() source described earlier.

time-timers.cpp: timing the timer routines

    char *applname;

    // SLASHC is not defined in this excerpt; it is presumably the
    // platform's path separator character.
    int main(int ac, char *av[])
    {
        long count = 100000;
        long i;
        double t;
        char *v = ver();
        char *q;

        // Strip any leading directory from the program name.
        applname = av[0];
        if(strrchr(applname,SLASHC))
            applname = strrchr(applname,SLASHC) + 1;

        // Optional first argument: the number of calls to make.
        if(ac > 1) {
            count = atoik(av[1]);
            ac--;
            av++;
            if(count < 0)
                count = 100000;
        }

        tstart();
        for(i = 0; i < count; i++)
            tend();
        tend();
        t = tval();

        printf("%s: ",applname);
        printf("%ld calls to tend() = %8.3f seconds %8.3f usec/call\n",
            count,
            t,
            (t/( (double) count ))*1E6);
        return 0;
    }

Everything is ready. Compile the time-timers.cpp program as follows:

Compiling time-timers.cpp

On Linux:

    gcc -O2 time-timers.cpp -o time-timers

On Windows 2000:

    cl -O2 time-timers.cpp -o time-timers.exe

The program takes one optional argument, the number of calls. By default, it calls tend() 100,000 times. Run the program repeatedly to confirm that the timings are reproducible. The script below runs the program in batches of 10, passing a count of 1m. The results for Linux 2.2.16, Linux 2.4.2, and Windows 2000 appear in the table below.

I ran the following script on the same ThinkPad under Linux 2.2.16, Linux 2.4.2, and Windows 2000. As it happens, my first Linux 2.4.2 kernel was the symmetric multiprocessing (SMP) version, which I built and tested by accident. I then built a uniprocessor version and tested that as well. The results for both versions are summarized below, in case you are interested.

Script for running time-timers

    ver > time-timers.out
    for i in 1 2 3 4 5 6 7 8 9 10
    do
        time-timers 1m
    done >> time-timers.out
    for i in 1 2 3 4 5 6 7 8 9 10
    do
        time-timers 1m
    done >> time-timers.out

This script runs time-timers 20 times, and each run makes 1m (1024*1024) calls to tend(). The results, in microseconds per call, are as follows:

Linux 2.2.16    Linux 2.4.2    Linux 2.4.2 SMP    Windows 2000
0.740 usec      0.729 usec     0.806 usec         1.945 usec

The only conclusion I can draw is that the QueryPerformanceCounter() system call on Windows 2000 is much slower than the gettimeofday() API on the same hardware. For our purposes, timing routines with a granularity of 2 microseconds are good enough: in a measurement lasting 1 ms, only 2 parts per thousand is measurement overhead, and 0.2% is acceptable.




