The mainstream C/C++ compilers currently on the market include Microsoft's cl, gcc, Intel's icl, PGI's pgcc, and CodeGear's bcc (formerly Borland's). cl is the most widely used on Windows, while gcc is the first choice across the widest range of platforms. When it comes to optimization, however, the ranking does not necessarily follow market share.
Today we compare the numerical performance of these compilers. The test code is a numerical integration program taken from the sample programs shipped with the Intel compiler, with one header file changed so that every compiler can build it.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>

// Function to be integrated
// Define and prototype it here
// | sin(x) |
#define INTEG_FUNC(x) fabs(sin(x))

// Prototype timing function (declared in the sample but not used here)
double dclock(void);

int main(void)
{
    // Loop counters and number of interior points
    unsigned int i, j, N;
    // Stepsize, independent variable x, and accumulated sum
    double step, x_i, sum;
    // Timing variables for evaluation
    double start, finish, duration;
    // Start integral from
    double interval_begin = 0.0;
    // Complete integral at
    double interval_end = 2.0 * 3.141592653589793238;

    // Start timing for the entire application
    start = clock();

    printf("\n");
    printf("Number of | Computed Integral |\n");
    printf("Interior Points | |\n");
    for (j = 2; j < 27; j++)
    {
        printf("-------------------------------------\n");
        // Compute the number of (internal rectangles + 1)
        N = 1 << j;
        // Compute stepsize for N-1 internal rectangles
        step = (interval_end - interval_begin) / N;
        // Approx. 1/2 area in first rectangle: f(x0) * [step/2]
        sum = INTEG_FUNC(interval_begin) * step / 2.0;
        // Apply midpoint rule:
        // Given length = f(x), compute the area of the
        // rectangle of width step
        // Sum areas of internal rectangle: f(xi + step) * step
        for (i = 1; i < N; i++)
        {
            x_i = i * step;
            sum += INTEG_FUNC(x_i) * step;
        }
        // Approx. 1/2 area in last rectangle: f(xN) * [step/2]
        sum += INTEG_FUNC(interval_end) * step / 2.0;
        // N is unsigned, so print it with %u
        printf("%10u | %14e |\n", N, sum);
    }
    finish = clock();
    duration = (finish - start);
    printf("\n");
    printf("Application Clocks = %10e\n", duration);
    printf("\n");
    return 0;
}
Since this code comes from Intel, it naturally suits the Intel compiler well. The following tests were run on an Intel Core 2 Duo.
Compilers: gcc 4.3.0 (GCC TDM-2 for MinGW), VC 9.0 (cl 15.00.21022.08), Intel (icl 10.1), PGI (pgcc 7.16), CodeGear (bcc32 6.10). Values are the "Application Clocks" printed by the program; lower is faster.

                          gcc      cl     icl    pgcc   bcc32

Optimization disabled (options as listed: -O0, /Od, -O0, -Od)
                        17161   14461   12441   10514   13400
                        17133   14430   11687    9956   12917
                        17155   14476   11871   10099   13026

Compiled with -O2
                        13011    7737    4540    9348   12636
                        16571    7706    4185    9148   13026
                        16573    7706    4042    9183   13057

Platform-specific optimization (gcc: -march=core2 -O2; cl: /arch:SSE2 /O2; icl: -QxT; pgcc: -tp core2 -O2; bcc32: none)
                        16060    7710    1938    9578       -
The results show that the Intel compiler does very well on numerical code, and its optimization for a specific CPU in particular yields a large gain. gcc is somewhat disappointing. Comparing the unoptimized builds with the -O2 builds, the Intel and Microsoft compilers gain a great deal from optimization, while the other compilers improve only marginally. Ranked by these results: icl > cl > pgcc > bcc > gcc.
In addition, here are the results on Linux, on a 1.5 GHz Pentium 4 server:

                          gcc        icc       pgCC
-O2                  24920000   10840000   22270000
-O0                  28290000   19210000   24320000
Platform-specific    24990000    6640000   22150000

Platform-specific options: gcc -march=pentium4 -O2; icc -xN; pgCC -tp piv -O2.
Again, Intel is the fastest and gcc the slowest.
We also tested on Linux on an Athlon X2 4800+, with the following results:

                          gcc        icc       pgcc
-O0                   9390000   14950000    9950000
-O2                   8910000    9240000    9400000
Platform-specific     8800000    3800000    9030000

Platform-specific options: gcc -march=amdfam10 -O2; icc -msse3 -O2; pgcc -tp k8-32 -O2.
Although icc mainly targets Intel processors, with the right optimization options it also performs far better on an AMD CPU. gcc is back at a normal level here as well. The odd one out is the PGI compiler, for which I have not yet found any options that help.
In conclusion, for numerical computation, the title of "fastest" belongs to Intel.