Here we compare the code generated by the gcc compiler, the code generated by the icc compiler, and the dco-optimized code generated by the gcc compiler.

To generate executables we used:
We used the C version of the Livermore loops benchmark. The code was modified to eliminate calibration, ensuring that every run executes the same number of iterations on the same input data. This makes it possible to compare the execution times of the program (rather than the estimated MFlops figure reported by the original implementation). Because we are comparing the quality of the generated code and not the quality of the library routines, kernel 22, which tests the performance of a library function, was removed from the study.

All benchmarks were executed under the Fedora Linux operating system on a 2.8GHz Pentium 4 computer with 512MB of RAM. We ensured that all benchmarks ran under the same conditions, with the system load kept as low as possible.

Every benchmark was executed 3 times, and the time reported is the one that was neither the best nor the worst (that is, the median of the three runs).
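Reporting the run that is neither the best nor the worst of three is the same as taking the median. A minimal sketch in Python follows; the function name `reported_time` and the sample timings are hypothetical and are not taken from the study.

```python
def reported_time(run_times):
    """Return the middle value of exactly three measured run times (seconds)."""
    if len(run_times) != 3:
        raise ValueError("expected exactly three run times")
    # After sorting, index 0 is the best run, index 2 is the worst,
    # and index 1 is the run that is neither.
    return sorted(run_times)[1]


# Hypothetical timings for one kernel, in seconds.
print(reported_time([4.97, 5.10, 4.90]))
```

Taking the middle run discards a single outlier in either direction, for example a run disturbed by background activity.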

Results

The following table presents the collected execution data.

The columns under the gcc, gcc+dco and icc headers present the execution times (in seconds) achieved by the gcc-generated code, the dco-optimized code and the icc-generated code, respectively. The gcc+dco/gcc column lists the improvement achieved by applying dco to the gcc-generated code. The icc/gcc column shows how much faster the icc-generated code is than the gcc-generated code (or slower, if the number is negative). The icc/gcc+dco column shows how much faster the icc-generated code is than the dco-optimized code (or slower, if the number is negative). Each of these three values is the time difference expressed as a percentage of the execution time named after the slash. In the original table the best results are highlighted in color; results that differ by between -5% and 5% are considered "the same".
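The percentage columns are consistent with the formula (base time - other time) / base time, where the base is the configuration named after the slash. The sketch below recomputes the three percentages for kernel 1 and applies the -5% to 5% "same" band; the function names `percent_faster` and `classify` are hypothetical helpers introduced only for illustration.

```python
def percent_faster(base_time, other_time):
    """Percentage by which other_time improves on base_time (negative = slower)."""
    return (base_time - other_time) / base_time * 100.0


def classify(percent, tolerance=5.0):
    """Label a percentage, treating -tolerance..tolerance as 'same'."""
    if percent > tolerance:
        return "faster"
    if percent < -tolerance:
        return "slower"
    return "same"


# Kernel 1 in the table: gcc 4.97 s, gcc+dco 3 s, icc 2.96 s.
print(round(percent_faster(4.97, 3.0), 2))   # gcc+dco/gcc column
print(round(percent_faster(4.97, 2.96), 2))  # icc/gcc column
print(round(percent_faster(3.0, 2.96), 2))   # icc/gcc+dco column
print(classify(percent_faster(3.0, 2.96)))   # icc compared with gcc+dco
```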


Kernel#         gcc    gcc+dco   icc    gcc+dco/gcc   icc/gcc   icc/gcc+dco
1               4.97   3         2.96   39.64%        40.44%    1.33%
2               2.38   2.34      2.28   1.68%         4.20%     2.56%
3               5.93   2.33      2.54   60.71%        57.17%    -9.01%
4               4.66   3.79      4.63   18.67%        0.64%     -22.16%
5               5.2    1.75      2.38   66.35%        54.23%    -36.0%
6               4.53   3.55      3.87   21.63%        14.57%    -9.01%
7               4.87   3.12      2.57   35.93%        47.23%    17.63%
8               5      3.87      3.25   22.60%        35.00%    16.02%
9               4.6    3.86      4.97   16.09%        -8.04%    -28.76%
10              4.94   3.38      4.32   31.58%        12.55%    -27.81%
11              5.78   0.93      1.65   83.91%        71.45%    -77.42%
12              5.18   4.42      4.13   14.67%        20.27%    6.56%
13              4.57   4.62      4.61   -1.09%        -0.88%    0.22%
14              4.71   4.12      2.3    12.53%        51.17%    44.17%
15              3.72   3.73      3.67   -0.27%        1.34%     1.61%
16              5.61   5.32      5.66   5.17%         -0.89%    -6.39%
17              5.01   4.98      4.86   0.60%         2.99%     2.41%
18              4.7    3.74      3.45   20.43%        26.6%     7.75%
19              5.81   4.1       6.77   29.43%        -16.52%   -65.12%
20              4.53   4.43      4.38   2.21%         3.31%     1.13%
21              4.88   4.6       1.05   5.74%         78.48%    77.17%
23              4.17   3.85      4.67   7.67%         -11.99%   -21.3%
24              4.85   0.78      1.66   83.92%        65.77%    -112.82%
Geometric Mean  4.74   3.21      3.29   32.33%        30.56%    -2.61%

Conclusions

The icc-generated code is, on average, 3% slower than the dco-optimized code and 31% faster than the gcc-generated code. The dco-optimized code is, on average, 32% faster than the gcc-generated code ( see this for the results of a slightly different study ).
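The averages quoted here come from the Geometric Mean row of the table. As a sketch of how such a row can be computed for a column of execution times, the following Python snippet takes the geometric mean of the gcc column; the helper name `geometric_mean` is introduced for illustration only, and how the percentage entries of that row were derived is not stated in the study.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values, computed via logarithms."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    # Averaging the logarithms avoids overflow from multiplying many values.
    return math.exp(math.fsum(math.log(v) for v in values) / len(values))


# gcc column of the table, kernels 1-21, 23 and 24 (seconds).
gcc_times = [4.97, 2.38, 5.93, 4.66, 5.2, 4.53, 4.87, 5, 4.6, 4.94, 5.78, 5.18,
             4.57, 4.71, 3.72, 5.61, 5.01, 4.7, 5.81, 4.53, 4.88, 4.17, 4.85]
print(round(geometric_mean(gcc_times), 2))
```

Unlike the arithmetic mean, the geometric mean is not dominated by the kernels with the longest running times, which makes it a common choice for summarizing benchmark suites.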

In 6 (out of 23) cases icc generated faster code, in 11 cases the dco-optimized code was faster, and in 6 cases icc and dco produced code with the same performance (within the 5% margin):