Basically, it has been my experience that egcs gets much worse performance than gnu gcc on dec alphas. This has held true for me through several releases of gcc & egcs, so you can imagine that I was less than thrilled with the announcement that egcs was taking over gcc . . .
For what it's worth, gcc 2.95.1 maintained by egcs gets less performance than gcc 2.8.1 did, though still not nearly as bad as egcs . . .
My main project is ATLAS, which involves using a code generator to produce very fast linear algebra kernels using ISO/ANSI C (it's got a BSD-ish license, if you care about that sort of thing). On a 533MHz DEC ev56, ATLAS sustains a little over 600Mflop for large, out-of-cache matrix multiplies when compiled with gnu gcc 2.8.1. When the best code is generated for egcs, however, a peak of less than 500Mflop is observed.
My suspicion is that at least part of the problem has to do with fetch scheduling, which egcs seems to have optimized for the PPC at the expense of the alpha. I am WAGing this is part of the problem because I have observed egcs running much faster than gcc on a PPC, and a performance killer for egcs/alpha is to throw the -fschedule-insns -fschedule-insns2 flags, which are big performance wins using gcc. Over the course of several egcs releases, I have tried pretty much every compiler flag I could find, and never gotten gnu-level performance.
So, I am curious as to whether other alpha users have comparison shopped the two compilers for computation intensive codes, and if so, what their experiences have been. Can someone who knows more about these issues tell me what the difference is between egcs & gcc on the alpha? If there is a problem, how does one go about getting the attention of very busy compiler developers to address it? Any help much appreciated; I can be contacted at rwhaley@cs.utk.edu.
So that people can scope the problem out without installing ATLAS, I put together a small matmul benchmark showing the problem. This small benchmark repeatedly performs an L1-cache-contained matmul, and calculates the mflop rating achieved when the same codes are compiled with egcs & gcc. It times three different matmul algorithms, which appear in the results as gemm1x1, gemm4x4, and atlasmm.
Of course, the one *I* care about is the atlasmm, but I found it interesting that even the more commonly seen algorithms perform better with gcc than with egcs.
Here are the results I get on my 533MHz DEC ev56, using egcs-2.91.66 (1.1.2 release) and gcc 2.8.1:
GCC performance: ./xmm_gcc

ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========
  gemm1x1     28    500       0.328       67.02
  gemm4x4     28    500       0.035      627.70
  atlasmm     28    500       0.026      845.25

EGCS performance: ./xmm_egc

ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========
  gemm1x1     28    500       0.344       63.82
  gemm4x4     28    500       0.055      399.19
  atlasmm     28    500       0.033      662.36
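
For anyone wanting to sanity check the MFLOPS column: the numbers look consistent with counting 2*NB^3 flops per matmul (NB multiplies and NB adds for each of the NB^2 output elements), times REPS, divided by the elapsed time. With NB=28 and REPS=500 that is 21.952 Mflop in total, and 21.952 / 0.035 s comes to roughly 627, matching the gemm4x4 gcc row up to rounding in the printed time. The sketch below shows that arithmetic along with a plain three-loop matmul timed the same way. It is only an illustration, not the benchmark's actual source: the names matmul_flops, mflops, naive_mm and time_naive_mm are made up here, row-major storage and clock() timing are assumptions, and the naive kernel is a stand-in for whatever gemm1x1 really does.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Flops for `reps` square matmuls of order nb: each of the nb*nb outputs
 * needs nb multiplies and nb adds, so one matmul is 2*nb^3 flops.
 * (Assumed flop-counting convention; it agrees with the table above.) */
double matmul_flops(int nb, int reps)
{
    return 2.0 * nb * nb * nb * reps;
}

/* Convert a flop count and an elapsed time in seconds to MFLOPS. */
double mflops(double flops, double seconds)
{
    return flops / (seconds * 1.0e6);
}

/* Plain three-loop matmul, C += A*B, on row-major nb x nb matrices.
 * Hypothetical stand-in kernel, not taken from the benchmark. */
void naive_mm(int nb, const double *A, const double *B, double *C)
{
    for (int i = 0; i < nb; i++)
        for (int j = 0; j < nb; j++) {
            double sum = C[i * nb + j];
            for (int k = 0; k < nb; k++)
                sum += A[i * nb + k] * B[k * nb + j];
            C[i * nb + j] = sum;
        }
}

/* Run the kernel `reps` times on the same small operands (so they stay in
 * cache when nb is small) and return the MFLOPS rate. Returns -1.0 if
 * allocation fails or the run is too short for clock() to measure. */
double time_naive_mm(int nb, int reps)
{
    assert(nb > 0 && reps > 0);
    size_t n = (size_t)nb * (size_t)nb;
    double *A = malloc(n * sizeof *A);
    double *B = malloc(n * sizeof *B);
    double *C = malloc(n * sizeof *C);
    double result = -1.0;

    if (A && B && C) {
        for (size_t i = 0; i < n; i++) {
            A[i] = 1.0;
            B[i] = 1.0;
            C[i] = 0.0;
        }
        clock_t start = clock();
        for (int r = 0; r < reps; r++)
            naive_mm(nb, A, B, C);
        double secs = (double)(clock() - start) / CLOCKS_PER_SEC;

        /* Read a result so the compiler cannot discard the timed work. */
        volatile double sink = C[0];
        (void)sink;

        if (secs > 0.0)
            result = mflops(matmul_flops(nb, reps), secs);
    }
    free(A);
    free(B);
    free(C);
    return result;
}
```

Calling time_naive_mm(28, 500) from a small main() and printing the result gives a figure comparable in spirit to the gemm1x1 rows, though on modern hardware the run may be short enough that a larger REPS is needed for clock() to resolve it.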