[cfe-dev] Clang build of ATLAS (and speed comparison)
Clint Whaley
whaley at cs.utsa.edu
Thu Sep 8 08:55:39 PDT 2011
Vincent,
Hi there,
>As a preliminary report, here is what spits out the ATLAS built-in benchmark
>(‘make time’) after a compilation with the -Oz flag. Reference stands for an
>installation presumably with GCC4.x on Linux (Clint, could you elaborate on
>this?). Machine is a 3-year old MacBook (late 2008, Core 2 Duo), compiler is
>the version of clang shipped with latest Xcode 4.2:
I would give them the table you built before, where you contrast Clang and
GCC4.5 *on the same machine*. By default, the numbers are compared against
timings run on my own Core2 system, which may differ substantially
from yours (different cache, memory, OS, and compiler).
>Dragonegg with gcc4.5 is used for fortran compilation, but I don’t think it is
>relevant here (again, Clint, could you confirm this ?). Anyway, it is pure
>LLVM output code.
Fortran is only used to compile interface files, and has no effect on
performance.
>What we see is that, while clang seems to outperform GCC on level2 BLAS ops
>(matrix • vector), it is consistently 20 % inferior on level3 ops (lines 2, 3
>and 4).
Most of the L2BLAS use intrinsics or assembly, so the compiler is not as
important for these lines. The lines that rely on the compiler for performance
are kGenMM, kMM_NT, and kMM_TN.
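For a same-machine comparison of compiler-generated matmul code, a rough
sketch follows. It is NOT the ATLAS kGenMM/kMM_TN kernel: the names
naive_mm_tn, time_kernel and mflops are hypothetical, and the loop nest is a
plain untuned C = C + A^T * B on square column-major matrices, so it only
shows what each compiler does with straightforward C.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical reference kernel (not ATLAS code): C = C + A^T * B,
 * all matrices n x n, column-major. The inner loop walks one column
 * of A and one column of B contiguously. */
static void naive_mm_tn(size_t n, const double *A, const double *B, double *C)
{
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < n; i++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += A[k + i * n] * B[k + j * n];
            C[i + j * n] += sum;
        }
    }
}

/* Run the kernel reps times and return the CPU time used, in seconds. */
static double time_kernel(size_t n, int reps,
                          const double *A, const double *B, double *C)
{
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++)
        naive_mm_tn(n, A, B, C);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

/* Convert a timing to MFLOPS, counting 2*n^3 flops per kernel call. */
static double mflops(size_t n, int reps, double seconds)
{
    assert(seconds > 0.0);
    return (2.0 * n * n * n * reps) / (seconds * 1.0e6);
}
```

Build the same file with each compiler at the same optimization level (for
example -O2), run both binaries on the same machine, and compare the MFLOPS
figures; the problem size and repetition count should be large enough that
the measured time is well above the clock() resolution.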
>Please note, and this is also important, that neither at -O0, nor at -O3 does
That is odd indeed: I've never heard of -O0 failing while higher optimization
works . . .
Cheers,
Clint
**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************