[cfe-dev] Floating-point performance question

Thu Sep 5 12:15:23 PDT 2013

We have been comparing the performance of code generated by Clang++ 3.3 with G++ 4.5.1.  The results have been mixed.

We ran a profiler to find out why some cases run slower with Clang++, and found that several libm floating-point environment routines were taking a lot of time:

samples  %        image name     symbol name
596677   19.7935  studio++       gcopy2
274870    9.1182  libm-2.13.so   feholdexcept
262358    8.7032  libm-2.13.so   fesetenv
258225    8.5661  studio++       cgi...
207915    6.8971  libm-2.13.so   fesetround
193316    6.4129  studio++       dcopy2
126933    4.2107  libm-2.13.so   __ieee754_exp2
122614    4.0675  studio++       fcopy2

For G++ the top contributors were:

samples  %        image name     symbol name
466893   21.3064  studio++       gcopy2
300240   13.7013  studio++       cgi...
176191    8.0404  studio++       dcopy2
132491    6.0462  studio++       cgi...
129580    5.9133  libm-2.13.so   __ieee754_pow
126938    5.7928  studio++       ecopy2
119610    5.4583  studio++       fcopy2

The libm floating-point environment routines (feholdexcept, fesetenv, fesetround) only show up with Clang++, where together they account for about 24.7% of the samples, so I suspect they explain the slower performance.

We do not intentionally change the floating-point precision or rounding mode, so I am looking for a way to avoid calling these functions unnecessarily.
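
One detail in the profiles may be relevant: __ieee754_exp2 appears only in the Clang++ profile, and __ieee754_pow only in the G++ one. The sketch below is a way to test that in isolation. It rests on two assumptions that are not verified here: that the optimizer may rewrite pow(2.0, x) as exp2(x), and that this libm's exp2 saves and restores the floating-point environment internally. The function names sum_pow2, sum_exp2 and rounding_mode_preserved are hypothetical and do not come from our code base.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper: sums 2^x over n sample points using pow(2.0, x).
// Assumption: an optimizer may replace this call with exp2(x).
double sum_pow2(int n) {
    assert(n >= 0);
    double total = 0.0;
    for (int i = 0; i < n; ++i)
        total += std::pow(2.0, i * 0.001);
    return total;
}

// Same computation, calling exp2 directly.
double sum_exp2(int n) {
    assert(n >= 0);
    double total = 0.0;
    for (int i = 0; i < n; ++i)
        total += std::exp2(i * 0.001);
    return total;
}

// Returns true if calling f(n) leaves the current rounding mode unchanged,
// i.e. any change made inside libm is undone before the call returns.
bool rounding_mode_preserved(double (*f)(int), int n) {
    const int before = std::fegetround();
    volatile double sink = f(n);  // volatile keeps the call from being dropped
    (void)sink;
    return std::fegetround() == before;
}
```

Building a small driver around these functions with both compilers at -O2 and profiling each one separately should show whether the fe... samples follow exp2. If they do, the calls come from inside libm rather than from code the compiler generated on its own.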

We are compiling with these options:

-march=core2 -msse4.1 -m64 -std=c++0x -fPIC -pthread -gcc-toolchain /opt/gcc-4.7.2 -Wno-logical-op-parentheses -Wno-shift-op-parentheses -O2

-- 
Dick Hadsell			203-992-6320  Fax: 203-992-6001
Reply-to:			hadsell at blueskystudios.com
Blue Sky Studios                http://www.blueskystudios.com
1 American Lane, Greenwich, CT 06831-2560