[cfe-dev] Floating-point performance question
Richard Hadsell
hadsell at blueskystudios.com
Thu Sep 5 12:15:23 PDT 2013
We have been comparing the performance of code generated by Clang++ 3.3 with G++ 4.5.1. The results have been mixed.
We ran a profiler to look for what could cause some cases to run slower with Clang++ and found that some floating-point routines were taking a lot of time:
samples   %        image name     symbol name
596677    19.7935  studio++       gcopy2
274870     9.1182  libm-2.13.so   feholdexcept
262358     8.7032  libm-2.13.so   fesetenv
258225     8.5661  studio++       cgi...
207915     6.8971  libm-2.13.so   fesetround
193316     6.4129  studio++       dcopy2
126933     4.2107  libm-2.13.so   __ieee754_exp2
122614     4.0675  studio++       fcopy2
For g++ the top contributors were these:
samples   %        image name     symbol name
466893    21.3064  studio++       gcopy2
300240    13.7013  studio++       cgi...
176191     8.0404  studio++       dcopy2
132491     6.0462  studio++       cgi...
129580     5.9133  libm-2.13.so   __ieee754_pow
126938     5.7928  studio++       ecopy2
119610     5.4583  studio++       fcopy2
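The libm floating-point environment routines (feholdexcept, fesetenv, fesetround) show up only in the Clang++ profile, where together they account for about 24.7% of the samples, so I suspect they account for the slower performance.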
We are not purposely changing the floating-point precision or rounding mode, so I am looking for a way to avoid code that uses these functions unnecessarily.
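For reference, here is a minimal sketch of the kind of save / set-rounding / restore sequence that would produce calls to these three routines. It is an assumption about how a math routine might use them, not something read out of libm-2.13.so, and the names plain_step, guarded_step, sum_plain, and sum_guarded are hypothetical.

```cpp
#include <cfenv>

// Hypothetical stand-in for a math routine that leaves the
// floating-point environment alone.
double plain_step(double x) {
    return x * 0.5;
}

// The same arithmetic wrapped in the standard <cfenv> calls:
// save the caller's environment, force round-to-nearest, compute,
// then restore the caller's environment.
double guarded_step(double x) {
    std::fenv_t saved;
    std::feholdexcept(&saved);      // save environment and clear exception flags
    std::fesetround(FE_TONEAREST);  // round-to-nearest for the computation
    double r = x * 0.5;
    std::fesetenv(&saved);          // put the caller's environment back
    return r;
}

// Sum the unguarded routine over 0 .. n-1.
double sum_plain(int n) {
    double acc = 0.0;
    for (int i = 0; i < n; ++i) {
        acc += plain_step(static_cast<double>(i));
    }
    return acc;
}

// Sum the guarded routine over 0 .. n-1; each iteration makes
// three extra floating-point environment calls.
double sum_guarded(int n) {
    double acc = 0.0;
    for (int i = 0; i < n; ++i) {
        acc += guarded_step(static_cast<double>(i));
    }
    return acc;
}
```

Both sums give the same result, and the rounding mode seen by the caller is unchanged afterward. If a math routine is wrapped like guarded_step, every call to it pays for three extra libm calls, which would be consistent with feholdexcept, fesetenv, and fesetround each appearing near the top of the Clang++ profile.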
We are compiling with these options:
-march=core2 -msse4.1 -m64 -std=c++0x -fPIC -pthread -gcc-toolchain /opt/gcc-4.7.2 -Wno-logical-op-parentheses -Wno-shift-op-parentheses -O2
--
Dick Hadsell 203-992-6320 Fax: 203-992-6001
Reply-to: hadsell at blueskystudios.com
Blue Sky Studios http://www.blueskystudios.com
1 American Lane, Greenwich, CT 06831-2560