<div dir="ltr">On Thu, Sep 5, 2013 at 12:15 PM, Richard Hadsell <span dir="ltr">&lt;<a href="mailto:hadsell@blueskystudios.com" target="_blank">hadsell@blueskystudios.com</a>&gt;</span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">We have been comparing the performance of code generated by Clang++ 3.3 with G++ 4.5.1. The results have been mixed.<br>
<br>
We ran a profiler to look for what could cause some cases to run slower with Clang++ and found that some floating-point routines were taking a lot of time:<br>
<br>
samples % image name symbol name<br>
596677 19.7935 studio++ gcopy2<br>
274870 9.1182 libm-2.13.so feholdexcept<br>
262358 8.7032 libm-2.13.so fesetenv<br>
258225 8.5661 studio++ cgi...<br>
207915 6.8971 libm-2.13.so fesetround<br>
193316 6.4129 studio++ dcopy2<br>
126933 4.2107 libm-2.13.so __ieee754_exp2<br>
122614 4.0675 studio++ fcopy2<br>
<br>
For g++ the top contributors were these:<br>
<br>
samples % image name symbol name<br>
466893 21.3064 studio++ gcopy2<br>
300240 13.7013 studio++ cgi...<br>
176191 8.0404 studio++ dcopy2<br>
132491 6.0462 studio++ cgi...<br>
129580 5.9133 libm-2.13.so __ieee754_pow<br>
126938 5.7928 studio++ ecopy2<br>
119610 5.4583 studio++ fcopy2<br>
<br>
The libm floating-point routines 'fe...' only show up with Clang++, so I suspect they account for the slower performance.<br>
<br>
We are not purposely changing the floating-point precision or rounding mode, so I am looking for a way to avoid code that uses these functions unnecessarily.<br>
<br>
We are compiling with these options:<br>
<br>
-march=core2 -msse4.1 -m64 -std=c++0x -fPIC -pthread -gcc-toolchain /opt/gcc-4.7.2 -Wno-logical-op-parentheses -Wno-shift-op-parentheses -O2<br>
</blockquote><div><br></div><div>There is no obvious reason why feholdexcept and the related functions would be called from clang-compiled code but not from gcc-compiled code; clang never generates calls to them implicitly.</div>
<div><br></div><div>Can you hop into a debugger and get a stack trace from a call to feholdexcept?</div></div><br></div><div class="gmail_extra">-Eli</div></div>