<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    On 09/05/2013 04:26 PM, Halfdan Ingvarsson wrote:<br>
    <blockquote cite="mid:5228E905.3030003@sidefx.com" type="cite">
      <div class="moz-cite-prefix">Same applies to exp2f, btw, since
        they have fairly similar implementations.<br>
        <br>
         - &frac12;<br>
        <br>
        On 13-09-05 03:55 PM, Halfdan Ingvarsson wrote:<br>
      </div>
      <blockquote cite="mid:5228E1A4.9040300@sidefx.com" type="cite">
        <div class="moz-cite-prefix">glibc's expf() function changes the
          FP rounding mode on every call -- those are the fe* calls
          you're seeing -- resulting in dreadful performance (IIRC
          there's a pipeline stall when the rounding mode changes).<br>
          <br>
          Have a look at sysdeps/ieee754/flt-32/e_expf.c in the glibc
          sources to verify. This is true as of glibc 2.14, at least.<br>
          <br>
          We had to roll our own to work around it.<br>
          <br>
           - &frac12;<br>
          <br>
          On 13-09-05 03:33 PM, Stephen Canon wrote:<br>
        </div>
        <blockquote
          cite="mid:894741D6-06A5-473D-883F-083548EAED9D@apple.com"
          type="cite">
          <div>On Sep 5, 2013, at 12:20 PM, Eli Friedman &lt;<a
              moz-do-not-send="true"
              href="mailto:eli.friedman@gmail.com">eli.friedman@gmail.com</a>&gt;
            wrote:</div>
          <div><br class="Apple-interchange-newline">
            <blockquote type="cite">
              <div dir="ltr">On Thu, Sep 5, 2013 at 12:15 PM, Richard
                Hadsell <span dir="ltr">&lt;<a moz-do-not-send="true"
                    href="mailto:hadsell@blueskystudios.com"
                    target="_blank">hadsell@blueskystudios.com</a>&gt;</span>
                wrote:<br>
                <div class="gmail_extra">
                  <div class="gmail_quote">
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">We


                      have been comparing the performance of code
                      generated by Clang++ 3.3 with G++ 4.5.1.  The
                      results have been mixed.<br>
                      <br>
                      We ran a profiler to look for what could cause
                      some cases to run slower with Clang++ and found
                      that some floating-point routines were taking a
                      lot of time:<br>
                      <br>
                      samples  %        image name     symbol name<br>
                      596677   19.7935  studio++       gcopy2<br>
                      274870    9.1182  libm-2.13.so   feholdexcept<br>
                      262358    8.7032  libm-2.13.so   fesetenv<br>
                      258225    8.5661  studio++       cgi...<br>
                      207915    6.8971  libm-2.13.so   fesetround<br>
                      193316    6.4129  studio++       dcopy2<br>
                      126933    4.2107  libm-2.13.so   __ieee754_exp2<br>
                      122614    4.0675  studio++       fcopy2<br>
                      <br>
                      For g++ the top contributors were these:<br>
                      <br>
                      samples  %        image name     symbol name<br>
                      466893   21.3064  studio++       gcopy2<br>
                      300240   13.7013  studio++       cgi...<br>
                      176191    8.0404  studio++       dcopy2<br>
                      132491    6.0462  studio++       cgi...<br>
                      129580    5.9133  libm-2.13.so   __ieee754_pow<br>
                      126938    5.7928  studio++       ecopy2<br>
                      119610    5.4583  studio++       fcopy2<br>
                      <br>
                      The libm floating-point routines 'fe...' only show
                      up with Clang++, so I suspect they account for the
                      slower performance.<br>
                      <br>
                      We are not purposely changing the floating-point
                      precision or rounding mode, so I am looking for a
                      way to avoid code that uses these functions
                      unnecessarily.<br>
                      <br>
                      We are compiling with these options:<br>
                      <br>
                      -march=core2 -msse4.1 -m64 -std=c++0x -fPIC
                      -pthread -gcc-toolchain /opt/gcc-4.7.2
                      -Wno-logical-op-parentheses
                      -Wno-shift-op-parentheses -O2<br>
                    </blockquote>
                    <div><br>
                    </div>
                    <div>There isn't any obvious reason why feholdexcept
                      etc. would be called from clang-compiled code, but
                      not gcc-compiled code; clang never generates calls
                      to it implicitly.</div>
                    <div><br>
                    </div>
                    <div>Can you hop into a debugger and get a stack
                      trace from a call to feholdexcept?</div>
                  </div>
                </div>
              </div>
            </blockquote>
            <br>
          </div>
          <div>
            <div>Usually the reason these symbols show up on Linux is
              that you’re hitting the errno-setting versions of the libm
              entry points (i.e. GCC is likely generating calls to a
              different set of more streamlined libm entry points, while
              clang is hitting the default versions).</div>
            <div><br>
            </div>
            <br>
          </div>
        </blockquote>
      </blockquote>
      <br>
    </blockquote>
    Thanks for all the clues.  Here is the stack trace:<br>
    <pre> feholdexcept,
 __ieee754_exp2,
 exp2,
 _ZN9cgi...
</pre>
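    For anyone following along, the save/switch/restore pattern that the
    replies above attribute to glibc would look roughly like the sketch
    below. This is only an illustration, not glibc's source: the helper
    name exp2_with_env_switch is made up, and the only real pieces are
    the standard &lt;cfenv&gt; calls that also appear in the profile.<br>

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper (not glibc code): evaluate exp2(x) with the rounding
// mode forced to round-to-nearest, then restore the caller's FP environment.
double exp2_with_env_switch(double x) {
    std::fenv_t saved;
    std::feholdexcept(&saved);      // save the environment, clear exception flags
    std::fesetround(FE_TONEAREST);  // force round-to-nearest for the computation
    double result = std::exp2(x);
    std::fesetenv(&saved);          // restore the caller's environment
    return result;
}
```

    If every call really pays for three environment changes like this,
    that would be consistent with feholdexcept, fesetround and fesetenv
    each showing up next to __ieee754_exp2 in the profile.<br>
    <br>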
    Based on your various hints, I'm guessing that Clang++ is optimizing
    our code 'pow (2.0, x)' to 'exp2 (x)', while G++ is not.  We will try
    calling exp2 explicitly and see what happens with the G++
    version.<br>
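    <br>
    The experiment we have in mind is roughly the following. This is a
    minimal sketch, not our production code: the two function names are
    invented for illustration, and it assumes std::exp2 is available
    from &lt;cmath&gt; under -std=c++0x.<br>

```cpp
#include <cmath>

// Hypothetical stand-in for what our code does today: a general pow() call
// with a constant base of 2.0.
double pow2_via_pow(double x) {
    return std::pow(2.0, x);
}

// Hypothetical replacement: call exp2() directly, so both compilers end up
// in the same libm routine regardless of how they treat pow(2.0, x).
double pow2_via_exp2(double x) {
    return std::exp2(x);
}
```

    Building both variants with each compiler and profiling them
    separately should tell us whether the fe... calls follow exp2 or the
    compiler.<br>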
    <br>
    Perhaps we are running into a floating-point standards issue that
    our old version of G++ is ignoring.<br>
    <br>
    We'll continue investigating tomorrow.<br>
  </body>
</html>