<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    On 09/05/2013 04:26 PM, Halfdan Ingvarsson wrote:<br>

    <blockquote cite="mid:5228E905.3030003@sidefx.com" type="cite">


      <div class="moz-cite-prefix">Same applies to exp2f, btw, since

        they have very similar implementations.<br>

        <br>

         - &frac12;<br>

        <br>

        On 13-09-05 03:55 PM, Halfdan Ingvarsson wrote:<br>

      </div>

      <blockquote cite="mid:5228E1A4.9040300@sidefx.com" type="cite">


        <div class="moz-cite-prefix">glibc's expf() function changes the

          FP rounding mode on every call -- which are the fe* calls

          you're seeing -- resulting in dreadful performance (IIRC

          there's a pipeline stall when rounding mode changes).<br>

          <br>

          Have a look at sysdeps/ieee754/flt-32/e_expf.c in the glibc

          sources to verify. This is true as of glibc 2.14, at least.<br>

          <br>

          We had to roll our own to work around it.<br>

          <br>

           - &frac12;<br>

          <br>

          On 13-09-05 03:33 PM, Stephen Canon wrote:<br>

        </div>

        <blockquote

          cite="mid:894741D6-06A5-473D-883F-083548EAED9D@apple.com"

          type="cite">


          <div>On Sep 5, 2013, at 12:20 PM, Eli Friedman &lt;<a

              moz-do-not-send="true"

              href="mailto:eli.friedman@gmail.com">eli.friedman@gmail.com</a>&gt;


            wrote:</div>

          <div>

            <blockquote type="cite">

              <div dir="ltr">On Thu, Sep 5, 2013 at 12:15 PM, Richard

                Hadsell <span dir="ltr">&lt;<a moz-do-not-send="true"

                    href="mailto:hadsell@blueskystudios.com"

                    target="_blank">hadsell@blueskystudios.com</a>&gt;</span>

                wrote:<br>

                <div class="gmail_extra">

                  <div class="gmail_quote">

                    <blockquote class="gmail_quote" style="margin:0px

                      0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">We


                      have been comparing the performance of code

                      generated by Clang++ 3.3 with G++ 4.5.1.  The

                      results have been mixed.<br>

                      <br>

                      We ran a profiler to look for what could cause

                      some cases to run slower with Clang++ and found

                      that some floating-point routines were taking a

                      lot of time:<br>

                      <br>

                      samples  %        image name     symbol name<br>

                      596677   19.7935  studio++       gcopy2<br>

                      274870    9.1182  libm-2.13.so   feholdexcept<br>

                      262358    8.7032  libm-2.13.so   fesetenv<br>

                      258225    8.5661  studio++       cgi...<br>

                      207915    6.8971  libm-2.13.so   fesetround<br>

                      193316    6.4129  studio++       dcopy2<br>

                      126933    4.2107  libm-2.13.so   __ieee754_exp2<br>

                      122614    4.0675  studio++       fcopy2<br>

                      <br>

                      For g++ the top contributors were these:<br>

                      <br>

                      samples  %        image name     symbol name<br>

                      466893   21.3064  studio++       gcopy2<br>

                      300240   13.7013  studio++       cgi...<br>

                      176191    8.0404  studio++       dcopy2<br>

                      132491    6.0462  studio++       cgi...<br>

                      129580    5.9133  libm-2.13.so   __ieee754_pow<br>

                      126938    5.7928  studio++       ecopy2<br>

                      119610    5.4583  studio++       fcopy2<br>

                      <br>

                      The libm floating-point routines 'fe...' only show

                      up with Clang++, so I suspect they account for the

                      slower performance.<br>

                      <br>

                      We are not purposely changing the floating-point

                      precision or rounding mode, so I am looking for a

                      way to avoid code that uses these functions

                      unnecessarily.<br>

                      <br>

                      We are compiling with these options:<br>

                      <br>

                      -march=core2 -msse4.1 -m64 -std=c++0x -fPIC

                      -pthread -gcc-toolchain /opt/gcc-4.7.2

                      -Wno-logical-op-parentheses

                      -Wno-shift-op-parentheses -O2<br>

                    </blockquote>

                    <div><br>

                    </div>

                    <div>There isn't any obvious reason why feholdexcept

                      etc. would be called from clang-compiled code, but

                      not gcc-compiled code; clang never generates calls

                      to them implicitly.</div>

                    <div><br>

                    </div>

                    <div>Can you hop into a debugger and get a stack

                      trace from a call to feholdexcept?</div>

                  </div>

                </div>

              </div>

            </blockquote>

            <br>

          </div>

          <div>

            <div>Usually the reason these symbols show up on Linux is

              that you’re hitting the errno-setting versions of the libm entry

              points (i.e. GCC is likely generating calls to a different

              set of more streamlined libm entry points, while clang is

              hitting the default versions).</div>

            <div><br>

            </div>

            <br>

          </div>

        </blockquote>

      </blockquote>

      <br>

    </blockquote>

    Thanks for all the clues.  Here is the stack trace:<br>

    <pre> feholdexcept,

 __ieee754_exp2,

 exp2,

 _ZN9cgi...

</pre>
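    For reference, here is a rough sketch of the save/set/restore pattern
    that the trace and the earlier replies suggest is happening inside
    exp2.  This is not glibc's code: the name exp2_guarded is made up for
    illustration, and it only relies on the standard C++11 cfenv calls.<br>

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical illustration only, not the glibc implementation.
// Saves the floating-point environment, forces round-to-nearest for the
// computation, then restores whatever the caller had set.
double exp2_guarded(double x) {
    std::fenv_t saved;
    // feholdexcept stores the current environment and clears the exception flags.
    if (std::feholdexcept(&saved) != 0) {
        return std::exp2(x);          // could not save; compute without the guard
    }
    std::fesetround(FE_TONEAREST);    // rounding mode changes on every call
    double result = std::exp2(x);
    std::fesetenv(&saved);            // restores the caller's rounding mode and flags
    return result;
}
```

    A function written this way pays for three fe* calls on top of the
    actual computation, which would be consistent with feholdexcept,
    fesetround and fesetenv all showing up in our Clang++ profile.<br>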

    Based on your various hints, I'm guessing that our code 'pow (2.0,

    x)' is being optimized by Clang++ to 'exp2 (x)', but not by G++.  We

    will try using exp2 explicitly and see what happens with the G++

    version.<br>
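    <br>
    Before swapping one form for the other, something like the following
    could confirm that the two agree numerically over the range we care
    about.  It is only a sketch: the helper names and the sampling scheme
    are made up here and are not taken from our code.<br>

```cpp
#include <cassert>
#include <cmath>

// The form currently in our code; an optimizer may rewrite this to exp2(x).
double pow2_via_pow(double x) { return std::pow(2.0, x); }

// The explicit form we plan to try with both compilers.
double pow2_via_exp2(double x) { return std::exp2(x); }

// Largest relative difference between the two forms over n evenly
// spaced sample points in [lo, hi].
double max_relative_difference(double lo, double hi, int n) {
    assert(n >= 2 && lo < hi);
    double worst = 0.0;
    for (int i = 0; i < n; ++i) {
        double x = lo + (hi - lo) * i / (n - 1);
        double a = pow2_via_pow(x);
        double b = pow2_via_exp2(x);
        double rel = std::fabs(a - b) / std::fabs(b);
        if (rel > worst) worst = rel;
    }
    return worst;
}
```

    Timing the two helpers separately under each compiler should then
    show whether the pow-to-exp2 rewrite is what makes the difference.<br>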

    <br>

    Perhaps we are running into a floating-point standards issue that

    our old version of G++ is ignoring.<br>

    <br>

    We'll continue investigating tomorrow.<br>

  </body>

</html>