<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><span class="vcard"><a class="email" href="mailto:spatel+llvm@rotateright.com" title="Sanjay Patel <spatel+llvm@rotateright.com>"> <span class="fn">Sanjay Patel</span></a>

</span> changed

          <a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED INVALID - Possible Regression: std::max slower than std::fmax"

   href="https://bugs.llvm.org/show_bug.cgi?id=25566">bug 25566</a>

          <br>

             <table border="1" cellspacing="0" cellpadding="8">

          <tr>

            <th>What</th>

            <th>Removed</th>

            <th>Added</th>

          </tr>

         <tr>

           <td style="text-align:right;">Resolution</td>

           <td>---

           </td>

           <td>INVALID

           </td>

         </tr>

         <tr>

           <td style="text-align:right;">Status</td>

           <td>NEW

           </td>

           <td>RESOLVED

           </td>

         </tr></table>

      <p>

        <div>

            <b><a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED INVALID - Possible Regression: std::max slower than std::fmax"

   href="https://bugs.llvm.org/show_bug.cgi?id=25566#c8">Comment # 8</a>

              on <a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED INVALID - Possible Regression: std::max slower than std::fmax"

   href="https://bugs.llvm.org/show_bug.cgi?id=25566">bug 25566</a>

              from <span class="vcard"><a class="email" href="mailto:spatel+llvm@rotateright.com" title="Sanjay Patel <spatel+llvm@rotateright.com>"> <span class="fn">Sanjay Patel</span></a>

</span></b>

        <pre>I'm resolving this bug because I'm not sure the earlier comments are
still meaningful.

For the question of "why is std::fmax slower than std::max?", there is a
functional difference between the two functions that accounts for it: fmax
has to handle NaN operands. If exactly one argument is NaN, fmax must return
the other one, whereas std::max is just a compare-and-select.
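
To make the functional difference concrete, here is a minimal C++ sketch. It is
not code from this bug report, and the helper names max_via_compare and
max_via_fmax are hypothetical. It assumes IEEE-754 doubles and a build without
-ffast-math, since that flag lets the compiler assume NaNs never occur. The
headers use the quoted include form, which falls back to the normal header
search, only to keep the listing free of angle brackets.

```cpp
#include "algorithm"  // std::max
#include "cmath"      // std::fmax, std::nan

// Hypothetical wrappers, for illustration only.

// std::max is a plain compare-and-select: it returns b when a compares
// less than b, and a otherwise. Every ordered comparison with NaN is
// false, so whatever sits in the first position is returned unchanged.
double max_via_compare(double a, double b) {
    return std::max(a, b);
}

// std::fmax treats NaN as missing data: if exactly one operand is NaN,
// the other operand is returned.
double max_via_fmax(double a, double b) {
    return std::fmax(a, b);
}

// With qnan = std::nan(""):
//   max_via_compare(qnan, 1.0)  is NaN
//   max_via_compare(1.0, qnan)  is 1.0
//   max_via_fmax(qnan, 1.0)     is 1.0
//   max_via_fmax(1.0, qnan)     is 1.0
```

The result of std::max depends on which side the NaN is on, while std::fmax
gives the same answer either way. That extra selection on NaN is the work fmax
has to pay for.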

On x86 with SSE, the asm looks like this:

fmax:
        andpd   %xmm2, %xmm3
        maxsd   %xmm1, %xmm2
        andnpd  %xmm2, %xmm0
        orpd    %xmm3, %xmm0

max:
        maxsd   %xmm1, %xmm0

There are a number of potential performance optimizations (unrolling,
vectorization, etc.) that could improve either case, but we should file
new bugs for those if they are important.</pre>

        </div>

      </p>


    </body>

</html>