<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - Significant performance regression with r291800 (&quot;Tune bypassing of slow division for Intel CPUs&quot;)"

   href="https://bugs.llvm.org/show_bug.cgi?id=35226">35226</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>Significant performance regression with r291800 ("Tune bypassing of slow division for Intel CPUs")

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>normal

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Backend: X86

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>atdt@google.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org

          </td>

        </tr></table>

      <p>

        <div>

        <pre>rL291800 (<a href="https://reviews.llvm.org/rL291800">https://reviews.llvm.org/rL291800</a>) took an optimization for lowering

64-bit division to 32-bit and enabled it on all Intel big cores, starting with

Sandy Bridge. This change is associated with a significant regression in an

internal, search-related benchmark, when compiled for -march=haswell.

In the differential revision (<a href="https://reviews.llvm.org/D28196">https://reviews.llvm.org/D28196</a>), the reviewer

pointed out that the fact that the latency/throughput of 64-bit division falls

along a range suggests that this optimization may already be done in hardware.

Do we know whether this is true? Additionally, is it possible

that improvements in the latency and throughput of division and remainder

operations on recent big core Intel CPUs render this optimization obsolete?</pre>
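        <pre>For readers unfamiliar with the optimization in question, the following is a
rough C model of what "bypassing" a slow 64-bit division amounts to. It is only
a sketch: udiv64_bypass is a hypothetical name, not LLVM code, and it assumes
that unsigned int is 32 bits and unsigned long long is 64 bits, as on x86-64
Linux. The actual transformation operates on LLVM IR and may differ in detail.

```c
/* Hypothetical helper, not taken from LLVM: models the bypass idea only.
 * Assumes unsigned int is 32 bits and unsigned long long is 64 bits.
 * As with plain division, the divisor b must be nonzero. */
unsigned long long udiv64_bypass(unsigned long long a, unsigned long long b)
{
    /* If no bit above bit 31 is set in either operand, both fit in 32 bits. */
    if (((a | b) >> 32) == 0) {
        /* Narrow path: a 32-bit unsigned divide gives the same quotient. */
        return (unsigned int)a / (unsigned int)b;
    }
    /* Wide path: fall back to the full 64-bit divide. */
    return a / b;
}
```

The extra test and branch run on every division, so the bypass only helps when
operands usually fit in 32 bits and the narrow divide is meaningfully faster
than the wide one on the target CPU.</pre>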

        </div>

      </p>


    </body>

</html>