<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Incorrect Cost Calculations for Shuffle Ports on Icelake-client / Icelake-server"
   href="https://bugs.llvm.org/show_bug.cgi?id=48110">48110</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Incorrect Cost Calculations for Shuffle Ports on Icelake-client / Icelake-server
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>tools
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>llvm-mca
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>dweber@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>andrea.dibiagio@gmail.com, llvm-bugs@lists.llvm.org, matthew.davis@sony.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Hi there,

It looks like llvm-mca models icelake-client and icelake-server as having
only one shuffle port.  This produces incorrect cost calculations for any
operation that uses the shuffle ports.  A block diagram of the
microarchitecture can be found here:
<a href="https://en.wikichip.org/wiki/intel/microarchitectures/sunny_cove">https://en.wikichip.org/wiki/intel/microarchitectures/sunny_cove</a> (Sunny Cove is
the core used by both icelake-server and icelake-client).  The uop counts
and port usage for any given shuffle are listed here: 
<a href="https://uops.info/table.html?search=shuf&cb_lat=on&cb_tp=on&cb_uops=on&cb_ports=on&cb_SKL=on&cb_ICL=on&cb_measurements=on&cb_base=on&cb_avx=on">https://uops.info/table.html?search=shuf&cb_lat=on&cb_tp=on&cb_uops=on&cb_ports=on&cb_SKL=on&cb_ICL=on&cb_measurements=on&cb_base=on&cb_avx=on</a>

If the code generating the costs for llvm-mca is used by anything else in the
toolchain, fixing it will likely yield a performance benefit for other users
targeting icelake-server and icelake-client.</pre>
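        <pre>As a rough illustration of why the port count matters, here is a minimal
sketch of the best-case arithmetic.  The numbers are hypothetical, not
measured data: it assumes a block of 8 independent shuffles, one uop each,
and compares a model with one shuffle port against a model with two.  The
helper name reciprocal_throughput is made up for this example and is not
part of llvm-mca.

```python
from fractions import Fraction


def reciprocal_throughput(num_uops, num_ports):
    """Best-case cycles to execute num_uops independent single-uop
    instructions when each one can run on any of num_ports ports."""
    return Fraction(num_uops, num_ports)


# Hypothetical block: 8 independent shuffles, one uop each.
one_port = reciprocal_throughput(8, 1)   # model with a single shuffle port
two_ports = reciprocal_throughput(8, 2)  # model with two shuffle ports

print(one_port, two_ports)  # prints "8 4"
```

Under these assumptions the single-port model reports twice the cost for
shuffle-bound code, which is the kind of overestimate described above.</pre>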
        </div>
      </p>


    </body>
</html>