<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - [X86] SimplifyDemandedBits fails to remove a zero vector input from unpckh"
   href="https://bugs.llvm.org/show_bug.cgi?id=39549">39549</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[X86] SimplifyDemandedBits fails to remove a zero vector input from unpckh
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: X86
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>craig.topper@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>craig.topper@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, spatel+llvm@rotateright.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>For this IR sequence:

define <8 x i16> @foo(<16 x i8> %x) {
  %a = shufflevector <16 x i8> %x, <16 x i8> undef, <16 x i32> <i32 8, i32 undef, i32 9, i32 undef, i32 10, i32 undef, i32 11, i32 undef, i32 12, i32 undef, i32 13, i32 undef, i32 14, i32 undef, i32 15, i32 undef>
  %b = bitcast <16 x i8> %a to <8 x i16>
  %c = shl <8 x i16> %b, <i16 8, i16 8, i16 8, i16 8, i16 8, i16 8, i16 8, i16 8>
  %d = ashr <8 x i16> %c, <i16 8, i16 8, i16 8, i16 8, i16 8, i16 8, i16 8, i16 8>
  ret <8 x i16> %d
}

We generate this assembly:

        pxor    %xmm1, %xmm1
        punpckhbw       %xmm0, %xmm1    ## xmm1 = xmm1[8],xmm0[8],xmm1[9],xmm0[9],xmm1[10],xmm0[10],xmm1[11],xmm0[11],xmm1[12],xmm0[12],xmm1[13],xmm0[13],xmm1[14],xmm0[14],xmm1[15],xmm0[15]
        psraw   $8, %xmm1


But the pxor is unnecessary. The zeroed bytes occupy the low byte of each 16-bit
lane, and psraw $8 shifts that byte out, so they are never consumed. We could just
use %xmm0 for both inputs of the unpckh.
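
One way to sanity-check this claim is to model the two instructions on plain
byte lists. The Python sketch below is not LLVM code: the helper names
punpckhbw, psraw and pxor_is_redundant are hypothetical, and it assumes
little-endian 16-bit lanes as on x86. It compares the result of feeding a zero
vector versus %xmm0 itself as the first unpckh input.

```python
def punpckhbw(dst, src):
    """Interleave the high 8 bytes: dst[8], src[8], dst[9], src[9], ..."""
    out = []
    for i in range(8, 16):
        out.append(dst[i])
        out.append(src[i])
    return out


def psraw(vec, shift):
    """Arithmetic right shift of each little-endian 16-bit lane."""
    out = []
    for lane in range(8):
        word = int.from_bytes(bytes(vec[2 * lane:2 * lane + 2]), "little", signed=True)
        # Python's >> on a negative int is an arithmetic shift.
        out.extend((word >> shift).to_bytes(2, "little", signed=True))
    return out


def pxor_is_redundant(x):
    """True if a zeroed first input and x itself give the same final result."""
    zero = [0] * 16
    with_zero = psraw(punpckhbw(zero, x), 8)
    with_self = psraw(punpckhbw(x, x), 8)
    return with_zero == with_self
```

Because the first unpckh input only ever supplies the low byte of each lane, and
the shift by 8 discards that byte, the comparison should hold for any 16-byte
input.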

Even D54069, which adds SimplifyDemandedBits support for X86ISD::VPSRAI, doesn't
help with this.</pre>
        </div>
      </p>


    </body>
</html>