<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - [X86] Failure to recognise generic shift from SSE PSLLQ intrinsic"
   href="https://bugs.llvm.org/show_bug.cgi?id=50123">50123</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[X86] Failure to recognise generic shift from SSE PSLLQ intrinsic
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Windows NT
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Scalar Optimizations
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>llvm-dev@redking.me.uk
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>david.bolvansky@gmail.com, lebedev.ri@gmail.com, llvm-bugs@lists.llvm.org, nikita.ppv@gmail.com, spatel+llvm@rotateright.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>We often fail to recognise that the _mm_sll_epi64 shift amount is in bounds,
which would allow us to fold the intrinsic to a generic shift. This affects all
SSE 'shift by uniform variable' intrinsics.

<a href="https://simd.godbolt.org/z/vKT9YE5zM">https://simd.godbolt.org/z/vKT9YE5zM</a>
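
For context, a minimal scalar sketch of why the bound matters (the helper names
psllq_lane, generic_shl_lane and masked_counts_agree are hypothetical and only
model a single 64-bit lane, they are not LLVM or intrinsic APIs): PSLLQ zeroes
the lane for any count above 63, whereas a generic IR shl is poison there, so
the two only coincide once the count is known to be below 64.

```c
#include <stdint.h>

/* Scalar model of one 64-bit lane of PSLLQ (_mm_sll_epi64): the count is
   taken from the low 64 bits of the amount operand, and any count above 63
   zeroes the lane. */
static uint64_t psllq_lane(uint64_t val, uint64_t count) {
    return count > 63 ? 0 : val << count;
}

/* Model of one lane of a generic IR 'shl': only meaningful for count < 64
   (the IR result is poison otherwise), so the caller must prove the bound. */
static uint64_t generic_shl_lane(uint64_t val, uint64_t count) {
    return val << count;
}

/* Returns 1 if both models agree for every count that survives an '& 31'
   mask, which is the situation in the reproducer. */
static int masked_counts_agree(uint64_t val) {
    for (uint64_t amt = 0; amt < 256; ++amt) {
        uint64_t count = amt & 31;
        if (psllq_lane(val, count) != generic_shl_lane(val, count))
            return 0;
    }
    return 1;
}
```

With the mask in place the count never exceeds 31, so the out-of-range branch
of psllq_lane is unreachable and the generic shift is a safe replacement.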

#include <x86intrin.h>

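// Mask every 32-bit lane to [0, 31], then zero-extend lane 0 into the low
// 64 bits that PSLLQ reads as the shift count.
__m128i shl_v2i64_mod31(__m128i val, __m128i amt) {
  amt = _mm_and_si128( amt, _mm_set1_epi32( 31 ) );
  return _mm_sll_epi64( val, _mm_unpacklo_epi32( amt, _mm_setzero_si128() ) );
}

// Mask the low 32-bit lane to [0, 31] and clear the other lanes, so the low
// 64 bits already hold the in-bounds shift count.
__m128i shl_v2i64_mod31_alt(__m128i val, __m128i amt) {
  amt = _mm_and_si128( amt, _mm_setr_epi32( 31, 0, 0, 0 ) );
  return _mm_sll_epi64( val, amt );
}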

define <2 x i64> @shl_v2i64_mod31(<2 x i64> %0, <2 x i64> %1){
  %3 = bitcast <2 x i64> %1 to <4 x i32>
  %4 = and <4 x i32> %3, <i32 31, i32 poison, i32 poison, i32 poison>
  %5 = insertelement <4 x i32> %4, i32 0, i32 1
  %6 = bitcast <4 x i32> %5 to <2 x i64>
  %7 = tail call <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64> %0, <2 x i64> %6)
  ret <2 x i64> %7
}
declare <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64>, <2 x i64>)

define <2 x i64> @shl_v2i64_mod31_alt(<2 x i64> %0, <2 x i64> %1) {
  %3 = and <2 x i64> %1, <i64 31, i64 poison>
  %4 = shufflevector <2 x i64> %3, <2 x i64> poison, <2 x i32> zeroinitializer
  %5 = shl <2 x i64> %0, %4
  ret <2 x i64> %5
}

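In the IR above, shl_v2i64_mod31 still calls llvm.x86.sse2.psll.q, while
shl_v2i64_mod31_alt is folded to a generic shl. I think we're just missing some
bitcast/insertelement vector handling in ValueTracking.</pre>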
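        <pre>As a rough illustration of the reasoning that seems to be missing, here is a
toy known-zero-bits model for the shift amount in shl_v2i64_mod31. This is only
a sketch under stated assumptions: the known_zero type and all helper names are
hypothetical, it assumes the little-endian lane layout used on x86, and it is
not LLVM's KnownBits class or the computeKnownBits implementation.

```c
#include <stdint.h>

/* Toy known-bits record: a set bit in 'zero' means that bit is proven 0.
   Hypothetical type for illustration, not LLVM's KnownBits. */
typedef struct { uint64_t zero; } known_zero;

/* 'and x, C' on an i32 element: every bit clear in C is known zero. */
static known_zero and_const_i32(uint32_t c) {
    known_zero k = { (uint64_t)(uint32_t)~c };
    return k;
}

/* A constant i32 element (e.g. the inserted 0): every bit clear in the
   constant is known zero. */
static known_zero const_i32(uint32_t c) {
    known_zero k = { (uint64_t)(uint32_t)~c };
    return k;
}

/* Little-endian bitcast of two adjacent i32 elements into one i64 lane:
   element 'lo' supplies bits 0..31, element 'hi' supplies bits 32..63. */
static known_zero bitcast_pair_to_i64(known_zero lo, known_zero hi) {
    known_zero k = { (lo.zero & 0xffffffffu) | (hi.zero << 32) };
    return k;
}

/* Largest value the lane can take given its known-zero bits. */
static uint64_t max_value(known_zero k) {
    return ~k.zero;
}

/* Lane 0 of the shift amount in shl_v2i64_mod31:
   element 0 is 'x & 31', element 1 is the inserted constant 0. */
static int mod31_amount_in_bounds(void) {
    known_zero lane = bitcast_pair_to_i64(and_const_i32(31), const_i32(0));
    return max_value(lane) < 64;
}
```

Combining the per-element facts across the insertelement and the bitcast gives
a maximum of 31 for the low 64-bit lane, which is the bound needed to replace
the intrinsic with a generic shift.</pre>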
        </div>
      </p>


    </body>
</html>