<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - x86 SSE2&AVX2 regression, unfortunate instruction selection for vectorized i8 and i16 comparisons due to overzealous conversion to unsigned."

   href="https://bugs.llvm.org/show_bug.cgi?id=47448">47448</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>x86 SSE2&AVX2 regression, unfortunate instruction selection for vectorized i8 and i16 comparisons due to overzealous conversion to unsigned.

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>new-bugs

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>All

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>new bugs

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>ToHe_EMA@gmx.de

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>htmldeveloper@gmail.com, llvm-bugs@lists.llvm.org

          </td>

        </tr></table>

      <p>

        <div>

<pre>While writing custom code to handle the remainder of a vectorized loop, I

noticed that current LLVM trunk sometimes turns "signed greater than" SSE2

intrinsics into "unsigned greater than or equal" comparisons, which SSE2 does

not support directly. The problem occurs with both i8 and i16 integer vectors,

and with SSE2 as well as AVX2.

Here is some C code that illustrates the problem:

#include <emmintrin.h>

__m128i BadCompare(short value)

{

    return _mm_cmpgt_epi16(

        _mm_set1_epi16(value & 7),

        _mm_setr_epi16(0, 1, 2, 3, 4, 5, 6, 7));

}

This compiles to the following assembly (<a href="https://gcc.godbolt.org/z/nx5Pn4">https://gcc.godbolt.org/z/nx5Pn4</a>):

.LCPI0_0:

        .short  1

        .short  2

        .short  3

        .short  4

        .short  5

        .short  6

        .short  7

        .short  8

BadCompare(short):

        and     edi, 7

        movd    xmm0, edi

        pshuflw xmm0, xmm0, 0

        pshufd  xmm0, xmm0, 0

        movdqa  xmm1, xmmword ptr [rip + .LCPI0_0]

        psubusw xmm1, xmm0                            # !!!

        pxor    xmm0, xmm0                            # !!!

        pcmpeqw xmm0, xmm1                            # !!!

        ret

The generated code is suboptimal because "pcmpgtw" should be used as requested,

instead of incrementing all constants and using "psubusw" and "pcmpeqw".
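
As a sanity check, the following sketch (not part of the original report)
compares the requested comparison with a hand-written intrinsic version of the
emitted sequence. The function names are made up for illustration, and the
mapping from the assembly to _mm_subs_epu16 and _mm_cmpeq_epi16 is my reading
of the listing above:

```c
#include <assert.h>
#include <emmintrin.h>

/* What the source asks for: a signed 16-bit "greater than" (pcmpgtw). */
static __m128i compare_requested(short value)
{
    return _mm_cmpgt_epi16(
        _mm_set1_epi16(value & 7),
        _mm_setr_epi16(0, 1, 2, 3, 4, 5, 6, 7));
}

/* Assumed equivalent of the emitted code: saturating-subtract the splat
   from the incremented constants (psubusw), then compare the difference
   with zero (pcmpeqw). For a lane with constant c and splat value x:
   sat(c + 1 - x) == 0  <=>  x >= c + 1 (unsigned)  <=>  x > c. */
static __m128i compare_emitted(short value)
{
    __m128i splat = _mm_set1_epi16(value & 7);
    __m128i bumped = _mm_setr_epi16(1, 2, 3, 4, 5, 6, 7, 8);
    __m128i diff = _mm_subs_epu16(bumped, splat);
    return _mm_cmpeq_epi16(_mm_setzero_si128(), diff);
}

/* Asserts that both forms produce identical vectors for every input. */
static void check_sequences_agree(void)
{
    for (int v = -32768; v <= 32767; ++v) {
        __m128i same = _mm_cmpeq_epi8(compare_requested((short)v),
                                      compare_emitted((short)v));
        /* All 16 bytes must match, so all 16 mask bits must be set. */
        assert(_mm_movemask_epi8(same) == 0xFFFF);
    }
}
```

Calling check_sequences_agree() from main should finish without tripping the
assertion, which suggests the emitted sequence is correct and merely longer
than the single compare that was requested.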

It seems to me that the regression is caused by an earlier LLVM pass that now

converts "icmp sgt" into "icmp ugt", which is indeed a valid transformation

because of the "& 7". For i32 and i64 integer vectors, the x86 backend appears

to undo this conversion. However, this step is either broken or missing in the

i8 and i16 cases.</pre>
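<pre>
To illustrate why converting "icmp sgt" into "icmp ugt" is legal for this input,
here is a small scalar sketch. It models a single i16 lane only; the function
names are hypothetical and it does not reflect how LLVM actually performs the
analysis:

```c
#include <assert.h>
#include <stdint.h>

/* One-lane model of "icmp sgt" on i16. */
static int signed_gt(int16_t a, int16_t b)
{
    return a > b;
}

/* One-lane model of "icmp ugt" on i16: same bits, read as unsigned. */
static int unsigned_gt(int16_t a, int16_t b)
{
    return (uint16_t)a > (uint16_t)b;
}

/* With "& 7" the left operand lies in 0..7 and the constants are 0..7,
   so the sign bit is clear on both sides and the predicates agree. */
static void check_masked_operands(void)
{
    for (int v = INT16_MIN; v <= INT16_MAX; ++v) {
        int16_t masked = (int16_t)(v & 7);
        for (int16_t c = 0; c < 8; ++c)
            assert(signed_gt(masked, c) == unsigned_gt(masked, c));
    }
}
```

Without the mask the two predicates can disagree: for a = -1 and b = 0 the
signed compare is false, while the unsigned compare sees 65535 > 0 and is true.
</pre>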

        </div>

      </p>


    </body>

</html>