<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/63091">63091</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [X86] Backend incorrectly combines shuffle and vectorized icmp's of different element widths since 13.0
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Benjins
      </td>
    </tr>
</table>

<pre>
    The following IR is a minimal repro case for the issue ([Godbolt link](https://godbolt.org/z/Y9qMhYz9h)):

```llvm
define dso_local <4 x i32> @do_stuff(<16 x i8> %0, <4 x i32> %1) local_unnamed_addr #0 {
  %3 = icmp sgt <16 x i8> %0, zeroinitializer
  %4 = sext <16 x i1> %3 to <16 x i8>
  %5 = bitcast <16 x i8> %4 to <4 x i32>

  %6 = icmp sgt <4 x i32> %1, zeroinitializer
  %7 = sext <4 x i1> %6 to <4 x i32>
 
  %8 = shufflevector <4 x i32> %5, <4 x i32> %7, <4 x i32> <i32 0, i32 1, i32 2, i32 7>

  ret <4 x i32> %8
}
```

On 12.0.1, this produces the following assembly:
```asm
do_stuff: # @do_stuff
        vpxor   xmm2, xmm2, xmm2
        vpcmpgtb xmm0, xmm0, xmm2
        vpcmpgtd        xmm1, xmm1, xmm2
        vpblendd        xmm0, xmm0, xmm1, 8             # xmm0 = xmm0[0,1,2],xmm1[3]
        ret
```

However, starting from 13.0.0, it produces this instead:
```asm
do_stuff: # @do_stuff
        vpxor   xmm2, xmm2, xmm2
        vpblendd xmm0, xmm0, xmm1, 8             # xmm0 = xmm0[0,1,2],xmm1[3]
        vpcmpgtb        xmm0, xmm0, xmm2
        ret
```
The byte-wise compare and the dword-wise compare have been folded into a single byte-wise compare that runs after the blend. However, the two compares can give different results for the same input bits. In this case, if the last dword in `xmm1` is `0x00000001`, the dword-wise signed greater-than against zero yields `0xFFFFFFFF`, but the byte-wise compare yields `0x000000FF`.
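
To make the mismatch concrete, here is a small scalar sketch of the two compares applied to a single 32-bit lane. The helper names `cmpgt_zero_dword` and `cmpgt_zero_bytes` are made up for illustration; they only model what `vpcmpgtd` and `vpcmpgtb` against zero do to one lane, and they are not taken from LLVM or from the intrinsics headers.

```cpp
#include <cstdint>

// Hypothetical helper: models one 32-bit lane of a signed
// dword-wise "greater than zero" compare (as vpcmpgtd does per lane).
std::uint32_t cmpgt_zero_dword(std::uint32_t lane) {
    return static_cast<std::int32_t>(lane) > 0 ? 0xFFFFFFFFu : 0u;
}

// Hypothetical helper: models the same 32 bits treated as four signed
// bytes, each compared against zero (as vpcmpgtb does per byte).
std::uint32_t cmpgt_zero_bytes(std::uint32_t lane) {
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i) {
        auto byte = static_cast<std::int8_t>((lane >> (8 * i)) & 0xFFu);
        if (byte > 0) {
            result |= 0xFFu << (8 * i);  // set all bits of this byte
        }
    }
    return result;
}
```

For the lane value `0x00000001`, the first helper returns an all-ones dword while the second sets only the low byte, which is the difference between the 12.0.1 and 13.0.0 output for the last lane.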

The issue seems to come down to `canonicalizeShuffleWithBinOps` in X86ISelLowering.cpp. I'm not sure in which situations the element bit-width of the bin ops needs to be taken into account, but it seems to be causing issues in this case.

Original C++ repro, minimized ([Godbolt link](https://godbolt.org/z/YGejEYqqY)):
```cpp
__m128i do_stuff(__m128i I0, __m128i I1) {
        __m128i Zero = {};
        __m128i Cmp01 = _mm_cmpgt_epi8(I0, Zero);
        __m128i Cmp02 = _mm_cmpgt_epi32(I1, Zero);
        return _mm_blend_epi32(Cmp01, Cmp02, 8);
}
```

I have confirmed that this still reproduces on the latest trunk, 684f3c968d6bbf124014128b9f5e4f03a50f28c5.

For triage/priority purposes: this issue was not in code I wrote manually, but was found by a fuzzer meant to test SIMD codegen.


</pre>