<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=http://email.email.llvm.org/c/eJzNVk1v2zgQ_TXyhYghU_LXwYfUaW8tFrtZYG8BJY0tbilSISk79q_fR0lu_KFsGvRSw7ZIzszjDPn4qMwUh9VX4-VOeKm3LBeOouSeld7XDo2If8F3a4rMKD82doveET-bfU4ej8_FH1H8EMX33f9jScxYuZVaKObJ-d44i_tv2y1oIzUxOWFRGu8o98Y-7cg6aXTEF3IR8XsmlNxqBg8-FWHONRsan7SGhPddHvEli-afummyLKTfNllwqL1FXPLAMulRpe8ReyTmDYzrlL2E8eQzTJex_M3Y_w9VpWsjlRHFldv6OuyUZjD1lZ4h2Y8j8WEkoQ9P2IEAJvOqZm37Nb5Pu4Vtp70JfXK5UMJeLMkJYdIj9JNgcWR6DqDUEz2_zo22TG-Qw9zxKcqSb9lyCu5ZNX8YpJclUTjm94a50ljPOoaxnVANOSZ0wWizwZjckTqwvKT8O9xL4fFHByYsIadGqDFbm4K2pBl-ZAX4fMLKYWAbayom_Q-SXaVxRWwcqbc-EU8GTkJfe_fZ1ZXZHV-y4kfIIuyMk2B8v00vVRVfgbZDYaErqqLppziaPsD5SNbcPDqPybse_F2P5E2Pn6mouKxoclvR5PetyDWhnD7zs205b16FBJFk7_o58tQtAU7ApQln43mQf4-ldB1R8XRUCe0ljhYoD3ZD7RVp3x5OD3N3mnPTaE-2FvYnhLuL-T2EG0gf1OubiAuZbq3r3ukDojwU95YED8hgtx6DyvueBA5t176Ueclqa4omh3ZpgoRZJlw1Zn_rDaSx0RgBIfaEa19r41njiP359SRz4DMrCXooQBEPX1Sor98MpC-bbJybCh2ldqfHHeb9FzDoSuegvGhMk3QyC7VlDarxWDMD6VXyO7UpmEYVbAtZZk2NTMLbSJ-Ipa10SB7y7bGWanxe9Tfk7RpkuQ8aLoOWU6i6RrG1EjmUGhA-HIe79kAESceI1M4j7QyUHo-KVVIsk6UYicbj1lhV4uUOl8BR6FFj1erXal6OSsBzypJ0ls5Flm6SnOdJNpst45QvN9mS00iJjJRbQW8izjXtWQuBNtRn9OsZyBWPOY8nfDaJeTKdj6fTjFMRz6ckFtN0vsGRhkhINQ444UVvZFctZNZsHYwKO-BejcI5UJqoTRgZeulV2_lnMUOf_dVkpvaywougMnuyYTfN5uJKdt09K31_C4c7siNBoGGvSf5Qg7thr3Ctj9qiVm1F_wFHoiba>53419</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [X86] Suboptimal lowering of short vectors equality check: could use scalar types instead
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          max-quazan
      </td>
    </tr>
</table>

<pre>
    Motivating case: https://godbolt.org/z/rbE3TzqdP

The original test
```
define i1 @vector_version(i8* align 1 %arg, i8* align 1 %arg1, i32 %arg2) {
bb:
  %ptr1 = bitcast i8* %arg1 to <4 x i8>*
  %ptr2 = bitcast i8* %arg to <4 x i8>*
  %lhs = load <4 x i8>, <4 x i8>* %ptr1, align 1
  %rhs = load <4 x i8>, <4 x i8>* %ptr2, align 1
  %any_ne = icmp ne <4 x i8> %lhs, %rhs
  %any_ne_scalar = bitcast <4 x i1> %any_ne to i4
  %all_eq = icmp eq i4 %any_ne_scalar, 0
  ret i1 %all_eq
}
```
reads two short vectors (<4 x i8>) and effectively checks that they are equal. Codegen emits vector code for it:
```
vector_version:                         # @vector_version
        vpmovzxbd       (%rsi), %xmm0           # xmm0 = mem[0],zero,zero,zero,mem[1],zero,zero,zero,mem[2],zero,zero,zero,mem[3],zero,zero,zero
        vpmovzxbd       (%rdi), %xmm1           # xmm1 = mem[0],zero,zero,zero,mem[1],zero,zero,zero,mem[2],zero,zero,zero,mem[3],zero,zero,zero
        vpsubd  %xmm1, %xmm0, %xmm0
        vptest  %xmm0, %xmm0
        sete    %al
        retq
```
This code is semantically equivalent to its scalar counterpart:
```
define i1 @scalar_version(i8* align 1 %arg, i8* align 1 %arg1, i32 %arg2) {
bb:
  %ptr1 = bitcast i8* %arg1 to i32*
  %ptr2 = bitcast i8* %arg to i32*
  %lhs = load i32, i32* %ptr1, align 1
  %rhs = load i32, i32* %ptr2, align 1
  %all_eq = icmp eq i32 %lhs, %rhs
  ret i1 %all_eq
}
```
which produces neater asm. Unfortunately, we cannot use the RM form of the vector sub here, as stated in https://github.com/llvm/llvm-project/issues/53416, but it looks like we could avoid vector registers altogether.
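
For illustration only, here is a minimal C sketch of why the two forms should agree. It is not taken from the godbolt link; the helper names (vector_style_equal, load_u32, scalar_style_equal) are hypothetical. The first function mirrors the lane-wise compare, the second mirrors the single 32-bit compare:

```c
/* Lane-wise form: mirrors `icmp ne <4 x i8>` followed by the
   "no lane differs" test on the i4 bitmask. */
static int vector_style_equal(const unsigned char *a, const unsigned char *b) {
    unsigned any_ne = 0; /* one bit per lane, like the <4 x i1> to i4 bitcast */
    for (int i = 0; i < 4; ++i)
        any_ne |= (unsigned)(a[i] != b[i]) << i;
    return any_ne == 0;
}

/* Hypothetical helper: assemble four bytes into one 32-bit value.
   Byte-wise access keeps the load valid at any alignment (the IR uses align 1).
   unsigned long is used because it is guaranteed to hold at least 32 bits. */
static unsigned long load_u32(const unsigned char *p) {
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Scalar form: mirrors `icmp eq i32` on the two loaded words. */
static int scalar_style_equal(const unsigned char *a, const unsigned char *b) {
    return load_u32(a) == load_u32(b);
}
```

Two 4-byte sequences are equal exactly when no byte lane differs, so both functions should return the same result for any pair of inputs. That is the property the requested transform would rely on.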

I am not sure where this belongs: codegen or instcombine.
</pre>