<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/85419>85419</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Missed optimization: extractps optimizes better than pshufd+movd
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Alcaro
      </td>
    </tr>
</table>

<pre>
```c++
#include <emmintrin.h>

__m128i b;

int square() {
    if (_mm_cvtsi128_si64(b) == 0)
        return 0;
    __m128i c = _mm_shuffle_epi32(b, 238);
    return _mm_cvtsi128_si32(c);
}
```
Compile with -O2, and with -O2 -msse4.1.

Expected result: Either the same output both times, or an SSE4.1 instruction in the -msse4.1 output.

Actual (the -O2 output first, then the -O2 -msse4.1 output):
```
_Z6squarev: # @_Z6squarev
        xor     ecx, ecx
        cmp     qword ptr [rip + b], 0
        pshufd  xmm0, xmmword ptr [rip + b], 238 # xmm0 = mem[2,3,2,3]
        movd    eax, xmm0
        cmove   eax, ecx
        ret
b:
        .zero   16
```
```
_Z6squarev: # @_Z6squarev
        xor     eax, eax
        cmp     qword ptr [rip + b], 0
        je      .LBB0_2
        mov     eax, dword ptr [rip + b+8]
.LBB0_2:
        ret
b:
        .zero   16
```
https://godbolt.org/z/4qrGvo617
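
For reference, a minimal sketch of why `mov eax, dword ptr [rip + b+8]` is a valid lowering of the shuffle: the immediate 238 is 0b11101110, which selects lanes [2,3,2,3], so the low 32 bits of the shuffled vector are lane 2 of `b`, i.e. the dword at byte offset 8. The helper names below are hypothetical and only illustrate the equivalence; they are not part of the testcase.

```c++
#include <emmintrin.h>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not in the testcase): the shuffle-based extraction.
// 238 == 0b11101110, so the result lanes are [2,3,2,3] of the input.
static int lane2_via_shuffle(__m128i v) {
    __m128i c = _mm_shuffle_epi32(v, 238);
    return _mm_cvtsi128_si32(c); // low 32 bits == input lane 2
}

// Hypothetical helper: the same value read as a plain dword at byte offset 8.
static int lane2_via_load(const __m128i *p) {
    std::int32_t out;
    std::memcpy(&out, reinterpret_cast<const char *>(p) + 8, sizeof out);
    return out;
}

// Asserts that both paths produce the same value for the given vector.
static void check_lane2_paths(__m128i v) {
    assert(lane2_via_shuffle(v) == lane2_via_load(&v));
}
```

Both helpers need only SSE2, so the scalar load should be available with plain -O2 as well.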

(testcase reduced from an experiment to check how thoroughly I can disprove assumptions that _mm_load_si128 is atomic)
</pre>