<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/155270>155270</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [X86] bf16 to float conversion fails to stay on FPU
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:X86,
            missed-optimization
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          RKSimon
      </td>
    </tr>
</table>

<pre>
    https://godbolt.org/z/qTasEq77v

```ll
define float @src(bfloat %a0) {
  %res = fpext bfloat %a0 to float
  ret float %res
}
```
```asm
src: # @src
  pextrw $0, %xmm0, %eax
  shll $16, %eax
  movd %eax, %xmm0
  retq
```
This could be handled as:
```asm
src: # @src
  pmovzxwd %xmm0, %xmm0 # PUNPCKLWD pre-SSE4.1
  pslld $16, %xmm0
  retq
```
Hopefully we can handle this generically rather than special-casing it in the fpext lowering.
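
For reference, a minimal C sketch of the two approaches, assuming an x86-64 target with SSE2. The function names are hypothetical and only illustrate the bit manipulation; the instructions a compiler actually emits for the intrinsics may differ. A bf16 value is the upper 16 bits of a float, so the extension is a 16-bit left shift of the raw bits, which can be done entirely in the vector domain.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Scalar reference (hypothetical helper): shift the bf16 bits into the
 * upper half of a 32-bit word and reinterpret the result as a float. */
static float bf16_bits_to_float_scalar(uint16_t bits) {
    uint32_t wide = (uint32_t)bits << 16;
    float out;
    memcpy(&out, &wide, sizeof out);
    return out;
}

/* Vector-domain version (hypothetical helper). The initial
 * _mm_cvtsi32_si128 only places the test input in a vector register;
 * in the issue's example the bfloat argument already arrives in xmm0. */
static float bf16_bits_to_float_sse2(uint16_t bits) {
    __m128i v = _mm_cvtsi32_si128(bits);
    v = _mm_unpacklo_epi16(v, v); /* punpcklwd: word 0 fills dword 0 */
    v = _mm_slli_epi32(v, 16);    /* pslld $16: clears the duplicated low word */
    return _mm_cvtss_f32(_mm_castsi128_ps(v));
}
```

Interleaving the register with itself is enough for the pre-SSE4.1 form because the shift discards the low 16 bits of each dword, so no zeroed register is needed.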
</pre>