<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/83402>83402</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [X86] v4f16 -> v4i32 conversion unnecessarily using YMM registers
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:X86,
            missed-optimization
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          RKSimon
      </td>
    </tr>
</table>

<pre>
    https://rust.godbolt.org/z/PEfxzYY4z

```ll
define <4 x i32> @fptosi_4f16_to_4i32(<4 x half> %a) nounwind {
  %cvt = fptosi <4 x half> %a to <4 x i32>
  ret <4 x i32> %cvt
}
```
llc -mcpu=x86-64-v3
```asm
fptosi_4f16_to_4i32:                    # @fptosi_4f16_to_4i32
        vcvtph2ps       %xmm0, %ymm0
        vcvttps2dq      %ymm0, %ymm0
        vzeroupper
        retq
```

Only four elements are converted, so we should only require the xmm variants of both instructions (and avoid the vzeroupper entirely).
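
For reference, the desired codegen might look like the sketch below. It is hand-written, assuming the 128-bit forms of both instructions get selected; it is not actual llc output.

```asm
fptosi_4f16_to_4i32:
        vcvtph2ps       %xmm0, %xmm0    # 4 x f16 (low 64 bits) -> 4 x f32
        vcvttps2dq      %xmm0, %xmm0    # 4 x f32 -> 4 x i32, truncating
        retq
```

With no ymm register written, no vzeroupper is needed before the return.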
</pre>