<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/98630>98630</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [X86] `@llvm.ceil.f16` is ~6x slower than GCC on Intel Raptor Lake
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          overmighty
      </td>
    </tr>
</table>

<pre>
https://godbolt.org/z/vc4Y1r6Mq

C++ code:

```cpp
_Float16 foo(_Float16 x) {
    return static_cast<_Float16>(__builtin_ceilf(x));
}
```

GCC output with `-O3 -march=raptorlake -fno-omit-frame-pointer` (takes 1.33-1.64 ns on i7-13700H):

```asm
foo(_Float16):
        vpxor   xmm1, xmm1, xmm1
        vpblendw        xmm0, xmm1, xmm0, 1
        vcvtph2ps xmm0, xmm0
        vroundss        xmm0, xmm0, xmm0, 10
        vinsertps       xmm0, xmm0, xmm0, 0xe
        vcvtps2ph       xmm0, xmm0, 4
        ret
```

Clang output with `-O3 -march=raptorlake -fno-omit-frame-pointer` (takes ~9.12 ns on i7-13700H):

```asm
foo(_Float16):                            # @foo(_Float16)
        push    rbp
        mov     rbp, rsp
        vpextrw eax, xmm0, 0
        vmovd   xmm0, eax
        vcvtph2ps xmm0, xmm0
        vroundss        xmm0, xmm0, xmm0, 10
        vcvtps2ph       xmm0, xmm0, 4
        vmovd   eax, xmm0
        vpinsrw xmm0, xmm0, eax, 0
        pop     rbp
        ret
```
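
The report does not say how the per-call timings were measured. As a rough sketch only (not the reporter's actual harness), a dependent-call loop like the one below could be used to compare the two compilers. The names `half_t` and `ns_per_call` are hypothetical, and the `float` fallback is an assumption added so the sketch still builds where `_Float16` is unavailable.

```cpp
#include <chrono>
#include <cstddef>

// Assumption: fall back to float if the compiler does not provide _Float16.
#if defined(__FLT16_MAX__)
using half_t = _Float16;
#else
using half_t = float;
#endif

// Function under test, as in the report. noinline keeps the call (and its
// half <-> float conversions) from being folded into the timing loop.
__attribute__((noinline)) half_t foo(half_t x) {
    return static_cast<half_t>(__builtin_ceilf(x));
}

// Hypothetical helper: average nanoseconds per call over `iters` calls.
double ns_per_call(std::size_t iters) {
    volatile half_t seed = static_cast<half_t>(1.5f);  // opaque starting value
    half_t x = seed;
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iters; ++i) {
        x = foo(x);  // each call depends on the previous result (latency chain)
    }
    auto stop = std::chrono::steady_clock::now();
    volatile half_t sink = x;  // keep the final result observable
    (void)sink;
    return std::chrono::duration<double, std::nano>(stop - start).count()
           / static_cast<double>(iters);
}
```

Calling `ns_per_call` with a large iteration count from binaries built by each compiler with the same flags gives a comparable per-call figure; absolute numbers will vary with the machine and clock resolution.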
</pre>