<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/98630">98630</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[X86] `@llvm.ceil.f16` is ~6x slower than GCC on Intel Raptor Lake
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
overmighty
</td>
</tr>
</table>
<pre>
https://godbolt.org/z/vc4Y1r6Mq
C++ code:
```cpp
_Float16 foo(_Float16 x) {
return static_cast<_Float16>(__builtin_ceilf(x));
}
```
GCC output with `-O3 -march=raptorlake -fno-omit-frame-pointer` (takes 1.33-1.64 ns on i7-13700H):
```asm
foo(_Float16):
vpxor xmm1, xmm1, xmm1
vpblendw xmm0, xmm1, xmm0, 1
vcvtph2ps xmm0, xmm0
vroundss xmm0, xmm0, xmm0, 10
vinsertps xmm0, xmm0, xmm0, 0xe
vcvtps2ph xmm0, xmm0, 4
ret
```
Clang output with `-O3 -march=raptorlake -fno-omit-frame-pointer` (takes ~9.12 ns on i7-13700H):
```asm
foo(_Float16): # @foo(_Float16)
push rbp
mov rbp, rsp
vpextrw eax, xmm0, 0
vmovd xmm0, eax
vcvtph2ps xmm0, xmm0
vroundss xmm0, xmm0, xmm0, 10
vcvtps2ph xmm0, xmm0, 4
vmovd eax, xmm0
vpinsrw xmm0, xmm0, eax, 0
pop rbp
ret
```
</pre>