<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/60802>60802</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Improve the assembly sequence for std::bit_ceil
</td>
</tr>
<tr>
<th>Labels</th>
<td>
backend:X86,
missed-optimization
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
kazutakahirata
</td>
</tr>
</table>
<pre>
Compile:
```
// clang -std=c++20 -march=skylake -O2
#include <bit>
unsigned bit_ceil(unsigned X) {
  return std::bit_ceil(X);
}
```
I get:
```
%sub.i.i = add i32 %X, -1
%0 = tail call i32 @llvm.ctlz.i32(i32 %sub.i.i, i1 false), !range !5
%sub2.i.i = sub nuw nsw i32 32, %0
%shl.i.i = shl nuw i32 1, %sub2.i.i
%or.cond.inv.i.i = icmp ugt i32 %X, 1
%retval.0.i.i = select i1 %or.cond.inv.i.i, i32 %shl.i.i, i32 1

8d 47 ff          lea    -0x1(%rdi),%eax
f3 0f bd c0       lzcnt  %eax,%eax
f6 d8             neg    %al
b9 01 00 00 00    mov    $0x1,%ecx
c4 e2 79 f7 c1    shlx   %eax,%ecx,%eax
83 ff 02          cmp    $0x2,%edi
0f 42 c1          cmovb  %ecx,%eax
```
We could drop `cmp` and `cmovb` like so:
```
8d 47 ff          lea    -0x1(%rdi),%eax
f3 0f bd c0       lzcnt  %eax,%eax
f6 d8             neg    %al
b9 01 00 00 00    mov    $0x1,%ecx
c4 e2 79 f7 c1    shlx   %eax,%ecx,%eax
```
This shorter sequence handles inputs 0 and 1 correctly:
```
input      0    1    2
----------------------
lea       -1    0    1
lzcnt      0   32   31
neg        0  -32  -31
&0x1f      0    0    1
shlx       1    1    2
```
Note that `shlx` masks the shift count with `0x1f` for 32-bit operands. LLVM IR cannot rely on this, because an `shl` by an amount equal to or greater than the bit width produces poison, so this issue probably belongs to the x86 backend.
I've empirically verified the equivalence for all possible values of `uint32_t`.
</pre>