<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/60802>60802</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Improve the assembly sequence for std::bit_ceil
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:X86,
            missed-optimization
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          kazutakahirata
      </td>
    </tr>
</table>

<pre>
    Compile:

```
// clang -std=c++20 -march=skylake -O2
#include <bit>

unsigned bit_ceil(unsigned X) {
  return std::bit_ceil(X);
}
```

I get:

```
  %sub.i.i = add i32 %X, -1
  %0 = tail call i32 @llvm.ctlz.i32(i32 %sub.i.i, i1 false), !range !5
  %sub2.i.i = sub nuw nsw i32 32, %0
  %shl.i.i = shl nuw i32 1, %sub2.i.i
  %or.cond.inv.i.i = icmp ugt i32 %X, 1
  %retval.0.i.i = select i1 %or.cond.inv.i.i, i32 %shl.i.i, i32 1

  8d 47 ff                   lea    -0x1(%rdi),%eax
  f3 0f bd c0                lzcnt  %eax,%eax
  f6 d8                      neg    %al
  b9 01 00 00 00             mov    $0x1,%ecx
  c4 e2 79 f7 c1             shlx   %eax,%ecx,%eax
  83 ff 02                   cmp    $0x2,%edi
  0f 42 c1                   cmovb  %ecx,%eax
```

We could drop `cmp` and `cmovb` like so:

```
  8d 47 ff                   lea    -0x1(%rdi),%eax
  f3 0f bd c0                lzcnt  %eax,%eax
  f6 d8                      neg    %al
  b9 01 00 00 00             mov    $0x1,%ecx
  c4 e2 79 f7 c1             shlx   %eax,%ecx,%eax
```

This shorter sequence still handles inputs 0 and 1 correctly (input 2 is shown for comparison):

```
input   0    1    2
-------------------
lea    -1    0    1
lzcnt   0   32   31
neg     0  -32  -31
&0x1f   0    0    1
shlx    1    1    2
```

Note that `shlx` masks the shift count with `0x1f`, which LLVM IR cannot assume (there, a `shl` by 32 or more is poison), so this issue probably belongs in the x86 backend.

I've empirically verified the equivalence for all possible values of `uint32_t`.

</pre>