<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/61183>61183</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Generate better code for std::bit_floor from libstdc++
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            llvm:instcombine,
            missed-optimization
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          kazutakahirata
      </td>
    </tr>
</table>

<pre>
    We should improve the LLVM IR generated for `std::bit_floor` from libstdc++. Specifically, compiling libstdc++'s `std::bit_floor` should yield LLVM IR that is as good as what we generate for `std::bit_floor` from our own libc++.

```
#include <bit>

unsigned my_bit_floor(unsigned x) {
  return std::bit_floor(x);
}
```

libstdc++
```
$ clang -march=skylake -std=c++20 -O2 -S -emit-llvm bit_floor.cc
```

```
  %cmp.i.i = icmp eq i32 %X, 0
  %shr.i.i = lshr i32 %X, 1
  %0 = tail call i32 @llvm.ctlz.i32(i32 %shr.i.i, i1 false), !range !5
  %sub.i.i = sub nuw nsw i32 32, %0
  %shl.i.i = shl nuw i32 1, %sub.i.i
  %retval.0.i.i = select i1 %cmp.i.i, i32 0, i32 %shl.i.i
  ret i32 %retval.0.i.i
```

libc++
```
$ clang -march=skylake -std=c++20 -stdlib=libc++ -nostdinc++ \
 -I/usr/lib/llvm-14/include/c++/v1  -O2 -S -emit-llvm bit_floor.cc
```

```
  %cmp.i = icmp eq i32 %X, 0
  %0 = tail call i32 @llvm.ctlz.i32(i32 %X, i1 false), !range !5
  %shl.i = lshr i32 -2147483648, %0
  %cond.i = select i1 %cmp.i, i32 0, i32 %shl.i
  ret i32 %cond.i
```

Here is the value after each LLVM IR instruction:

```
input      0   1   2  0x40000000  0x80000000
--------------------------------------------
libstdc++
shr        0   0   1  0x20000000  0x40000000
ctlz      32  32  31           2           1
sub        0   0   1          30          31
shl        1   1   2  0x40000000  0x80000000
sel        0   1   2  0x40000000  0x80000000

libc++
ctlz      32  31  30           1           0
shr    undef   1   2  0x40000000  0x80000000
sel        0   1   2  0x40000000  0x80000000
```

FWIW, here is the x86 assembly:

libstdc++
```
        89 f8                   mov    %edi,%eax        ; 25 bytes, critical path length: 6
        b9 01 00 00 00          mov    $0x1,%ecx
        d1 e8                   shr    %eax
        f3 0f bd c0             lzcnt  %eax,%eax
        f6 d8                   neg    %al
        85 ff                   test   %edi,%edi
        c4 e2 79 f7 c1          shlx   %eax,%ecx,%eax
        0f 44 c7                cmove  %edi,%eax
```

libc++
```
        f3 0f bd c7             lzcnt  %edi,%eax        ; 17 bytes, critical path length: 3
        b9 00 00 00 80          mov    $0x80000000,%ecx
        c4 e2 7b f7 c1          shrx   %eax,%ecx,%eax
        0f 42 c7                cmovb  %edi,%eax
```
</pre>