<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/125003>125003</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Missed optimization: unreachable branch could be pruned after taking into account the possible values of a constexpr lookup table
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          inicula
      </td>
    </tr>
</table>

<pre>
    ## Context

Consider the following code:

```cpp
#include <array>
#include <cstddef>

constexpr size_t MAX_DATA_INDEX = 1024;

constexpr std::array<size_t, 256> GET_DATA_INDEX_V1 = {};

bool foo(unsigned *data, unsigned char first_idx)
{
    size_t second_idx = GET_DATA_INDEX_V1[first_idx];
    if (second_idx >= MAX_DATA_INDEX)
        return false;

    data[second_idx] = true;
 return true;
}

constexpr std::array<size_t, 256> GET_DATA_INDEX_V2 = {0, 1, 2, /* ... */};

bool bar(unsigned *data, unsigned char first_idx)
{
 size_t second_idx = GET_DATA_INDEX_V2[first_idx];
    if (second_idx >= MAX_DATA_INDEX)
        return false;

    data[second_idx] = true;
 return true;
}
```

And [the assembly](https://godbolt.org/z/joPc8vnx8) that gets generated with clang 19.1.0 (and trunk), with `-O3` enabled:

```
foo(unsigned int*, unsigned char):
        mov     dword ptr [rdi], 1
        mov     al, 1
        ret

bar(unsigned int*, unsigned char):
        mov     eax, esi
        lea     rcx, [rip + GET_DATA_INDEX_V2]
        mov     rax, qword ptr [rcx + 8*rax]
        cmp rax, 1023
        ja      .LBB1_2
        mov     dword ptr [rdi + 4*rax], 1
.LBB1_2:
        cmp     rax, 1024
        setb    al
 ret

GET_DATA_INDEX_V2:
        .quad   0
        .quad   1
 .quad   2
        .zero   2024
```

## Discussion

When the lookup table is filled with a single value (e.g. zero), like in `foo()`, clang is able to get rid of the `second_idx >= MAX_DATA_INDEX` branch, because it sees at compile time that the only possible value that `second_idx` can have is zero, which is less than `MAX_DATA_INDEX`.

But in `bar()`, where the lookup table is filled with other values besides 0, clang doesn't see that all possible values of `second_idx` satisfy `second_idx < MAX_DATA_INDEX`, even though this is _also_ derivable at compile time. So the `second_idx >= MAX_DATA_INDEX` branch is also redundant in `bar()`, and ideally that branch would be deleted.

How hard would it be to add an optimization to clang/LLVM such that the `second_idx >= MAX_DATA_INDEX` branch in `bar()` gets pruned?

You could imagine some code that has a bunch of transformation layers through which it passes an initial index (`first_idx -> second_idx -> third_idx -> ... -> last_idx`). And when moving from one layer to the next, it does bounds checking to verify that the current index is less than `MAX_DATA_INDEX`. In this case, it seems like this kind of optimization could get rid of a pretty significant amount of effectively dead code.
</pre>
<img width="1" height="1" alt="" src="http://email.email.llvm.org/o/eJzMV0uPG7kR_jXUpeBGiz16HXSQNKNkATsJkMVmcxqwyWo1PWxSS7KlkX99UGTLejnwGr6sYXjcLPKrr756kCNC0DuLuGSTNZs8j0QfW-eX2mrZGzGqnTotGa8Yr2DjbMT3yMoVK1cbZ4NW6CG2CI0zxh213YF0Clm1ynvYtMx_5X5Pn7zSVppeIbBqI7wXJ1a9PBhkiEphM5jKlXQ2RHzfewj6C75G-LT6_fV59evq9Zd_PL_8Dqx6hnHJn1i1fjgQFZGpVoOzTUZgfAN8MmXVC_zt5dcrrNffxgmOzdZs9vwVsHbOQOMc4_PeJrkUML5SIgqC-romW-Gh0T7EV63eGV_Q8RmBAMCZfUDprKINydUDATZZXyAmAwk6rxtgfH5z_IUQbuXIXmH44zH23kIjTMCv4ZAhcZ-sL2hs8pz4RN8PO8-HLyukyU8rzM8Kl7RpnHbSP4xvGV9BURSkLX3dZ6AW_icy8Ofk5399-c9NldFWVgGbrKkLRQjY1eZEtPm8jXEfKDek63bnVO1MLJzfMb79wvj2s_uXnB_s-5zxBcRWRNhhDLBDi15EVHDUsQVphN3BeFGMi5LiF1YRI_tGgfJN3sSm5Yd_VmxaAlpRG1SPE4CVq7sG0jamPN9lj3Cr1ZWGnTukn-rovIJ99BSuVzpFSQX0uFWYe4vHYWzd1dCPkkDxTpsx6CujQZGdyGQkenoPjK-_VV3P30D1GfWPmwjle4KYM74i-81B2e3Ph8Ylr64snzMVKD6u1-NX_id0TF6eLl4G5c4ANzKQ3yvCaeperAFjPcifyzdL_ijCDWbxRy8UAJTfWEspPH9cB1N8Qe9oLTO464nhvnrWQfYhaGfz8n9atOm6Ms699XuIVKugAzTamHPBCwja7gzCQZgeqeSx2BVA_oaSN_oNQVuq-lzStD5N0yx3iw6QkKOjlgKvFbgmOWbT8rsDZFpC7YWVLQHWKEUfEDQNLgwgIkjX7TWh6w5z3xKys-YEexeCrr9yT8YblwQuhYVWHFLgOaoNHFstW1owGAKdS9E9ECuyjus-DvHnbrrEf2zR43cldrFFnzkGqJGeEQGu9FMOg2V8lmLOUQhj7qILpOlDcEFEHZrTg9CbR5VTHx9SRbh-10JsdSCmr8IE9woKvT4k9neaF_Bv9-PJTEVhggOPqrdK2P8nIQ1YrVAYc8qhD-ePrjcKagSFBiOqIRd_d0dohVeDXUfaEh0IpUBYcPuoO_1FRO0sLSeBGd9-_PjbJwi9bC8V9IPhPJDP18fe95bm_zbT-6_rQWZmndhpixBch-mVmD23IoCAuidMahIvbGic7zJjI07oqSB9StFQphH2dNUFClBbHbUwoK1CGphz6srz_Q0f6AFyFVX6jq32V5_04Ej_MWK49SkRiwLoZj3SyOjcgZ62jXcdOIuZFalJsll6EfMNsaK6hdr1VgWQLco3OhUdHNDr5nRRWvbeYyoA4vz9roNfbK5OKQIOvgJiF_IoSqY3bdOUuUl4Vv5qBgnYe4zxBHTX6UZLqkPRud5GMmPToIz6gOYECoVKeSpGalmpRbUQI1yOZ9V8vJhO-XTULvm0qadlM2nkrGqmZSNxNq-lnC8W5XSG5WKkl7zkk3JclWXJq6dpMZkuKinUGGcCZ80Y2VOJndCmMObQ0dtkpEPocTnmk7KsRkbUaEL6zYRzi0dIVsbpGh35JR36UPe7wJ5Ko0MMF5ioo8HlJx0CqhtNWLWC3noUsk3NPZSzPHdXrl8QTaQUi5RCbamhpEwyUQK_MYkEXN7E17Nv1HuzvHuK6dj2dSFdx_iWGA8_Puy9-4wyMr5NcQbGt4MQhyX_XwAAAP__qnwy8A">