<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/102946>102946</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Inverted shift optimizations should incorporate global offsets
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Validark
      </td>
    </tr>
</table>

<pre>
Let's say I have this function:

```zig
export fn foo(m: u64) u64 {
    const x = ~m & (m << 1);
    return (@as(u64, 1) << @intCast(@popCount(x) + 1)) - 1;
}
```

LLVM handily optimizes it to the equivalent of this:

```zig
export fn foo(m: u64) u64 {
    const x = ~m & (m << 1);
    return ~(~@as(u64, 0) << @intCast(@popCount(x) + 1));
}
```

However, we can do slightly better by folding the `+1` into the shifted constant: pre-shift `~@as(u64, 0)` left by 1 and drop the addition from the shift amount. Since `~@as(u64, 0) << 1` is `~@as(u64, 1)`, we get:

```zig
export fn bar(m: u64) u64 {
    const x = ~m & (m << 1);
    return ~(~@as(u64, 1) << @intCast(@popCount(x)));
}
```
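
As a sanity check on the rewrite above, here is a minimal sketch that compares the two forms for every shift amount where `n + 1` still fits in a `u6`. The helper names `shiftThenAdd` and `preShifted` are hypothetical and exist only for this illustration; `n` stands in for `@popCount(x)`, which stays well below 63 for this `x` because no two adjacent bits of `~m & (m << 1)` can both be set.

```zig
const std = @import("std");

// Hypothetical helper: the current form, with the +1 kept in the shift amount.
// Assumes n < 63 so that n + 1 still fits in a u6.
fn shiftThenAdd(n: u6) u64 {
    return ~(~@as(u64, 0) << (n + 1));
}

// Hypothetical helper: the proposed form, with the +1 folded into the constant.
fn preShifted(n: u6) u64 {
    return ~(~@as(u64, 1) << n);
}

test "folding +1 into the shifted constant preserves the result" {
    var n: u6 = 0;
    // Covers n = 0..62; n = 63 is excluded because n + 1 would overflow u6.
    while (n < 63) : (n += 1) {
        try std.testing.expectEqual(shiftThenAdd(n), preShifted(n));
    }
}
```

Saved to a file, this can be run with `zig test`. It only checks the arithmetic identity, not what any backend emits.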

Here is the assembly version:

```asm
foo:
        lea     rax, [rdi + rdi]
        mov     rcx, -1
        andn    rax, rdi, rax
        popcnt  rax, rax
        inc     al ; we can remove this increment by changing `rcx` to -2
        shlx    rax, rcx, rax
        not     rax
        ret

bar:
        lea     rax, [rdi + rdi]
        mov     rcx, -2
        andn    rax, rdi, rax
        popcnt  rax, rax
        shlx    rax, rcx, rax
        not     rax
        ret
```

This optimization should be applicable across architectures. ([Godbolt link](https://zig.godbolt.org/z/qer96vaqs))
</pre>