<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/68668>68668</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Duplicated branch on equal instructions when used in conjunction with count-trailing-ones
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Validark
      </td>
    </tr>
</table>

<pre>
    On targets without a direct count-trailing-zeros instruction, some of them insert a branch on 0 before the code that computes the trailing-zeros count. In my code, I want to commandeer this branch for added efficiency. However, LLVM gets confused when a bitwise negation is present.

Here is some simplified Zig code which reproduces the problem ([godbolt link here](https://zig.godbolt.org/z/q6oK4rTxo)).

```zig
export fn foo(bitstring: u64, sum: u64) u64 {
    var acc: u64 = sum;
    // We invert, i.e. count 1's, because 0's are shifted in by the bitshift.
    const inverted_bitstring = ~bitstring;

    // Optimization: If @ctz implicitly inserts `if (inverted_bitstring == 0)`, make it more efficient.
    if (inverted_bitstring == 0) {
        acc += 64;
        return acc;
    }

    const str_len: u8 = @ctz(inverted_bitstring);
    acc += str_len;
    return acc;
}
```

On x86_64 (default target) we get:
```asm
foo:
        cmp     rdi, -1
        je      .LBB0_1
        xor     rdi, -1
        je      .LBB0_3
        rep       bsf rax, rdi
        add     rax, rsi
        ret
.LBB0_1:
        add     rsi, 64
        mov     rax, rsi
        ret
.LBB0_3:
        mov     eax, 64
        add     rax, rsi
        ret
```

On riscv64 (default target) we get the same branch-on-equal instruction twice in a row:

```asm
foo:
        li      a2, -1
        beq     a0, a2, .LBB0_3
        beq     a0, a2, .LBB0_4
        ...
        ...
.LBB0_3:
        addi    a0, a1, 64
        ret
.LBB0_4:
        li      a0, 64
        add     a0, a0, a1
        ret
```

The same thing happens on s390x, sparc64, and ve (default targets). If you would like to see some less-simplified code where I actually tried to use this idea, [here is a godbolt link to a less-trimmed version of my actual code](https://zig.godbolt.org/z/d8v8Gcj13) (go to line 461 in the source code).
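
For comparison, here is a minimal sketch of the same computation without the explicit check. It relies on Zig's `@ctz` being defined to return the bit width (64 for a `u64`) when its operand is 0, so both the early-return path and the fallback path in `foo` produce `sum + 64` for an all-ones input. The name `fooNoExplicitCheck` is hypothetical and only here for illustration.

```zig
// Hypothetical reference version, not part of the original reproduction.
export fn fooNoExplicitCheck(bitstring: u64, sum: u64) u64 {
    // Counting trailing zeros of the inverted value counts trailing ones.
    // @ctz yields 64 when ~bitstring == 0, so no separate zero check is needed
    // for correctness.
    const trailing_ones: u64 = @ctz(~bitstring);
    return sum + trailing_ones;
}
```

Since both functions compute the same result, the second equality check emitted for `foo` should in principle be foldable into the first.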

Thank you to all LLVM contributors!

‒ Validark
</pre>