<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/148041>148041</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Pessimising rewrite in `contains_zero_byte`
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            llvm:instcombine,
            missed-optimization
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Kmeakin
      </td>
    </tr>
</table>

<pre>
Consider the following Rust code for determining whether any of the 8 bytes in a `u64` is zero, taken from [the Rust standard library's implementation of `memchr`](https://github.com/rust-lang/rust/blob/master/library/core/src/slice/memchr.rs):
```rust
const LO: u64 = 0x01_01_01_01_01_01_01_01;
const HI: u64 = 0x80_80_80_80_80_80_80_80;

const fn contains_zero_byte(x: u64) -> bool {
    x.wrapping_sub(LO) & !x & HI != 0
}
```
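
As a sanity check on the bit trick itself, here is a minimal sketch that compares it against a naive per-byte check. `contains_zero_byte_naive` is a hypothetical helper written for this comparison and is not part of the standard library.

```rust
const LO: u64 = 0x01_01_01_01_01_01_01_01;
const HI: u64 = 0x80_80_80_80_80_80_80_80;

// The bit trick from the report: subtracting 1 from every byte sets that
// byte's high bit if the byte was zero (it wraps to 0xFF) or was 0x81 or
// more; masking with !x discards the second case.
const fn contains_zero_byte(x: u64) -> bool {
    x.wrapping_sub(LO) & !x & HI != 0
}

// Hypothetical reference implementation: inspect each byte individually.
fn contains_zero_byte_naive(x: u64) -> bool {
    x.to_le_bytes().iter().any(|&b| b == 0)
}
```

Bytes above the lowest zero byte can have their high bit set spuriously because of borrows, but that only happens when a zero byte exists, so the overall "is the result nonzero" answer should agree with the naive check.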

The equivalent C++ code ([Compiler Explorer](https://godbolt.org/z/Ej8TT8v84)) is:
```c++
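// `unsigned long` matches the 64-bit parameter type shown in the listings below.
using u64 = unsigned long;

constexpr u64 LO = 0x01'01'01'01'01'01'01'01;
constexpr u64 HI = 0x80'80'80'80'80'80'80'80;

bool contains_zero_byte(u64 x) {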
    return ((x - LO) & ~x & HI) != 0;
}
```

For this function, GCC generates:
```asm
;; AArch64:
contains_zero_byte(unsigned long):
        mov     x1, -72340172838076674
        movk    x1, 0xfeff, lsl 0
        add     x1, x0, x1
        bic     x1, x1, x0
        tst     x1, -9187201950435737472
        cset    w0, ne
        ret

;; x86_64:
contains_zero_byte(unsigned long):
        movabs  rax, -72340172838076673
        add     rax, rdi
        andn    rdi, rdi, rax
        movabs  rax, -9187201950435737472
        test    rdi, rax
        setne   al
        ret
```

but LLVM generates:
```asm
;; AArch64:
contains_zero_byte(unsigned long):
        mov     x8, #72340172838076673
        mov     x9, #-9187201950435737472
        movk    x8, #256
        sub     x8, x8, x0
        orr     x8, x8, x0
        bics    xzr, x9, x8
        cset    w0, ne
        ret

;; x86_64:
contains_zero_byte(unsigned long):
        movabs  rax, 72340172838076672
        sub     rax, rdi
        or      rax, rdi
        movabs  rcx, -9187201950435737472
        andn    rax, rax, rcx
        setne   al
        ret
```
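
Reading the LLVM listings above, the rewrite appears to turn `(x - LO) & ~x` into `~(((LO - 1) - x) | x)` via De Morgan, since `72340172838076672 == LO - 1` and `(LO - 1) - x == ~(x - LO)` in two's complement. The sketch below spells out both forms in Rust. The name `contains_zero_byte_llvm_form` is hypothetical, and the second function is my reconstruction from the assembly rather than anything taken from LLVM's source.

```rust
const LO: u64 = 0x01_01_01_01_01_01_01_01;
const HI: u64 = 0x80_80_80_80_80_80_80_80;

// Form as written in the source, which GCC compiles directly
// (add of -LO, and-not, test).
const fn contains_zero_byte(x: u64) -> bool {
    x.wrapping_sub(LO) & !x & HI != 0
}

// Hypothetical reconstruction of what LLVM's output computes.
const fn contains_zero_byte_llvm_form(x: u64) -> bool {
    // sub + or: ((LO - 1) - x) | x  ==  !((x - LO) & !x)
    let t = (LO - 1).wrapping_sub(x) | x;
    // andn / bics: !t & HI, then compare against zero
    !t & HI != 0
}
```

Both functions should return the same result for every input, so the transform is correct; the complaint is only about the code it leads to.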
</pre>