<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/105760">105760</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Inconsistent loop unrolling decisions
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            llvm:optimizations
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Kmeakin
      </td>
    </tr>
</table>

<pre>
    https://godbolt.org/z/qdnrje58T

`is_ascii_swar` checks that all the bytes are ASCII (i.e. less than 0x80). It uses a scalar loop (one byte at a time) for the unaligned head and tail, and a SWAR loop (8 bytes at a time) for the aligned middle.

```rs
pub fn is_ascii_scalar(bytes: &[u8]) -> bool {
    bytes.iter().all(|byte| byte.is_ascii())
}

pub fn is_ascii_swar(bytes: &[u8]) -> bool {
    fn is_ascii_word(word: usize) -> bool {
        word.to_ne_bytes().iter().all(|byte| byte.is_ascii())
    }

    let (head, middle, tail) = unsafe { bytes.align_to::<usize>() };
    is_ascii_scalar(head) && middle.iter().all(|word| is_ascii_word(*word)) && is_ascii_scalar(tail)
}
```
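
For comparison, a common SWAR formulation tests the high bit of every byte with a single mask instead of iterating over the bytes of the word. The sketch below is only an illustration and is not taken from the godbolt link; `is_ascii_word_masked`, `is_ascii_word_bytewise`, and `HIGH_BITS` are hypothetical names introduced here.

```rust
/// Hypothetical helper (not from the report): a word is all-ASCII exactly
/// when no byte has its high bit (0x80) set.
pub fn is_ascii_word_masked(word: usize) -> bool {
    // 0x80 repeated in every byte of a usize
    // (0x8080_8080_8080_8080 on 64-bit targets).
    const HIGH_BITS: usize = usize::MAX / 0xFF * 0x80;
    (word & HIGH_BITS) == 0
}

/// The per-byte formulation from the report, repeated for comparison.
pub fn is_ascii_word_bytewise(word: usize) -> bool {
    word.to_ne_bytes().iter().all(|byte| byte.is_ascii())
}
```

Both functions should return the same result for every input word; they differ only in how the check is expressed.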


The scalar loop over the head is concise:
```asm
.LBB1_1:
        cbz     x11, .LBB1_4
        ldrsb   w13, [x0], #1
        sub     x11, x11, #1
        tbz     w13, #31, .LBB1_1
        mov     w0, wzr
        ret
.LBB1_4:
        lsl     x11, x12, #3
```

but the scalar loop over the tail gets fully unrolled (seven copies of the loop body):
```asm
.LBB1_8:
        cmp     x9, #0
        cset    w0, eq
        cbz     x9, .LBB1_22
        ldrsb   w10, [x8]
        tbnz    w10, #31, .LBB1_22
        cmp     x9, #1
        cset    w0, eq
        b.eq    .LBB1_22
        ldrsb   w10, [x8, #1]
        tbnz    w10, #31, .LBB1_22
        cmp     x9, #2
        cset    w0, eq
        b.eq    .LBB1_22
        ldrsb   w10, [x8, #2]
        tbnz    w10, #31, .LBB1_22
        cmp     x9, #3
        cset    w0, eq
        b.eq    .LBB1_22
        ldrsb   w10, [x8, #3]
        tbnz    w10, #31, .LBB1_22
        cmp     x9, #4
        cset    w0, eq
        b.eq    .LBB1_22
        ldrsb   w10, [x8, #4]
        tbnz    w10, #31, .LBB1_22
        cmp     x9, #5
        cset    w0, eq
        b.eq    .LBB1_22
        ldrsb   w10, [x8, #5]
        tbnz    w10, #31, .LBB1_22
        cmp     x9, #6
        cset    w0, eq
        b.eq    .LBB1_22
        ldrsb   w8, [x8, #6]
        cmp     x9, #7
        cset    w9, eq
        cmp     w8, #0
        csel    w0, w9, w0, ge
.LBB1_22:
        ret
```

The decision to unroll one scalar loop but not the other is surprising. IMO the two loops should be treated the same way: either both unrolled or neither.
</pre>