<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/105760>105760</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Inconsistent loop unrolling decisions
</td>
</tr>
<tr>
<th>Labels</th>
<td>
llvm:optimizations
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Kmeakin
</td>
</tr>
</table>
<pre>
https://godbolt.org/z/qdnrje58T
`is_ascii_swar` checks that all the bytes are ASCII (i.e. less than 0x80). It uses a scalar loop (one byte at a time) for the unaligned head and tail, and a SWAR loop (8 bytes at a time) for the aligned middle.
```rs
pub fn is_ascii_scalar(bytes: &[u8]) -> bool {
    bytes.iter().all(|byte| byte.is_ascii())
}

pub fn is_ascii_swar(bytes: &[u8]) -> bool {
    fn is_ascii_word(word: usize) -> bool {
        word.to_ne_bytes().iter().all(|byte| byte.is_ascii())
    }
    let (head, middle, tail) = unsafe { bytes.align_to::<usize>() };
    is_ascii_scalar(head) && middle.iter().all(|word| is_ascii_word(*word)) && is_ascii_scalar(tail)
}
```
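For context, the byte-wise `is_ascii_word` above should be equivalent to a single mask test against the high bit of each byte. A minimal sketch, where the names `is_ascii_word_mask` and `HIGH_BITS` are hypothetical and not part of the reproducer:

```rs
/// Hypothetical helper, not part of the reproducer: a word is all-ASCII
/// exactly when no byte in it has its high bit (0x80) set.
pub fn is_ascii_word_mask(word: usize) -> bool {
    // usize::MAX / 0xFF is 0x01 repeated in every byte; multiplying by 0x80
    // gives 0x80 repeated in every byte (0x8080_8080_8080_8080 on 64-bit targets).
    const HIGH_BITS: usize = usize::MAX / 0xFF * 0x80;
    (word & HIGH_BITS) == 0
}
```

This only illustrates what the SWAR step checks; the issue is about the two scalar loops, not this word-level test.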
The scalar loop over the head is concise:
```asm
.LBB1_1:
        cbz     x11, .LBB1_4
        ldrsb   w13, [x0], #1
        sub     x11, x11, #1
        tbz     w13, #31, .LBB1_1
        mov     w0, wzr
        ret
.LBB1_4:
        lsl     x11, x12, #3
```
but the scalar loop over the tail gets unrolled:
```asm
.LBB1_8:
        cmp     x9, #0
        cset    w0, eq
        cbz     x9, .LBB1_22
        ldrsb   w10, [x8]
        tbnz    w10, #31, .LBB1_22
        cmp     x9, #1
        cset    w0, eq
        b.eq    .LBB1_22
        ldrsb   w10, [x8, #1]
        tbnz    w10, #31, .LBB1_22
        cmp     x9, #2
        cset    w0, eq
        b.eq    .LBB1_22
        ldrsb   w10, [x8, #2]
        tbnz    w10, #31, .LBB1_22
        cmp     x9, #3
        cset    w0, eq
        b.eq    .LBB1_22
        ldrsb   w10, [x8, #3]
        tbnz    w10, #31, .LBB1_22
        cmp     x9, #4
        cset    w0, eq
        b.eq    .LBB1_22
        ldrsb   w10, [x8, #4]
        tbnz    w10, #31, .LBB1_22
        cmp     x9, #5
        cset    w0, eq
        b.eq    .LBB1_22
        ldrsb   w10, [x8, #5]
        tbnz    w10, #31, .LBB1_22
        cmp     x9, #6
        cset    w0, eq
        b.eq    .LBB1_22
        ldrsb   w8, [x8, #6]
        cmp     x9, #7
        cset    w9, eq
        cmp     w8, #0
        csel    w0, w9, w0, ge
.LBB1_22:
        ret
```
The decision to unroll one scalar loop but not the other is surprising. IMO either both loops should be unrolled or neither should be.
</pre>