[llvm] [LoopVectorize] Add support for vectorisation of simple early exit loops (PR #88385)
David Sherwood via llvm-commits
llvm-commits at lists.llvm.org
Fri May 10 02:35:12 PDT 2024
david-arm wrote:
@sjoerdmeijer @fhahn Here is an example of what happens with this patch when I build the following C code with `-O3 -target aarch64-linux -mcpu=neoverse-v1 -S -mllvm -enable-early-exit-vectorization`:
```
int foo(unsigned char *vec, unsigned char val) {
  unsigned char local_vec[128];
  for (int i = 0; i < 128; i++) {
    local_vec[i] = vec[i] + i;
  }
  // ACTUAL EARLY EXIT LOOP!
  unsigned char *p = &local_vec[0];
  for (int i = 0; i < 128; i++) {
    if (p[i] == val)
      return i;
  }
  return -1;
}
```
The code looks a bit contrived because this patch will only vectorise if it can prove that the loads in the early-exit loop will not fault, hence the search over the fixed-size stack array `local_vec`. Here is the assembly for the vector portion of the loop:
```
.LBB0_16:
rdvl x10, #15
mov x9, xzr
and x0, x10, #0x80
mov z0.b, w1
ptrue p0.b
mov x10, sp
.p2align 5, , 16
.LBB0_17:
ld1b { z1.b }, p0/z, [x10, x9]
cmpeq p1.b, p0/z, z1.b, z0.b
b.ne .LBB0_21
add x9, x9, x8
cmp x0, x9
b.ne .LBB0_17
cbz x0, .LBB0_12
mov w0, #-1
add sp, sp, #128
ret
.LBB0_21:
brkb p0.b, p0/z, p1.b
incp x9, p0.b
mov x0, x9
.LBB0_22:
add sp, sp, #128
ret
```
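For readers less familiar with SVE, the vector loop above can be approximated in plain C. This is only a rough scalar model, not what the vectoriser emits: `find_first_chunked` and `LANES` are hypothetical names, and `LANES` is a fixed stand-in for the hardware vector length, which is only known at run time. The sketch assumes the element count is a multiple of `LANES` and that every byte of the buffer is safe to read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lane count standing in for the SVE vector length in bytes. */
#define LANES 16

/* Scalar model of a vectorised early-exit search over buf[0..n).
   Assumes n is a multiple of LANES and all n bytes are safe to read. */
int find_first_chunked(const unsigned char *buf, size_t n, unsigned char val) {
    assert(n % LANES == 0);
    for (size_t base = 0; base < n; base += LANES) {
        /* Models ld1b + cmpeq: load a whole chunk and compare every lane. */
        unsigned char match[LANES];
        int any = 0;
        for (size_t lane = 0; lane < LANES; lane++) {
            match[lane] = (buf[base + lane] == val);
            any |= match[lane];
        }
        if (!any)
            continue; /* no lane matched, move on to the next chunk */
        /* Models brkb + incp: count the lanes before the first match. */
        size_t lane = 0;
        while (!match[lane])
            lane++;
        return (int)(base + lane);
    }
    return -1; /* no chunk contained val */
}
```

Note that the model reads the whole chunk even when the match is in its first lane, whereas the scalar loop would have stopped there. That is why the loads have to be provably non-faulting before this transformation is safe.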
Thanks!
https://github.com/llvm/llvm-project/pull/88385