[llvm] [LoopVectorize] Add support for vectorisation of simple early exit loops (PR #88385)

David Sherwood via llvm-commits llvm-commits at lists.llvm.org
Fri May 10 02:35:12 PDT 2024


david-arm wrote:

@sjoerdmeijer @fhahn Here is an example of what happens with this patch when I build the following C code with `-O3 -target aarch64-linux -mcpu=neoverse-v1 -S -mllvm -enable-early-exit-vectorization`:

```
int foo(unsigned char *vec, unsigned char val) {
  unsigned char local_vec[128];

  for (int i = 0; i < 128; i++) {
    local_vec[i] = vec[i] + i;
  }

  // ACTUAL EARLY EXIT LOOP!
  unsigned char *p = &local_vec[0];
  for (int i = 0; i < 128; i++) {
    if (p[i] == val)
      return i;
  }
  return -1;
}
```

The code looks a bit contrived because this patch will only vectorise if it can prove that the loads in the early-exit loop will not fault. Here is the assembly for the vector portion of the loop:

```
.LBB0_16:
        rdvl    x10, #15
        mov     x9, xzr
        and     x0, x10, #0x80
        mov     z0.b, w1
        ptrue   p0.b
        mov     x10, sp
        .p2align        5, , 16
.LBB0_17:
        ld1b    { z1.b }, p0/z, [x10, x9]
        cmpeq   p1.b, p0/z, z1.b, z0.b
        b.ne    .LBB0_21
        add     x9, x9, x8
        cmp     x0, x9
        b.ne    .LBB0_17
        cbz     x0, .LBB0_12
        mov     w0, #-1
        add     sp, sp, #128
        ret
.LBB0_21:
        brkb    p0.b, p0/z, p1.b
        incp    x9, p0.b
        mov     x0, x9
.LBB0_22:
        add     sp, sp, #128
        ret
```
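
For readers less familiar with SVE, here is a rough scalar model of what the vector loop above is doing. It is only a sketch, not what the patch emits: the function name `find_first_blocked` and the `vf` parameter (standing in for the number of byte lanes per vector) are made up for illustration, and it assumes `vf` divides the trip count of 128.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 128 }; /* trip count of the early-exit loop in the example */

/* Hypothetical scalar model of the vectorised search.
   vf stands in for the number of byte lanes in one vector. */
int find_first_blocked(const unsigned char *p, unsigned char val, size_t vf) {
  assert(vf != 0 && N % vf == 0); /* assumption: no scalar remainder */

  for (size_t base = 0; base < N; base += vf) {
    /* Models ld1b + cmpeq: every lane in the block is read and compared
       before we know whether any lane matches. */
    int any_match = 0;
    for (size_t lane = 0; lane < vf; lane++)
      any_match |= (p[base + lane] == val);

    if (!any_match)
      continue; /* no early exit in this block, move to the next one */

    /* Models brkb + incp: count the lanes before the first match and
       add that count to the block's starting index. */
    size_t lane = 0;
    while (p[base + lane] != val)
      lane++;
    return (int)(base + lane);
  }
  return -1; /* the loop ran to completion without exiting early */
}
```

Note that the model reads a whole block of `vf` bytes before checking for a match, so it can touch elements past the one where the scalar loop would have returned. That is why the loads have to be known not to fault before the loop can be vectorised this way.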

Thanks!

https://github.com/llvm/llvm-project/pull/88385

