[PATCH] D124869: [WIP][RISCV] Hoist VSETVLI out of idiomatic fixed length vector loops
Philip Reames via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 3 12:32:40 PDT 2022
reames created this revision.
reames added reviewers: craig.topper, khchen, Chenbing.Zheng, jacquesguan.
Herald added subscribers: sunshaoce, VincentWu, luke957, StephenFan, vkmr, frasercrmck, evandro, luismarques, apazos, sameer.abuasal, s.egerton, Jim, benna, psnobl, jocewei, PkmX, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, kito-cheng, niosHD, sabuasal, bollu, simoncook, johnrusso, rbar, asb, hiraditya, arichardson, mcrosier.
Herald added a project: All.
reames requested review of this revision.
Herald added subscribers: pcwang-thead, eopXD, MaskRay.
Herald added a project: LLVM.
This is WIP with a bunch of cleanup/splitting and tests needed; posting early as I'm very new to the riscv targeted and want to make sure this is a reasonable idea.
This patch teaches the VSETVLI insertion pass to perform a very limited form of partial redundancy elimination. The motivating example comes from the fixed length vectorization of a simple loop such as:
for (unsigned i = 0; i < a_len; i++)
a[i] += b;
Without this change, the core vector loop and preheader is as follows:
.LBB0_3: # %vector.ph
andi a1, a6, -8
addi a4, a0, 16
mv a5, a1
.LBB0_4: # %vector.body
# =>This Inner Loop Header: Depth=1
addi a3, a4, -16
vsetivli zero, 4, e32, m1, ta, mu
vle32.v v8, (a3)
vle32.v v9, (a4)
vadd.vx v8, v8, a2
vadd.vx v9, v9, a2
vse32.v v8, (a3)
vse32.v v9, (a4)
addi a5, a5, -8
addi a4, a4, 32
bnez a5, .LBB0_4
The key thing to note here is that, I believe, the execution of the vsetivli only needs to happen once. Since there's no tail folding happening here, the value of the vector configuration registers are invariant through the loop.
After this patch, we hoist the configuration into the preheader and perform it once.
.LBB0_3: # %vector.ph
andi a1, a6, -8
vsetivli zero, 4, e32, m1, ta, mu
addi a4, a0, 16
mv a5, a1
.LBB0_4: # %vector.body
# =>This Inner Loop Header: Depth=1
addi a3, a4, -16
vle32.v v8, (a3)
vle32.v v9, (a4)
vadd.vx v8, v8, a2
vadd.vx v9, v9, a2
vse32.v v8, (a3)
vse32.v v9, (a4)
addi a5, a5, -8
addi a4, a4, 32
bnez a5, .LBB0_4
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D124869
Files:
llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
llvm/test/CodeGen/RISCV/rvv/fixed-vector-strided-load-store.ll
llvm/test/CodeGen/RISCV/rvv/sink-splat-operands.ll
llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D124869.426801.patch
Type: text/x-patch
Size: 35948 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220503/ccce8830/attachment.bin>
More information about the llvm-commits
mailing list