[PATCH] D159332: [RISCV] Cap build vector cost to avoid quadratic cost at high LMULs
Philip Reames via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 31 20:35:30 PDT 2023
reames created this revision.
reames added reviewers: luke, craig.topper, asb.
Herald added subscribers: jobnoorman, sunshaoce, VincentWu, vkmr, frasercrmck, luismarques, apazos, sameer.abuasal, s.egerton, Jim, benna, psnobl, jocewei, PkmX, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, shiva0217, kito-cheng, niosHD, sabuasal, bollu, simoncook, johnrusso, rbar, hiraditya, arichardson, mcrosier.
Herald added a project: All.
reames requested review of this revision.
Herald added subscribers: wangpc, eopXD, MaskRay.
Herald added a project: LLVM.
(Still somewhat WIP - posted for feedback, and frankly to grab a phab revision)
Each vslide operation is linear in LMUL on common hardware. (For instance, the sifive-x280 cost model models slides this way.) If we do VL unique inserts, each with a cost linear in LMUL, then since VL = LMUL * VLEN/ETYPE the overall cost is O(LMUL^2) * VLEN/ETYPE. To avoid the degenerate case, fall back to the stack if the cost is more than a fixed (linear) threshold.
For context, here are the sifive-x280 llvm-mca results for the current lowering and the stack-based lowering at each LMUL (using e64). This assumes the code was compiled for V (i.e. zvl128b).
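To make the cost argument concrete, here is a minimal sketch of the model in Python. It is not the code in the patch: the function names, the unit cost of LMUL per slide, and the budget of 2 cost units per element are all assumptions made for illustration.

```python
def slide_build_cost(lmul: int, vlen: int, sew: int) -> int:
    """Modeled cost of building a vector with one slide per element."""
    num_elts = lmul * vlen // sew   # VL for a full register group
    per_slide_cost = lmul           # assumption: each slide costs LMUL units
    return num_elts * per_slide_cost


def use_stack(lmul: int, vlen: int, sew: int, budget_per_elt: int = 2) -> bool:
    """Hypothetical cap: go via the stack once the slide cost exceeds a
    budget that is linear in the number of elements."""
    num_elts = lmul * vlen // sew
    return slide_build_cost(lmul, vlen, sew) > budget_per_elt * num_elts


if __name__ == "__main__":
    # VLEN=128 and e64 match the zvl128b/e64 setup described in this post.
    for lmul in (1, 2, 4, 8):
        cost = slide_build_cost(lmul, vlen=128, sew=64)
        print(f"m{lmul}: cost={cost} via_stack={use_stack(lmul, 128, 64)}")
```

Doubling LMUL doubles both the element count and the per-slide cost, so the modeled cost grows 4x while the linear budget only grows 2x. Past some LMUL the cap always triggers, whatever constant is chosen for the budget.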
output/sifive-x280/buildvector_m1_via_stack.mca:Total Cycles: 1904
output/sifive-x280/buildvector_m2_via_stack.mca:Total Cycles: 2104
output/sifive-x280/buildvector_m4_via_stack.mca:Total Cycles: 2504
output/sifive-x280/buildvector_m8_via_stack.mca:Total Cycles: 3304
output/sifive-x280/buildvector_m1_via_vslidedown.mca:Total Cycles: 804
output/sifive-x280/buildvector_m2_via_vslidedown.mca:Total Cycles: 1604
output/sifive-x280/buildvector_m4_via_vslide1down.mca:Total Cycles: 6400
output/sifive-x280/buildvector_m8_via_vslide1down.mca:Total Cycles: 25599
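As a quick way to read these numbers, the following sketch tabulates the cycle counts above and reports the cheaper lowering at each LMUL, plus how fast each scheme grows. The cycle counts are copied from the llvm-mca output; the helper names are made up for this example.

```python
# Total cycles copied from the llvm-mca output above (sifive-x280, e64).
stack_cycles = {1: 1904, 2: 2104, 4: 2504, 8: 3304}
slide_cycles = {1: 804, 2: 1604, 4: 6400, 8: 25599}


def growth(cycles):
    """Ratio of total cycles between each LMUL and the previous one."""
    lmuls = sorted(cycles)
    return [round(cycles[b] / cycles[a], 2) for a, b in zip(lmuls, lmuls[1:])]


def cheaper(lmul):
    """Name of the lowering with fewer total cycles at this LMUL."""
    return "stack" if stack_cycles[lmul] < slide_cycles[lmul] else "slide"


if __name__ == "__main__":
    for lmul in sorted(stack_cycles):
        print(f"m{lmul}: stack={stack_cycles[lmul]} "
              f"slide={slide_cycles[lmul]} -> {cheaper(lmul)}")
    print("stack growth per LMUL doubling:", growth(stack_cycles))
    print("slide growth per LMUL doubling:", growth(slide_cycles))
```

On these numbers the slide-based lowering wins at m1 and m2 and the stack wins at m4 and m8. From m2 upward the slide-based totals grow by roughly 4x per doubling of LMUL, which is what the quadratic term predicts, while the stack totals grow by well under 2x.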
There are other schemes we could use to cap the cost. The next best is recursive decomposition of the vector into smaller LMULs. That's still quadratic, but with a better constant. However, the stack-based lowering seems to cost better at all LMULs, so we can just go with the simpler scheme.
Arguably, this patch fixes a regression introduced by my D149667 <https://reviews.llvm.org/D149667>: before that change, we'd always fall back to the stack, and thus didn't have the non-linearity.
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D159332
Files:
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-buildvec.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D159332.555244.patch
Type: text/x-patch
Size: 48922 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230901/df74f40b/attachment.bin>