[PATCH] D126884: [RISCV] Hoist vsetvli with vreg operand out of loops

Thu Jun 2 07:54:27 PDT 2022

reames created this revision.
reames added reviewers: frasercrmck, craig.topper, kito-cheng.
Herald added subscribers: sunshaoce, VincentWu, luke957, StephenFan, vkmr, evandro, luismarques, apazos, sameer.abuasal, s.egerton, Jim, benna, psnobl, jocewei, PkmX, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, shiva0217, niosHD, sabuasal, bollu, simoncook, johnrusso, rbar, asb, hiraditya, arichardson, mcrosier.
Herald added a project: All.
reames requested review of this revision.
Herald added subscribers: pcwang-thead, eopXD, MaskRay.
Herald added a project: LLVM.

This is an extension to the PRE line of work.  The goal is to be able to handle large constant AVLs which are too large to be encoded in a five bit immediate and thus need to be materialized in register.  Handling other register cases is a nice to have.

This patch removes a predicate from the correctness logic for the PRE transform, and replaces it with an alternate one.  The code being removed was phrased as "having a fixed result"; it's being replaced with a standard (if handrolled to avoid needing dominators)  availability check.

When I wrote the original code, I was concerned about three cases.  Thinking about each further, I think I was simply being too conservative from the beginning.

First, we might be in a loop where the VL is potentially changing.  The availability check handles this as the value would not be available in the preheader in this case.  PREing into a rarer path in the loop is fine, as by assumption the final VL state is the same along all paths.

Second, vsetvli is undefined in certain cases - specifically the vsetvli x0, x0, vtype case.  If we moved it above control flow, could we introduce a fault?  I believe the answer to be yes, but the key point here is that we're *not* moving an existing instruction.  We're moving a state transition.  If VL actually  changes, the insert code would not use the x0, x0 form at the new point.  (And couldn't at the original either since VL could change on at least one incoming path.)

To say this point differently, we don't need the standard "anticipation" check from PRE as the data flow and insertion logic covers that bit for us.  By construction, there isn't some unmodeled control dependence for the "instruction being moved".  Or said differently, state transitions are always safe to speculate.

Third, the specification leaves the hardware significant flexibility to handle VLMAX < AVL < 2 x VLMAX.  Originally, I think my concern had been that hoisting such cases might allow different VL values, but thinking about this further, the spec explicitly requires determinism for each AVL value.  Given that, we'd only run into trouble if we considered two different AVLs (which we'd somehow falsely proven to produce the same VL) along different paths.  Doing this would be an unrelated bug in the dataflow, and thus we don't have to account for it here.

Skeptical review to make sure I haven't missed something here is appreciated.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D126884

Files:
  llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
  llvm/test/CodeGen/RISCV/rvv/fixed-vector-strided-load-store-asm.ll
  llvm/test/CodeGen/RISCV/rvv/sink-splat-operands.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D126884.433748.patch
Type: text/x-patch
Size: 11519 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220602/830cbf2f/attachment.bin>