[PATCH] D107957: [LegalizeTypes][VP] Add splitting support for binary VP ops

Thu Sep 2 01:29:17 PDT 2021

frasercrmck added inline comments.

================
Comment at: llvm/test/CodeGen/RISCV/rvv/vadd-vp.ll:1627
+; FIXME: The first vadd.vi should be able to infer that its AVL is equivalent to VLMAX.
+; FIXME: The second vsetvli is incorrectly preserving the VL on RV64, rather
+; than setting it to 0 as in RV32.
----------------
craig.topper wrote:
> craig.topper wrote:
> > frasercrmck wrote:
> > > frasercrmck wrote:
> > > > @craig.topper, do you have any thoughts about this issue? I don't think it's specific to this patch but rather is just exposed by it. We could probably generated a reduced test case on the current `main`.
> > > FWIW this is a problem in `main` but only in the sense that we //could// create something like
> > > 
> > > ```
> > >     %0:gpr = COPY $x0
> > >     dead %1:gpr = PseudoVSETVLI %0, 89, implicit-def $vl, implicit-def $vtype
> > > ```
> > > 
> > > which can then be register-coalesced into `PseudoVSETVLI $x0`. We tend not to generate this sort of thing. It's just that in this example the `COPY $x0` isn't immediately apparent during instruction selection due to the pseudo instructions.
> > > 
> > > I couldn't find a way of stopping the coalescer do its job here. It's legal from an ISA to do it, but it's not from a semantics point-of-view. It's almost like we need two different `PseudoVSETVLI` instructions: one that takes `x0` and one that doesn't.
> > I'm looking into this.
> This specific case should be gone after ccbb4c8b4ffd00588f0c21c7e5208bf210b26a53 This doesn't fix the underlying issue that the coalescer was allowed to do this. It just fixes the specific circumstance that allowed a copy from x0 to be an input to a vector instruction. We'll now see the 0 earlier and use vsetivli.
Thanks, @craig.topper! I started fixing the SELECT_CC as you did but didn't want to lose the motivation to fix our VSETVLIs the "right way" before we'd had a chance to discuss what to do.

Ignoring other machine passes that may do the same illegal transform, I don't know if it is possible to fix the coalescer to do the same thing. In some ways we're lucky it's only a simple X0/no-X0 split. It'd be tricky if there were more complicated constraints going on.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107957/new/

https://reviews.llvm.org/D107957