[PATCH] D100549: [RISCV] Lower vector shuffles to vrgather operations

Fri Apr 16 01:44:37 PDT 2021

frasercrmck marked 3 inline comments as done.
frasercrmck added inline comments.

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:1572
+  bool SwapOps = DAG.isSplatValue(V2) && !DAG.isSplatValue(V1);
+  bool InvertMask = (IsSelect && SwapOps) || (!IsSelect && !SwapOps);
+
----------------
craig.topper wrote:
> This might be clearer as "IsSelect == SwapOps"
Hah yep.

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:1607
+    // FIXME: We could promote the index to i16 and use vrgatherei16, but that
+    // may involve vector splitting if we're already at LMUL=8.
+    return SDValue();
----------------
craig.topper wrote:
> or if max LMUL is capped.
good point, cheers.

================
Comment at: llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td:146

 def riscv_vrgather_vx_vl : SDNode<"RISCVISD::VRGATHER_VX_VL",
+                                  SDTypeProfile<1, 5, [SDTCisVec<0>,
----------------
craig.topper wrote:
> Should we change the names of these to include _MASK or something? This is now difference than the other _VL ISD opcodes and many of the others were set up to match the VP SDNodes.
> 
> Alternatively, we pattern match VSELECT_VL+VRGATHER_VX_VL/VRGATHER_VV_VL? I think that's how VP is intended to work.
I was wondering about that. My original version added a separate MASK version. I think maybe it depends on what we want to do with the VP nodes as a whole. Since we haven't really answered that question (I'm planning on looking into VP support shortly) I didn't want this patch to be the bellweather for how we lower VP as we should be able to change our minds if another strategy proves better.

Perhaps the best idea for now is the VSELECT one. It doesn't change the behaviour of the existing nodes, doesn't introduce new ones, and may "just work" with our VP lowering strategy. I don't think it'd increase the patterns any more than I've already done with the extra non-truemask pats.

================
Comment at: llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll:104
+; CHECK-NEXT:    vrgather.vv v25, v9, v26, v0.t
+; CHECK-NEXT:    vmv1r.v v8, v25
 ; CHECK-NEXT:    ret
----------------
craig.topper wrote:
> Depending on how costly a vrgather is, this doesn't look like much of an improvement.
> 
> It's also only using 1 element from vector 2. Would that be cheaper to just extract+insert into place?
Yeah there's definitely a couple of heuristics we can chuck at this to optimize certain cases. Shuffles are one of those things that's hard to get right first try.

As you say, one of them is having a sense of how many elements are used by each operand, much like we do in BUILD_VECTOR lowering. Single-element manipulation will be better in certain cases.

I think another could be to detect "almost splats"; where one operand is a full splat and the other is shuffled, or when either operand is almost splatted save for a handful of elements.

I'll be coming back to shuffles over the next wee while. Do you think this is something to do now or leave as a follow-up?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100549/new/

https://reviews.llvm.org/D100549