[PATCH] D146711: [RISCV] Lower insert subvector shuffles as vslideups

Fri Mar 24 07:53:56 PDT 2023

reames accepted this revision.
reames added a comment.
This revision is now accepted and ready to land.

LGTM w/minor comments.

I'm explicitly okay with the TU/TA issue being handled in a follow up commit.

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:3323
+// ->
+// vsetvli zero, 8, e8, mf2, tu,ma
+// vslideup.vi v8, v9, 4
----------------
white space on the ",  tu," bit

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:3356
+      DAG.getConstant(NumSubElts + Index, DL, XLenVT),
+      RISCVII::TAIL_UNDISTURBED_MASK_UNDISTURBED | RISCVII::MASK_AGNOSTIC);
+  return convertFromScalableVector(VT, Slideup, DAG, Subtarget);
----------------
In the case where you're inserting a suffix into the InPlace vector, you don't need the TU here and can use TA instead.   (e.g. you slide up into the last 4 elements of a 8 long vector)

We may already catch that via a DAG combine or during vsetvli insertion, not sure.

================
Comment at: llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll:648

-define <8 x i8> @merge_start_into_end(<8 x i8> %v, <8 x i8> %w) {
-; CHECK-LABEL: merge_start_into_end:
+define <8 x i8> @slideup_into_end(<8 x i8> %v, <8 x i8> %w) {
+; CHECK-LABEL: slideup_into_end:
----------------
Please land the test rename separately.

I'd suggest something more generic like concat_4xi8 to avoid the strategy used needing a test rename.

================
Comment at: llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll:651
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 8, e8, mf2, tu, ma
+; CHECK-NEXT:    vslideup.vi v8, v9, 4
----------------
This is the TA case I'd mentioned above, looks like we don't catch this.

================
Comment at: llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shufflevector-vnsrl.ll:155
 ; ZVE32F-NEXT:    vslidedown.vi v9, v8, 2
-; ZVE32F-NEXT:    li a0, 2
-; ZVE32F-NEXT:    vmv.s.x v0, a0
-; ZVE32F-NEXT:    vrgather.vi v10, v8, 0
-; ZVE32F-NEXT:    vrgather.vi v10, v9, 0, v0.t
-; ZVE32F-NEXT:    vse32.v v10, (a1)
+; ZVE32F-NEXT:    vsetvli zero, zero, e32, m1, tu, ma
+; ZVE32F-NEXT:    vslideup.vi v8, v9, 1
----------------
luke wrote:
> So `rd=x0, rs1=x0` means to keep the existing VL. Could that mean these two vsetivli's could be merged into one `vsetivli zero, 2, e32, m1, tu, ma`, seeing as the second `tu` trumps the former `ta`?
Yes, though we can also just do better and use a single slidedown to form the shuffle here.  

Seems like a reasonable follow up patch.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146711/new/

https://reviews.llvm.org/D146711