[PATCH] D105673: [SelectionDAG] Fix the representation of ISD::STEP_VECTOR.

Fri Jul 9 03:05:57 PDT 2021

paulwalker-arm added a comment.

When the node was first added I figured it wouldn't be too long before somebody wanted to support a variable stride, so using an SDValue made sense so as to save some effort later.  Personally I've never seen a need to support a variable stride and time has passed without such a request so I think locking the interface down now is a good move.

I'm still on holiday so have not done a proper review, just added a couple of observations.

================
Comment at: llvm/include/llvm/CodeGen/SelectionDAG.h:838
   ///   <0, Step, Step * 2, Step * 3, ...>
-  SDValue getStepVector(const SDLoc &DL, EVT ResVT, SDValue Step);
+  SDValue getStepVector(const SDLoc &DL, EVT ResVT, APInt StepVal);

----------------
We couldn't do this before because the type of `Step` was important.  This is no longer the case so is it worth giving `StepVal` a default of `1`? as a connivence.  I know there's the assert that the sizes match but adding `|| StepVal.isOneValue()` doesn't seem so bad.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1751
+SDValue SelectionDAG::getStepVector(const SDLoc &DL, EVT ResVT, APInt StepVal) {
+  assert(ResVT.getScalarSizeInBits() == StepVal.getBitWidth());
   if (ResVT.isScalableVector())
----------------
Now that you've define the overflow semantics, do these have to match? There is the `StepVal * i` part in the build vector block but that could `sextOrTrunc` `StepVal`?

================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:1382-1389
+  // add(step_vector(step), dup(X)) -> index(X, step).
+  def : Pat<(add (nxv16i8 (step_vector_oneuse i8:$Rm)), (nxv16i8 (AArch64dup(i32 GPR32:$Rn)))),
+            (INDEX_RR_B GPR32:$Rn, (MOVi32imm (trunc_imm $Rm)))>;
+  def : Pat<(add (nxv8i16 (step_vector_oneuse i16:$Rm)), (nxv8i16 (AArch64dup(i32 GPR32:$Rn)))),
+            (INDEX_RR_H GPR32:$Rn, (MOVi32imm (trunc_imm $Rm)))>;
+  def : Pat<(add (nxv4i32 (step_vector_oneuse i32:$Rm)), (nxv4i32 (AArch64dup(i32 GPR32:$Rn)))),
+            (INDEX_RR_S GPR32:$Rn, (MOVi32imm $Rm))>;
----------------
Why move the patterns outside of their multiclasess? What am I missing here?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105673/new/

https://reviews.llvm.org/D105673