[PATCH] D105673: [SelectionDAG] Fix the representation of ISD::STEP_VECTOR.

Mon Jul 12 11:50:29 PDT 2021

efriedma added inline comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp:4205

+void DAGTypeLegalizer::ExpandIntRes_VSCALE(SDNode *N, SDValue &Lo,
+                                           SDValue &Hi) {
----------------
david-arm wrote:
> Hi @efriedma, the commit message didn't mention anything about changing the legalisation for VSCALE. Is it possible to either 1) add something to commit message describing why we've also had to fix VSCALE legalisation as part of the patch, or 2) is it possible to do this in a separate patch?
VSCALE legalization shows up as a consequence of the legalization changes on RV32.  I guess I can split it off; an explicit call to llvm.vscale or whatever would probably trigger the same thing.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1751
+SDValue SelectionDAG::getStepVector(const SDLoc &DL, EVT ResVT, APInt StepVal) {
+  assert(ResVT.getScalarSizeInBits() == StepVal.getBitWidth());
   if (ResVT.isScalableVector())
----------------
paulwalker-arm wrote:
> Now that you've define the overflow semantics, do these have to match? There is the `StepVal * i` part in the build vector block but that could `sextOrTrunc` `StepVal`?
Allowing an APInt with the wrong width seems like it's just asking for trouble; even if we can make it "work", likely the caller has a bug.

That said, it would be reasonable to add an overload that just defaults the step to 1; we repeat that pattern all over.

================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:1382-1389
+  // add(step_vector(step), dup(X)) -> index(X, step).
+  def : Pat<(add (nxv16i8 (step_vector_oneuse i8:$Rm)), (nxv16i8 (AArch64dup(i32 GPR32:$Rn)))),
+            (INDEX_RR_B GPR32:$Rn, (MOVi32imm (trunc_imm $Rm)))>;
+  def : Pat<(add (nxv8i16 (step_vector_oneuse i16:$Rm)), (nxv8i16 (AArch64dup(i32 GPR32:$Rn)))),
+            (INDEX_RR_H GPR32:$Rn, (MOVi32imm (trunc_imm $Rm)))>;
+  def : Pat<(add (nxv4i32 (step_vector_oneuse i32:$Rm)), (nxv4i32 (AArch64dup(i32 GPR32:$Rn)))),
+            (INDEX_RR_S GPR32:$Rn, (MOVi32imm $Rm))>;
----------------
paulwalker-arm wrote:
> Why move the patterns outside of their multiclasess? What am I missing here?
The multiclasses were making it hard to understand what was actually going on because none of the names made sense, I was getting errors about unknown names, etc.  I have a better handle on what it's doing, now; I think I could try to move it back.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105673/new/

https://reviews.llvm.org/D105673