[PATCH] D133758: [VPlan] Add VPDerivedIVRecipe, use for VPScalarIVStepsRecipe.

Sun Nov 27 08:15:40 PST 2022

Ayal added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2348
   assert(ScalarIVTy == Step->getType() &&
          "Val and Step should have the same type");

----------------
Above assert is now redundant.

Can hoist the comment above and rephrase it, e.g., "Ensure scalar IV and step have the same integer type", or rather "Ensure step has the same type as that of scalar IV"?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9550
+  DerivedIV->setName("offset.idx");
+  if (TruncToTy && TruncToTy != DerivedIV->getType()) {
+    assert(Step->getType()->isIntegerTy() &&
----------------
nit: suffice to ask if TruncToTy != DerivedIV->getType() as the latter is never null?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9551
+  if (TruncToTy && TruncToTy != DerivedIV->getType()) {
+    assert(Step->getType()->isIntegerTy() &&
+           "Truncation requires an integer step");
----------------
assert(Step->getType()->isIntegerTy()) belongs earlier if still needed here at all?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9569
   Value *Step = State.get(getStepValue(), VPIteration(0, 0));
-  auto CreateScalarIV = [&](Value *&Step) -> Value * {
-    Value *ScalarIV = State.get(getCanonicalIV(), VPIteration(0, 0));
-    auto *CanonicalIV = State.get(getParent()->getPlan()->getCanonicalIV(), 0);
-    if (!isCanonical() || CanonicalIV->getType() != Ty) {
-      ScalarIV =
-          Ty->isIntegerTy()
-              ? State.Builder.CreateSExtOrTrunc(ScalarIV, Ty)
-              : State.Builder.CreateCast(Instruction::SIToFP, ScalarIV, Ty);
-      ScalarIV = emitTransformedIndex(State.Builder, ScalarIV,
-                                      getStartValue()->getLiveInIRValue(), Step,
-                                      IndDesc);
-      ScalarIV->setName("offset.idx");
-    }
-    if (TruncToTy) {
-      assert(Step->getType()->isIntegerTy() &&
-             "Truncation requires an integer step");
-      ScalarIV = State.Builder.CreateTrunc(ScalarIV, TruncToTy);
-      Step = State.Builder.CreateTrunc(Step, TruncToTy);
-    }
-    return ScalarIV;
-  };
+  if (Step->getType() != BaseIV->getType())
+    Step = State.Builder.CreateTrunc(Step, BaseIV->getType());
----------------
fhahn wrote:
> Ayal wrote:
> > Another type reconciliation worth taking care of by the callee instead of having it assert it? buildScalarSteps() in this case.
> Refactored in bf0bd85f9d82 and removed here, thanks!
> Refactored in bf0bd85f9d82 and removed here, thanks!

Good! Thanks!

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9521
+  Value *ScalarIV = State.get(getCanonicalIV(), VPIteration(0, 0));
+  Value *TransformedIV =
+      Ty->isIntegerTy()
----------------
fhahn wrote:
> Ayal wrote:
> > fhahn wrote:
> > > Ayal wrote:
> > > > Some things seem a bit confusing here, looking at the existing code:
> > > > 
> > > > VPScalarIVStepsRecipe::getCanonicalIV() presumably retrieves the same Recipe/VPValue as does the enclosing VPlan->getCanonicalIV()?
> > > > 
> > > > The original code has both `ScalarIV` and `CanonicalIV` - are they not the same - one retrieves a Value per lane (0,0) and the other per part (0) - used only to check its type?
> > > > 
> > > > Now `TransformedIV` is also "Scalar" (as in non-Vector) similar to `ScalarIV`.
> > > > 
> > > > Perhaps instead of
> > > > `Value *ScalarIV = State.get(getCanonicalIV(), VPIteration(0, 0));`
> > > > we should have
> > > > `Value *CanonicalIV = State.get(getCanonicalIV(), VPIteration(0, 0));`
> > > > ?
> > > > 
> > > > Perhaps instead of `TransformedIV` have `NonCanonicalIV`, `AffineIV` or `DerivedIV` - considering that the canonical IV is aka a "BasicIV"?
> > > > 
> > > > Then rename `VPTransformedIVRecipe` accordingly?
> > > > 
> > > > Would be good to explain somewhere all the IV recipes together: those representing a single scalar (canonical or not) across VF&UF, a single vector per part, a single scalar per lane.
> > > > VPScalarIVStepsRecipe::getCanonicalIV() presumably retrieves the same Recipe/VPValue as does the enclosing VPlan->getCanonicalIV()?
> > > 
> > > Yes we could also use `VPlan->getCanonicalIV()`, but it might be easier to follow if modeled explicitly?
> > > 
> > > > The original code has both ScalarIV and CanonicalIV - are they not the same - one retrieves a Value per lane (0,0) and the other per part (0) - used only to check its type?
> > > 
> > > 
> > > Yep, that should be cleaner in the new code.
> > > 
> > > > Perhaps instead of
> > > 
> > > Thanks, I updated the naming to use `CanonicalIV` and `DerivedIV`. I also renamed `VPTransformedIVRecipe` -> `VPDerivedIVRecipe`
> > > 
> > > > Would be good to explain somewhere all the IV recipes together: those representing a single scalar (canonical or not) across VF&UF, a single vector per part, a single scalar per lane.
> > > 
> > > Good idea, I'll see about that separately.
> > >> VPScalarIVStepsRecipe::getCanonicalIV() presumably retrieves the same Recipe/VPValue as does the enclosing VPlan->getCanonicalIV()?
> > 
> > > Yes we could also use VPlan->getCanonicalIV(), but it might be easier to follow if modeled explicitly?
> > 
> > Modeling the canonical IV as an explicit operand is fine. In fact, it then seems irrelevant if its Canonical or not - can simply refer to it as getBasicIV() or getIV()? (When there's a need to call isCanonical() then the CanonicalIV is needed, but that is the case in optimizeInductions() rather than here in DerivedIV.)
> > 
> > >> Would be good to explain somewhere all the IV recipes together: those representing a single scalar (canonical or not) across VF&UF, a single vector per part, a single scalar per lane.
> > 
> > > Good idea, I'll see about that separately.
> > 
> > Found a good place?
> Modeling the canonical IV as an explicit operand is fine. In fact, it then seems irrelevant if its Canonical or not - can simply refer to it as getBasicIV() or getIV()? (When there's a need to call isCanonical() then the CanonicalIV is needed, but that is the case in optimizeInductions() rather than here in DerivedIV.)
> 
> I think `emitTransformedIndex` needs the canonical IV.
> 
> > Found a good place?
> 
> I put up D138748 to add it to the VPHeaderPHIRecipe documentation.
> Modeling the canonical IV as an explicit operand is fine. In fact, it then seems irrelevant if its Canonical or not - can simply refer to it as getBasicIV() or getIV()? (When there's a need to call isCanonical() then the CanonicalIV is needed, but that is the case in optimizeInductions() rather than here in DerivedIV.)
> 
> I think `emitTransformedIndex` needs the canonical IV.

Hmm, `emitTransformedIndex` just needs an "Index", i.e., an "IV" or "BasicIV", regardless if it is "the canonical" IV.

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:1968
+class VPDerivedIVRecipe : public VPRecipeBase, public VPValue {
+  /// If not nullptr, truncate the generated values to TruncToTy.
+  Type *TruncToTy;
----------------
nit: it seems TruncToTy is always non-null, can simplify later check?

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:1976
+  VPDerivedIVRecipe(const InductionDescriptor &IndDesc, VPValue *Start,
+                    VPValue *CanonicalIV, VPValue *Step, Type *TruncToTy)
+      : VPRecipeBase(VPDerivedIVSC, {Start, CanonicalIV, Step}),
----------------
CanonicalIV >> BaseIV or IV? I.e., any Index that has a value per iteration rather than per-part/per-lane, regardless if it is "the canonical" IV?

TruncToTy >> ResultTy? I.e., always specifies the type of the result, never nullptr? 

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:1875
+  /// Check if the induction described by \p ID is canonical, i.e. has a step of
+  /// 1 and  the same type and start as the canonical IV.
+  bool isCanonical(const InductionDescriptor &ID) const;
----------------
Ayal wrote:
> Ayal wrote:
> > nit: extra space "and  the"
> > 
> > step is also the same (1), as type and start; or does the original canonical unit step become UF * VF?
> I mean, would the following rephrasing be better:
> `/// i.e., has the same start, step (of 1), and type as the canonical IV.`
> ?
> 
> 
above nit worth addressing?

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp:1056
+                                         Type *Ty) const {
+  if (Ty != getScalarType() || ID.getInductionOpcode() != Instruction::Add)
+    return false;
----------------
> turns out the assertion triggered in some tests, added as an extra condition.

Good! Worth a comment?

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp:1058
+    return false;
+  // The start value of the steps-recipe must match the start value of the
+  // canonical induction and it must step by 1.
----------------
steps-recipe?

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-forced.ll:22
 ; VPLANS-NEXT:     ACTIVE-LANE-MASK-PHI vp<%5> = phi vp<%3>, vp<%10>
-; VPLANS-NEXT:     vp<%6>    = SCALAR-STEPS vp<%4>, ir<0>, ir<1>
+; VPLANS-NEXT:     vp<%6>    = SCALAR-STEPS vp<%4>
 ; VPLANS-NEXT:     CLONE ir<%gep> = getelementptr ir<%ptr>, vp<%6>
----------------
Ayal wrote:
> This does appear less descriptive than having SCALAR-STEPS depict the step it uses?
Ah, the step it uses is explicit - it's ir<1>! What has been omitted is "start" (ir<0>), which is moved from ScalarIVSteps to DerivedIV, if needed.

================
Comment at: llvm/test/Transforms/LoopVectorize/interleave-and-scalarize-only.ll:73
 ; DBG-NEXT:     EMIT vp<[[CAN_IV:%.+]]> = CANONICAL-INDUCTION
-; DBG-NEXT:     vp<[[STEPS1:%.+]]>    = SCALAR-STEPS vp<[[CAN_IV]]>, ir<false>, ir<true>
-; DBG-NEXT:     vp<[[STEPS2:%.+]]>    = SCALAR-STEPS vp<[[CAN_IV]]>, ir<0>, ir<1>
+; DBG-NEXT:     vp<[[DERIVED_IV:%.+]]> = DERIVED-IV vp<[[CAN_IV]]>, ir<false>, ir<true>
+; DBG-NEXT:     vp<[[STEPS1:%.+]]>    = SCALAR-STEPS vp<[[DERIVED_IV]]>
----------------
fhahn wrote:
> Ayal wrote:
> > fhahn wrote:
> > > Ayal wrote:
> > > > While we're here, `ir<false>`, `ir<true>` seem odd (and even ;-)) 
> > > I guess that's because they are boolean values and the IR get printed as boolean literals by the IR printer. Do you think this is something that should be changed?
> > Oh well, the input IR is adding and subtracting true and false...
> > 
> > The reason for having DERIVED-IV with canonical start `ir<false>` ==0 and step `ir<true>` ==1 is because of type expansion and/or truncation?
> Here the inputs are already truncated to `i1` before recipe construction so the recipe doesn't truncated itself.
Trying to clarify why DERIVED-IV is needed at all here (too), given that it starts at 0 (false) and bumps with a step of 1 (true)?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133758/new/

https://reviews.llvm.org/D133758