[PATCH] D133760: [VPlan] Support sinking VPScalarIVStepsRecipe.

Sat Dec 3 16:18:52 PST 2022

fhahn added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:128
     VPBasicBlock *SinkTo;
     VPValue *C;
     std::tie(SinkTo, C) = WorkList.pop_back_val();
----------------
Ayal wrote:
> Worth clarifying/renaming that C is a SinkCandidateValue produced by SinkCandidate[Recipe]. Or use only the latter: only single-value producing recipes are currently being sunk, so WorkList could hold such recipes (subclasses of VPValues) rather than VPValues, filtering live-in operands when pushing into it.
Renamed, thanks!

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:130
     std::tie(SinkTo, C) = WorkList.pop_back_val();
-    auto *SinkCandidate = dyn_cast_or_null<VPReplicateRecipe>(C->Def);
-    if (!SinkCandidate || SinkCandidate->isUniform() ||
-        SinkCandidate->getParent() == SinkTo ||
-        SinkCandidate->mayHaveSideEffects() ||
-        SinkCandidate->mayReadOrWriteMemory())
+    VPRecipeBase *SinkCandidate = nullptr;
+    if (auto *ScalarSteps = dyn_cast_or_null<VPScalarIVStepsRecipe>(C->Def))
----------------
Ayal wrote:
> Simplify a bit and check if C is not live-in?
> ```
>   VPRecipeBase *SinkCandidate = cast_or_null<VPRecipeBase>(C->getDef());
>   if (!SinkCandidate || SinkCandidate->getParent() == SinkTo ||
>       SinkCandidate->mayHaveSideEffects() || SinkCandidate->mayReadOrWriteMemory())
>     continue;
>   if (auto *RepR = dyn_cast<VPReplicateRecipe>(SinkCandidate)) {
>     if (RepR->isUniform())
>       continue;
>   } else if (!isa<VPScalarIVStepsRecipe>(SinkCandidate))
>     continue;
> ```
Simplified, thanks!

================
Comment at: llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll:2955
 ; UNROLL-NO-VF-NEXT:    [[VEC_IV:%.*]] = add i32 [[INDEX]], 0
-; UNROLL-NO-VF-NEXT:    [[VEC_IV3:%.*]] = add i32 [[INDEX]], 1
+; UNROLL-NO-VF-NEXT:    [[VEC_IV2:%.*]] = add i32 [[INDEX]], 1
 ; UNROLL-NO-VF-NEXT:    [[TMP2:%.*]] = icmp ule i32 [[VEC_IV]], [[TRIP_COUNT_MINUS_1]]
----------------
Ayal wrote:
> Are these renamings important and related to this patch or can these changes be dropped?
Not sure how this sneaked into the diff, it is not needed for the patch and I removed it.

================
Comment at: llvm/test/Transforms/LoopVectorize/float-induction.ll:1411
 ; VEC4_INTERL1:       pred.store.if2:
-; VEC4_INTERL1-NEXT:    [[TMP7:%.*]] = fadd fast float [[TMP0]], 1.000000e+00
 ; VEC4_INTERL1-NEXT:    [[TMP8:%.*]] = or i64 [[INDEX]], 1
+; VEC4_INTERL1-NEXT:    [[TMP7:%.*]] = fadd fast float [[TMP0]], 1.000000e+00
----------------
Ayal wrote:
> Sinking order appears to have changed, but is (still) deterministic, right?
Yes I think the difference here is because we now sink parts using VPlan sinking earlier.

================
Comment at: llvm/test/Transforms/LoopVectorize/vplan-sink-scalars-and-merge.ll:577
 ; CHECK-NEXT:   EMIT vp<[[MASK3:%.+]]> = select vp<[[MASK1]]> vp<[[NOT]]> ir<false>
-; CHECK-NEXT:   BLEND %p = ir<0>/vp<%14> vp<%12>/vp<%9>
+; CHECK-NEXT:   BLEND %p = ir<0>/vp<[[MASK3]]> vp<[[PRED]]>/vp<[[MASK2]]>
 ; CHECK-NEXT: Successor(s): then.1
----------------
Ayal wrote:
> Is this change of having the BLEND use MASK3, PRED, MASK2 instead of vp<%14,%12,%9> related to this patch focused on sinking STEPS, or can it be dropped?
Directly matching the numbered VPValues broke due to different numbering after moving the SCALAR-STEPS recipe. I'll split this change off separately. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133760/new/

https://reviews.llvm.org/D133760