[PATCH] D155845: [VPlan] Fix in-loop reduction chains using VPlan def-use chains (NFCI)

Mon Jul 31 15:32:42 PDT 2023

Ayal added a comment.

Added some more comments to help clean this up more, can be applied now or later if preferred to keep the diff closer to a VPlan refactoring alone.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9161
+    if (MinVF.isScalar() && !PhiR->isOrdered())
       continue;

----------------
nit: Should this filtering be applied above when inserting into InLoopReductionPhis?
Should unordered reductions be considered non-isInLoop in VPlans of VF=1?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9172
+        if (!UserRecipe)
+          continue;
+        assert(UserRecipe->getNumDefinedValues() == 1 &&
----------------
note: may assert that UserRecipe resides inside the vector loop, as the outside user is a LiveOut, at-least for now.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9181
+    assert(!RecurrenceDescriptor::isSelectCmpRecurrenceKind(Kind) &&
+           "select/cmp reductions are not allowed for in-loop reductions");
+
----------------
nit: assert better placed above, asap, independent of filling Worklist.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9194
+
+      // Index of the first operand which holds a non-mask vector operand.
+      unsigned IndexOfFirstOperand;
----------------
nit: would always considering the last two indices be (correct and) easier than considering either first two indices or indices 1 and 2? Excluding FMulAdd which deserves a separate treatment.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9203
+        if (auto *Cmp = dyn_cast<VPWidenRecipe>(CurrentLink)) {
+          assert(isa<CmpInst>(Cmp->getUnderlyingInstr()) &&
+                 "need to have the compare of the select");
----------------

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9206
+          continue;
+        }
+        IndexOfFirstOperand = 1;
----------------
assert CurrentLink is a select recipe (whose first operand index is 1)?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9210
+        assert((MinVF.isScalar() || isa<VPWidenRecipe>(CurrentLink) ||
+                (IsFMulAdd && isa<VPWidenCallRecipe>(CurrentLink))) &&
+               "Expected to replace a VPWidenSC or a FMulAdd intrinsic call");
----------------
Treat FMulAdd separately from the binary operators - the operand along the chain is in index 2, go ahead and generate the FMul of indices 0 and 1 to produce VecOp? (Keeping the assert message intact ;-)

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9220
+              : IndexOfFirstOperand;
+      VPValue *VecOp = CurrentLink->getOperand(VecOpId);

----------------
assert that VecOp != PreviousLinkV and that the other candidate does equal PreviousLinkV?
(`CurrentLink->getOperand(IndexOfFirstOperand+IndexOfFirstOperand+1-VecOpId)`)

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9225
       VPValue *CondOp = nullptr;
-      if (CM.blockNeedsPredicationForAnyReason(R->getParent())) {
+      if (CM.blockNeedsPredicationForAnyReason(BB)) {
         VPBuilder::InsertPointGuard Guard(Builder);
----------------
Wonder if this information may already be recorded in VPlan - perhaps BlockInMask already exists?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9233
         // If the instruction is a call to the llvm.fmuladd intrinsic then we
         // need to create an fmul recipe to use as the vector operand for the
         // fadd reduction.
----------------

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9235
         // fadd reduction.
         VPInstruction *FMulRecipe = new VPInstruction(
+            Instruction::FMul, {VecOp, CurrentLink->getOperand(1)});
----------------
assert VecOpId == 0?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9244
+      VPRecipeBase *CompareRecipe =
+          CurrentLink->getOperand(0)->getDefiningRecipe();
+
----------------
Could we use PreviousLink serve as/instead of CompareRecipe?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9232
         // need to create an fmul recipe to use as the vector operand for the
         // fadd reduction.
         VPInstruction *FMulRecipe = new VPInstruction(
----------------
fhahn wrote:
> Ayal wrote:
> > Should this breaking of fmuladd into an fadd recurrence fed by an fmul take place separately, as a preparatory step?
> > Should compare-select pairs be replaced by Min/Max recipes?
> > In order to convert an (initially widened) reduction recurrence into an in-loop one, it seems that isolating the reduction operation into a single, dedicated operation would be useful. It could also help canonicalize where to find the reduced value operands, aka vecOps.
> Might be good as follow up?
Sure, if not two.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155845/new/

https://reviews.llvm.org/D155845