[PATCH] D99750: [LV, VP] RFC: VP intrinsics support for the Loop Vectorizer (Proof-of-Concept)

Fri Apr 16 09:03:13 PDT 2021

bmahjour added a comment.

The overall approach looks good to me. I might have gone a little too deep into the details with my inline comments, but the direction makes sense to me.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:280
+                   "is not supported. This option is discouraged."),
+        clEnumValN(PreferVPIntrinsicsTy::ForceAVLSupport,
+                   "force-active-vector-length-support",
----------------
don't see ForceAVLSupport being used anywhere in this patch.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:628
+  /// intrinsics.
+  Value *createEVL();
+
----------------
Could this be moved to the `VPWidenEVLRecipe` class? 

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2931
+  if (Reverse)
+    assert(!EVL &&
+           "Vector reverse not supported for predicated vectorization.");
----------------
how do we guard against this situation? I think this should be part of the legality check...eg `isLegalMaskedLoad` check for `Legal->isConsecutivePtr(Ptr)`...if we had a function similar to isLegalMaskedLoad  (say isLegalEVLLoad), then it could check for -1 stride and return false...then we'd be guaranteed not to hit this assert.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9017
+  if (isa<LoadInst>(Instr) || isa<StoreInst>(Instr)) {
+    if (preferPredicatedWiden())
+      return toVPRecipeResult(tryToPredicatedWidenMemory(Instr, Range, Plan));
----------------
not sure if this is already in your todo list...but apart from "preferring" to predicate, we need to check that the predication is legal for the target platform. This should probably be done in `isScalarWithPredication` with calls similar to `isLegalMaskedLoad`.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9697
+
+  Value *Remaining = Builder.CreateSub(TripCount, Induction);
+  // FIXME: This is a proof-of-concept naive implementation to demonstrate using
----------------
Value *Remaining = Builder.CreateSub(TripCount, Induction, "tc.minus.iv");

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9712
+
+  Value *RuntimeVLExt = Builder.CreateZExt(RuntimeVL, Remaining->getType());
+  Value *EVL =
----------------
can we try to initialize RuntimeVL with the right type, to avoid having to zero extend it here?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9714
+  Value *EVL =
+      Builder.CreateBinaryIntrinsic(Intrinsic::umin, RuntimeVLExt, Remaining);
+  return Builder.CreateTrunc(EVL, Builder.getInt32Ty());
----------------
you can avoid calling the intrinsic, by just doing a `cmp` and a `select`. (ie check that Remaining is less than RuntimeVL and if so pick Ramining, otherwise select RuntimeVL).

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9715
+      Builder.CreateBinaryIntrinsic(Intrinsic::umin, RuntimeVLExt, Remaining);
+  return Builder.CreateTrunc(EVL, Builder.getInt32Ty());
+}
----------------
Can we avoid truncating to 32-bit on 64-bit targets that take in 64-bit length? I think the type should be the same as the IV.

================
Comment at: llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h:74
+  /// predicated recipe takes mandatory mask and EVL VPInstructions.
+  VPRecipeBase *tryToPredicatedWidenMemory(Instruction *I, VFRange &Range,
+                                           VPlanPtr &Plan);
----------------
rename to `tryToPredicateWidenMemory` or `tryToWidenPredicatedMemory`

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:1144
+    raw_ostream &O, const Twine &Indent, VPSlotTracker &SlotTracker) const {
+  O << Indent << "PREDICATED-WIDEN ";
+
----------------
"PREDICATED-WIDEN-MEMORY" to distinguish from VPPredicatedWidenRecipe

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:1152
+
+  printOperands(O, SlotTracker);
+}
----------------
I suppose the EVL is not one of the operands, right? If so, I think printing the EVL here would be useful too.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99750/new/

https://reviews.llvm.org/D99750