[PATCH] D125301: [LoopVectorize] Add option to use active lane mask for loop control flow

Sander de Smalen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 25 01:38:57 PDT 2022


sdesmalen added inline comments.


================
Comment at: llvm/include/llvm/Analysis/TargetTransformInfo.h:534
   /// intrinsic is supported.
-  bool emitGetActiveLaneMask() const;
+  bool emitGetActiveLaneMask(bool &UseForControlFlow) const;
 
----------------
I find this interface a bit cumbersome, can we make `emitGetActiveLaneMask` return an enum value that explains how the lane mask must be used, something like:

  enum class ActiveLaneMask {
    None = 0,
    PredicateOnly,
    PredicateAndControlFlow
  };

(but then with possibly better names)


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8108-8111
     // Introduce the early-exit compare IV <= BTC to form header block mask.
     // This is used instead of IV < TC because TC may wrap, unlike BTC. Start by
     // constructing the desired canonical IV in the header block as its first
     // non-phi instructions.
----------------
this comment is no longer correct.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8117
+        CM.TTI.emitGetActiveLaneMask(UseLaneMaskForControlFlow);
+    if (EmitGetActiveLaneMask && UseLaneMaskForControlFlow)
+      return BlockMaskCache[BB] = Plan->getActiveLaneMaskPhi();
----------------
Can you add a comment explaining what this does?

>From my understanding, it sets the predicate for the vectorized loop's header block (which would be the vectorized loop's main predicate) to the ActiveLaneMaskPhi cached in the Plan. Is that right?


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8694
+                                  bool HasNUW, bool IsVPlanNative,
+                                  bool UseLaneMaskForControlFlow) {
   Value *StartIdx = ConstantInt::get(IdxTy, 0);
----------------
nit: s/UseLaneMaskForControlFlow/UseLaneMaskForLoopControlFlow`/


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:788
   }
-  case VPInstruction::BranchOnCount: {
+  case VPInstruction::CanonicalIVIncrementParts:
+  case VPInstruction::CanonicalIVIncrementPartsNUW: {
----------------
nit: s/CanonicalIVIncrementParts/CanonicalIVPIncrementForPart/ ?


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:817
+      // sufficient for this.
+      Cond = Builder.CreateExtractElement(IV, Builder.getInt32(0));
+      Cond = Builder.CreateNot(Cond);
----------------
nit: IV is probably not a suitable name, because it's the active lane mask, not the induction variable.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:1526
+    Instruction *StartMask;
+    {
+      IRBuilder<>::InsertPointGuard Guard(State.Builder);
----------------
Is this extra scope necessary? There are no uses of IR builder after this scope, so I think you can just as well use the scope of the loop body for the InsertPointGuard.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:816
     CanonicalIVIncrementNUW,
+    CanonicalIVIncrementParts,
+    CanonicalIVIncrementPartsNUW,
----------------
It would be nice to have a one-line description here of what this does and how it's different from CanonicalIVIncrement.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125301/new/

https://reviews.llvm.org/D125301



More information about the llvm-commits mailing list