[PATCH] D99750: [LV, VP]VP intrinsics support for the Loop Vectorizer

Alexey Bataev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 8 06:36:36 PST 2023


ABataev added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1690
+    return PreferVPWithVPEVLIntrinsics && !EnableVPlanNativePath &&
+           foldTailByMasking() && Legal->isSafeForAnyVectorWidth() &&
+           // FIXME: remove this once vp_reverse is supported.
----------------
shiva0217 wrote:
> If the loop contains reduction variables, there might need a mask to merge the last two iteration results. 
> 
>     int a[128];
>     int foo (int end) {
>       int size = 0;
>       for (int i = 0; i < end; i++)
>         size += a[i];
>       return size;
>     }
> 
> Should the case be guarded by `Legal->getReductionVars().empty() &&` ?
Added


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp:1743
+  PHINode *EntryPart =
+      State.Builder.CreatePHI(Start->getType(), 2, "evl.based.iv");
+  EntryPart->addIncoming(Start, VectorPH);
----------------
shiva0217 wrote:
> There is a case that the PHI didnt' been inserted at top of basic block.
> 
>     int foo (int value, int *buf, int *end) {
>       int *tmp;
>       for (tmp = buf; tmp < end; tmp++)
>         value -= *tmp;
>       return value;
>     }
> 
> Should we specify insertion point?
> Something like:
> 
>     PHINode *EntryPart = PHINode::Create(
>         Start->getType(), 2, "evl.based.iv", &*State.CFG.PrevBB->getFirstInsertionPt());
Fixed in VPlanTransforms.cpp by inserting the recipe immediately after CanonicalIVPHI.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:986
+  auto *WideCanonicalIV =
+      cast<VPWidenCanonicalIVRecipe>(*FoundWidenCanonicalIVUser);
+  // Walk users of WideCanonicalIV and replace all compares of the form
----------------
shiva0217 wrote:
> There is a case that VPWidenCanonicalIVRecipe didn't be generated with tail folding.
> 
>     int i;
>     int foo (int q, int z)
>     {
>       int e = 0;
>       while (z < 1)
>         {
>           q = z * 2;
>           if (q != 0)
>             for (i = 0; i < 2; ++i)
>               e = 5;
>           ++z;
>         }
>       return e;
>     }
> 
> `for (i = 0; i < 2; ++i)` been simplifed as `store i32 2, ptr @i`.
> Both pointer and store value are loop-invariant, so the mask(VPWidenCanonicalIVRecipe) might not be generated.
> Should we suppress the replacement when the mask is not available?
Fixed, added the test


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99750/new/

https://reviews.llvm.org/D99750



More information about the llvm-commits mailing list