[PATCH] D99750: [LV, VP]VP intrinsics support for the Loop Vectorizer

Shiva Chen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 8 00:16:49 PST 2023


shiva0217 added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1690
+    return PreferVPWithVPEVLIntrinsics && !EnableVPlanNativePath &&
+           foldTailByMasking() && Legal->isSafeForAnyVectorWidth() &&
+           // FIXME: remove this once vp_reverse is supported.
----------------
If the loop contains reduction variables, there might need a mask to merge the last two iteration results. 

    int a[128];
    int foo (int end) {
      int size = 0;
      for (int i = 0; i < end; i++)
        size += a[i];
      return size;
    }

Should the case be guarded by `Legal->getReductionVars().empty() &&` ?


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp:1743
+  PHINode *EntryPart =
+      State.Builder.CreatePHI(Start->getType(), 2, "evl.based.iv");
+  EntryPart->addIncoming(Start, VectorPH);
----------------
There is a case that the PHI didnt' been inserted at top of basic block.

    int foo (int value, int *buf, int *end) {
      int *tmp;
      for (tmp = buf; tmp < end; tmp++)
        value -= *tmp;
      return value;
    }

Should we specify insertion point?
Something like:

    PHINode *EntryPart = PHINode::Create(
        Start->getType(), 2, "evl.based.iv", &*State.CFG.PrevBB->getFirstInsertionPt());


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:986
+  auto *WideCanonicalIV =
+      cast<VPWidenCanonicalIVRecipe>(*FoundWidenCanonicalIVUser);
+  // Walk users of WideCanonicalIV and replace all compares of the form
----------------
There is a case that VPWidenCanonicalIVRecipe didn't be generated with tail folding.

    int i;
    int foo (int q, int z)
    {
      int e = 0;
      while (z < 1)
        {
          q = z * 2;
          if (q != 0)
            for (i = 0; i < 2; ++i)
              e = 5;
          ++z;
        }
      return e;
    }

`for (i = 0; i < 2; ++i)` been simplifed as `store i32 2, ptr @i`.
Both pointer and store value are loop-invariant, so the mask(VPWidenCanonicalIVRecipe) might not be generated.
Should we suppress the replacement when the mask is not available?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99750/new/

https://reviews.llvm.org/D99750



More information about the llvm-commits mailing list