[llvm] [LV, VP]VP intrinsics support for the Loop Vectorizer + adding new tail-folding mode using EVL. (PR #76172)

Fri Feb 2 14:34:52 PST 2024

================
@@ -4733,6 +4755,36 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
   // FIXME: look for a smaller MaxVF that does divide TC rather than masking.
   if (Legal->prepareToFoldTailByMasking()) {
     CanFoldTailByMasking = true;
+    if (getTailFoldingStyle() == TailFoldingStyle::None)
+      return MaxFactors;
+
+    if (UserIC > 1) {
+      LLVM_DEBUG(dbgs() << "LV: Preference for VP intrinsics indicated. Will "
+                           "not generate VP intrinsics since interleave count "
+                           "specified is greater than 1.\n");
+      return MaxFactors;
+    }
+
+    // FIXME: use actual opcode/data type for analysis here.
+    PreferEVL = MaxFactors.ScalableVF.isScalable() &&
+                getTailFoldingStyle() == TailFoldingStyle::DataWithEVL &&
----------------
ayalz wrote:

Why/is PreferEVL needed? Can getTailFoldingStyle() take care of checking whatever is needed in order to determine if DataWithEVL (or any other way of folding the tail) is desired, rather than having in addition `PreferEVL` which checks DataWithEVL (among these other considerations), which checks PreferEVL?

https://github.com/llvm/llvm-project/pull/76172