[llvm] [LV]Initial support for safe distance in predicated DataWithEVL vectorization mode. (PR #102897)

Mon Sep 30 12:20:29 PDT 2024

================
@@ -4071,15 +4093,25 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
     InterleaveInfo.invalidateGroupsRequiringScalarEpilogue();
   }
 
-  FixedScalableVFPair MaxFactors = computeFeasibleMaxVF(MaxTC, UserVF, true);
+  // If we don't know the precise trip count, or if the trip count that we
+  // found modulo the vectorization factor is not zero, try to fold the tail
+  // by masking.
+  // FIXME: look for a smaller MaxVF that does divide TC rather than masking.
+  setTailFoldingStyles(UserIC);
+  FixedScalableVFPair MaxFactors =
+      computeFeasibleMaxVF(MaxTC, UserVF, foldTailByMasking());
----------------
alexey-bataev wrote:

Downstream, our implementation differs a bit from upstream, and it is not the best one. I'm trying to improve it here.

> How many VPlans should be built in case of EVL, for what range of (scalable) VF's - ending with MaxVF?

Yes, bit_ceil(MaxVF).
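
As a rough illustration of that range (not the code in this PR): assuming candidate VFs are powers of two, a non-power-of-2 MaxVF such as 6 rounds up to 8, so the range would cover 1, 2, 4 and 8. The helpers `bitCeil` and `numCandidateVFs` below are hypothetical stand-ins written for this sketch; LLVM's own helper is `llvm::bit_ceil`.

```cpp
#include <cstdint>

// Hypothetical standalone stand-in for llvm::bit_ceil: the smallest power
// of two that is >= Value (returns 1 for Value == 0). Overflow for very
// large inputs is ignored in this sketch.
constexpr uint64_t bitCeil(uint64_t Value) {
  uint64_t Result = 1;
  while (Result < Value)
    Result <<= 1;
  return Result;
}

// Hypothetical helper: number of power-of-2 candidate VFs in the range
// [1, bitCeil(MaxVF)], i.e. how many VFs such a range would cover.
constexpr unsigned numCandidateVFs(uint64_t MaxVF) {
  unsigned Count = 0;
  for (uint64_t VF = 1; VF <= bitCeil(MaxVF); VF <<= 1)
    ++Count;
  return Count;
}
```

With these definitions, `bitCeil(6)` is 8 and `numCandidateVFs(6)` is 4, the same as for a MaxVF of exactly 8.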

> Suffice to consider a single VPlan for a single VF - the one corresponding to vector length computed dynamically by providing the original trip count and max safe distance - regardless of any MaxVF, both fixed and scalable?

Currently no, I think; it may affect the cost estimation.

> MaxVF and tail folding (yes/no/style): computeMaxVF() uses computeFeasibleMaxVF() which in turn uses getMaximizedVFForTarget() - the latter dependent on whether tail is folded or not - to limit MaxVF by the original trip count or not, and ends up being responsible for setting the tail style. Speculating a folded tail should produce greater (or equal) MaxVF's.

We need to define the tail folding style beforehand to avoid speculating here. It is also required for non-power-of-2 distance support.
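
To make the dependency concrete, here is a deliberately simplified model of how the fold-tail flag can change the resulting MaxVF. It is a sketch under stated assumptions, not the upstream logic in getMaximizedVFForTarget(): `modelMaxVF`, `bitFloor`, and the parameter `WidestVF` are hypothetical names, and the model assumes the only effect of a non-folded tail is to clamp MaxVF to the largest power of two not exceeding a known trip count.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::bit_floor: the largest power of two
// that is <= Value (returns 0 for Value == 0).
constexpr uint64_t bitFloor(uint64_t Value) {
  if (Value == 0)
    return 0;
  uint64_t Result = 1;
  while (Result <= Value / 2)
    Result <<= 1;
  return Result;
}

// Simplified model (assumption of this sketch): WidestVF is the VF the
// target registers would allow, MaxTripCount == 0 means "unknown".
constexpr uint64_t modelMaxVF(uint64_t WidestVF, uint64_t MaxTripCount,
                              bool FoldTailByMasking) {
  // Without tail folding, a VF larger than the trip count is never
  // useful, so clamp to a power of two that fits in the trip count.
  if (MaxTripCount != 0 && MaxTripCount < WidestVF && !FoldTailByMasking)
    return bitFloor(MaxTripCount);
  // With a folded tail (or an unknown / large trip count), keep WidestVF.
  return WidestVF;
}
```

In this model, a trip count of 6 with `WidestVF` 16 yields 4 without tail folding and 16 with it, which is why the flag has to be settled before computeFeasibleMaxVF() is called rather than guessed.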

https://github.com/llvm/llvm-project/pull/102897