[llvm] [LV]Set tailfolding styles before computing feasible max VF. (PR #91403)

Sat Jul 13 15:02:59 PDT 2024

================
@@ -4434,6 +4464,11 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
     InterleaveInfo.invalidateGroupsRequiringScalarEpilogue();
   }
 
+  // If we don't know the precise trip count, or if the trip count that we
+  // found modulo the vectorization factor is not zero, try to fold the tail
+  // by masking.
+  // FIXME: look for a smaller MaxVF that does divide TC rather than masking.
----------------
ayalz wrote:

This comment should continue to appear below, before the `if (foldTailByMasking())` part, which deals with actually folding the tail. The code here instead tries to avoid tail folding when the precise trip count is known to be a multiple of any VF we choose, possibly times UserIC (i.e., not necessarily a power of 2)?
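
To make the distinction concrete, the "no tail" condition amounts to roughly the following. This is a simplified sketch over plain integers: the helper name `tripCountLeavesNoTail` is hypothetical and is not part of LLVM, and the actual check in `computeMaxVF` is not reproduced here.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only (not an LLVM API).
// Returns true when a known trip count TC is an exact multiple of the
// vector step MaxVF * UserIC, so no tail iterations remain to be folded.
// TC == 0 stands for "trip count unknown"; UserIC == 0 means "no
// user-specified interleave count" and is treated as 1.
bool tripCountLeavesNoTail(uint64_t TC, uint64_t MaxVF, uint64_t UserIC) {
  if (TC == 0 || MaxVF == 0)
    return false; // Unknown trip count or invalid VF: assume a tail may exist.
  uint64_t Step = MaxVF * (UserIC ? UserIC : 1);
  return TC % Step == 0;
}
```

Note that the step need not be a power of 2 once UserIC is multiplied in, e.g. VF = 8 with UserIC = 3 gives a step of 24.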

Some other comment is needed here to explain why tail folding style is being set here (before being sure there is a tail, possibly to be reset below when we're sure there isn't), before calling `computeFeasibleMaxVF(MaxTC, UserVF, /* FoldTail */ true)`, rather than below, before the first time it is checked explicitly via `foldTailByMasking()`.

Perhaps that last boolean parameter of `computeFeasibleMaxVF()` is insufficient or redundant?

https://github.com/llvm/llvm-project/pull/91403