[llvm] [LV]Set tailfolding styles before computing feasible max VF. (PR #91403)
via llvm-commits
llvm-commits at lists.llvm.org
Sat Jul 13 15:02:59 PDT 2024
================
@@ -4434,6 +4464,11 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
InterleaveInfo.invalidateGroupsRequiringScalarEpilogue();
}
+ // If we don't know the precise trip count, or if the trip count that we
+ // found modulo the vectorization factor is not zero, try to fold the tail
+ // by masking.
+ // FIXME: look for a smaller MaxVF that does divide TC rather than masking.
----------------
ayalz wrote:
This comment should continue to appear below, just before the `if (foldTailByMasking())` part, which deals with actually folding the tail, rather than here, where the code tries to avoid tail folding if the precise trip count is known to be a multiple of any VF we choose, possibly times UserIC? (i.e., not necessarily a power of 2)
Some other comment is needed here to explain why tail folding style is being set here (before being sure there is a tail, possibly to be reset below when we're sure there isn't), before calling `computeFeasibleMaxVF(MaxTC, UserVF, /* FoldTail */ true)`, rather than below, before the first time it is checked explicitly via `foldTailByMasking()`.
Perhaps that last boolean parameter of computeFeasibleMaxVF() is insufficient/redundant?
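For illustration only, the "is there a tail" check discussed above could be sketched as follows. This is a standalone sketch, not the code in the cost model: `hasTail` is a hypothetical helper, and it assumes a compile-time-known trip count and that a `UserIC` of 0 means no interleave count was requested.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not an LLVM API: returns true when a loop with a
// known trip count TC leaves a remainder after executing whole vector
// iterations of width MaxVF, optionally interleaved UserIC times.
bool hasTail(std::uint64_t TC, unsigned MaxVF, unsigned UserIC) {
  assert(MaxVF > 0 && "vectorization factor must be positive");
  // Assumption: UserIC == 0 means no user-specified interleave count.
  std::uint64_t Step =
      UserIC ? static_cast<std::uint64_t>(MaxVF) * UserIC : MaxVF;
  // A non-zero remainder means a tail exists and has to be handled,
  // e.g. by a scalar epilogue or by folding the tail with masking.
  return TC % Step != 0;
}
```

With `MaxVF = 8` and `UserIC = 3` the step is 24, which is not a power of 2: a trip count of 48 has no tail, while a trip count of 64 does, even though 64 is a multiple of 8.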
https://github.com/llvm/llvm-project/pull/91403