[PATCH] D32451: Improve profile-guided heuristics to use estimated trip count.
Ayal Zaks via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri May 5 16:45:07 PDT 2017
Ayal added a comment.
Thanks Taewook for sharing the experimental results. What target was this run on?
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:7715
+ // function entry baseline frequency. Note that we always have a canonical
+ // loop here because we think we *can* vectorize.
// FIXME: This is hidden behind a flag due to pervasive problems with
----------------
In any case, the original comment should be updated to indicate that this check affects the decision to optimize cold loops for size.
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:7722
+ auto IsColdLoop = EstimatedTC ?
+ (*EstimatedTC < TinyTripCountVectorThreshold) :
+ (LoopEntryFreq < ColdEntryFreq);
----------------
If a loop is known to iterate fewer than TinyTripCountVectorThreshold times, we avoid vectorizing it altogether, rather than vectorizing it under code-size constraints, unless vectorization is explicitly forced.
So should this
```
const unsigned MaxTC = SE->getSmallConstantMaxTripCount(L);
if (MaxTC > 0u && MaxTC < TinyTripCountVectorThreshold) {
...
}
```
be extended to use profiling where static analysis fails, e.g., by inserting the following between the top two lines above (with the const dropped from MaxTC so that it can be reassigned):
```
if (MaxTC == 0 && LoopVectorizeWithBlockFrequency) {
  auto EstimatedTC = getLoopEstimatedTripCount(L);
  if (EstimatedTC)
    MaxTC = *EstimatedTC;
}
```
?
OTOH, setting OptForSize to true when the trip count is unknown effectively prevents vectorization, because a scalar epilog (tail) loop would be needed and one is not allowed when optimizing for size.
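To make the suggestion concrete, here is a minimal standalone sketch of the proposed decision, written outside of LLVM so it can be compiled on its own. The names kTinyTripCountThreshold, effectiveMaxTripCount and decide are hypothetical stand-ins rather than LLVM APIs, the threshold value is only illustrative, and the sketch assumes that a static max trip count of 0 means "unknown", as the MaxTC > 0u check above implies.
```
#include <optional>

// Hypothetical stand-in for TinyTripCountVectorThreshold; value is illustrative.
constexpr unsigned kTinyTripCountThreshold = 16;

enum class Decision { Vectorize, SkipTinyTripCount };

// StaticMaxTC: result of static analysis, assumed to be 0 when unknown.
// ProfileTC:   profile-based estimate, empty when no profile data exists.
// UseProfile:  models the LoopVectorizeWithBlockFrequency flag.
inline unsigned effectiveMaxTripCount(unsigned StaticMaxTC,
                                      std::optional<unsigned> ProfileTC,
                                      bool UseProfile) {
  // Fall back to the profile estimate only where static analysis fails.
  if (StaticMaxTC == 0 && UseProfile && ProfileTC)
    return *ProfileTC;
  return StaticMaxTC;
}

inline Decision decide(unsigned StaticMaxTC,
                       std::optional<unsigned> ProfileTC, bool UseProfile) {
  const unsigned MaxTC =
      effectiveMaxTripCount(StaticMaxTC, ProfileTC, UseProfile);
  // A known tiny trip count means "do not vectorize at all", not
  // "vectorize while optimizing for size".
  if (MaxTC > 0u && MaxTC < kTinyTripCountThreshold)
    return Decision::SkipTinyTripCount;
  return Decision::Vectorize;
}
```
In this sketch a statically known trip count always takes precedence, and the profile estimate is consulted only when the static result is unknown; a loop with neither source of information falls through to the normal vectorization path.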
================
Comment at: test/Transforms/LoopVectorize/tripcount.ll:2
+; This test verifies that the loop vectorizer will not vectorizes low trip count
+; loops that require runtime checks (Trip count is computed with profile info).
+; REQUIRES: asserts
----------------
(As argued above, we expect the loop not to be vectorized, rather than optimized for size.)
================
Comment at: test/Transforms/LoopVectorize/tripcount.ll:15
+; CHECK-NOT: <2 x i8>
+; CHECK-NOT: <4 x i8>
+
----------------
It would be better to check that no vector types are generated at all, regardless of their size.
https://reviews.llvm.org/D32451