[PATCH] D32451: Improve profile-guided heuristics to use estimated trip count.

Ayal Zaks via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 5 16:45:07 PDT 2017


Ayal added a comment.

Thanks Taewook for sharing the experimental results. What target was this run on?



================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:7715
+  // function entry baseline frequency. Note that we always have a canonical
+  // loop here because we think we *can* vectorize.
   // FIXME: This is hidden behind a flag due to pervasive problems with
----------------
The original comment should in any case be updated to indicate that it now affects the decision to optimize cold loops for size.


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:7722
+    auto IsColdLoop = EstimatedTC ?
+      (*EstimatedTC < TinyTripCountVectorThreshold) :
+      (LoopEntryFreq < ColdEntryFreq);
----------------
If a loop is known to iterate fewer than TinyTripCountVectorThreshold times, we avoid vectorizing it altogether, rather than vectorizing it under code-size constraints, unless vectorization is explicitly forced.

So should this
```
const unsigned MaxTC = SE->getSmallConstantMaxTripCount(L);
if (MaxTC > 0u && MaxTC < TinyTripCountVectorThreshold) {
  ...
}
```
be extended to use profiling where static analysis fails, e.g., by inserting the following between the top two lines above (MaxTC would then need to lose its `const`, since it is reassigned):
```
if (MaxTC == 0 && LoopVectorizeWithBlockFrequency) {
  auto EstimatedTC = getLoopEstimatedTripCount(L);
  if (EstimatedTC)
    MaxTC = *EstimatedTC;
}
```
?
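
To make the proposed ordering concrete, here is a minimal standalone sketch of the decision, not actual LLVM code. The function names, the use of std::optional (C++17) and the threshold value of 16 are assumptions made for illustration; the real code would go through SE->getSmallConstantMaxTripCount(L) and getLoopEstimatedTripCount(L).

```cpp
#include <optional>

// Hypothetical stand-in for TinyTripCountVectorThreshold; the value 16 is
// an assumption used only for this illustration.
constexpr unsigned TinyTripCountThreshold = 16;

// StaticMaxTC == 0 means static analysis could not bound the trip count,
// mirroring the "MaxTC > 0u" guard in the snippet above.
// ProfileTC is the profile-based estimate, empty when no profile exists.
// UseProfile models the LoopVectorizeWithBlockFrequency flag.
unsigned effectiveMaxTripCount(unsigned StaticMaxTC,
                               std::optional<unsigned> ProfileTC,
                               bool UseProfile) {
  // Consult the profile only where static analysis fails.
  if (StaticMaxTC == 0 && UseProfile && ProfileTC)
    return *ProfileTC;
  return StaticMaxTC;
}

// True when the loop should not be vectorized at all (unless forced).
bool isTinyTripCount(unsigned StaticMaxTC,
                     std::optional<unsigned> ProfileTC,
                     bool UseProfile) {
  const unsigned MaxTC =
      effectiveMaxTripCount(StaticMaxTC, ProfileTC, UseProfile);
  return MaxTC > 0u && MaxTC < TinyTripCountThreshold;
}
```

In this sketch a statically known bound always takes precedence, so the profile estimate can only turn an "unknown" answer into a "tiny" one; it never overrides what static analysis proved.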

OTOH, setting OptForSize to true when the trip count is unknown effectively prevents vectorization, because an epilog is needed.
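
The epilog constraint can be illustrated with a simplified model, again a sketch rather than the vectorizer's actual logic. The helper names are hypothetical, and the assumption is that under OptForSize no scalar remainder loop may be emitted, so the exact trip count must be known and divisible by the vectorization factor VF.

```cpp
#include <optional>

// Number of iterations left for a scalar epilog after running whole
// vector iterations of width VF. Precondition: VF > 0.
unsigned scalarRemainder(unsigned TC, unsigned VF) { return TC % VF; }

// Simplified model: when optimizing for size no epilog is allowed, so
// vectorization is possible only if the exact trip count is known and
// leaves no remainder. An unknown trip count (empty optional) fails,
// because a remainder cannot be ruled out.
bool canVectorizeWithoutEpilog(std::optional<unsigned> ExactTC, unsigned VF) {
  if (!ExactTC || VF == 0)
    return false;
  return scalarRemainder(*ExactTC, VF) == 0;
}
```

For example, a trip count of 18 with VF = 4 leaves 2 iterations for an epilog, so this model rejects it, and an unknown trip count is rejected for every VF.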


================
Comment at: test/Transforms/LoopVectorize/tripcount.ll:2
+; This test verifies that the loop vectorizer will not vectorizes low trip count
+; loops that require runtime checks (Trip count is computed with profile info).
+; REQUIRES: asserts
----------------
(As argued above, we expect the loop not to be vectorized at all, rather than optimized for size.)


================
Comment at: test/Transforms/LoopVectorize/tripcount.ll:15
+; CHECK-NOT: <2 x i8>
+; CHECK-NOT: <4 x i8>
+
----------------
Better to check instead that no vector types are generated at all, regardless of their width.


https://reviews.llvm.org/D32451




