[PATCH] D28368: Give higher full-unroll boosting when the loop iteration is small.

Thu Jan 5 10:55:42 PST 2017

danielcdh created this revision.
danielcdh added reviewers: mzolotukhin, davidxl, mkuper.
danielcdh added a subscriber: llvm-commits.

The default threshold for full unrolling is too conservative, even with the linear boosting factor. This patch raises the full-unroll threshold further to favor loops with a small iteration count.
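
To illustrate how the extra boost shrinks as the trip count grows, here is a minimal sketch of only the new term from the patch, 400 / TripCount, evaluated with unsigned integer division. The helper name extraBoostPercent is hypothetical and does not appear in LoopUnrollPass.cpp.

```cpp
#include <cassert>

// Hypothetical helper (not part of the patch): the additional boost, in
// percent, contributed by the new "400 / TripCount" term. Unsigned integer
// division truncates, so the term drops to 0 once TripCount exceeds 400.
static unsigned extraBoostPercent(unsigned TripCount) {
  assert(TripCount != 0 && "TripCount should not be 0.");
  return 400 / TripCount;
}
```

Under this sketch a trip count of 1 adds 400 percentage points, 4 adds 100, 16 adds 25, and 401 adds nothing. The sum is still capped by MaxPercentThresholdBoost in the patched function.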

This change affects the following SPEC CPU2006 benchmarks:

Code size:
447.dealII  0.50%
453.povray  0.42%
433.milc    0.20%
445.gobmk   0.32%
403.gcc     0.05%
464.h264ref 3.62%

Performance (on Intel Sandy Bridge):
447.dealII  +0.07%
453.povray  +1.79%
433.milc    +1.02%
445.gobmk   +0.56%
403.gcc     -0.16%
464.h264ref -0.41%

It also has a positive impact on several Google-internal benchmarks.


https://reviews.llvm.org/D28368

Files:
  lib/Transforms/Scalar/LoopUnrollPass.cpp
  test/Transforms/LoopUnroll/full-unroll-heuristics.ll


Index: test/Transforms/LoopUnroll/full-unroll-heuristics.ll
===================================================================

--- test/Transforms/LoopUnroll/full-unroll-heuristics.ll
+++ test/Transforms/LoopUnroll/full-unroll-heuristics.ll
@@ -18,7 +18,7 @@
 ; and unrolled size is 65.
 
 ; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-threshold=10 -unroll-max-percent-threshold-boost=100 | FileCheck %s -check-prefix=TEST1
-; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-threshold=20 -unroll-max-percent-threshold-boost=200 | FileCheck %s -check-prefix=TEST2
+; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-threshold=12 -unroll-max-percent-threshold-boost=400 | FileCheck %s -check-prefix=TEST2
 ; RUN: opt < %s -S -loop-unroll -unroll-max-iteration-count-to-analyze=1000 -unroll-threshold=20 -unroll-max-percent-threshold-boost=100 | FileCheck %s -check-prefix=TEST3
 
 ; If the absolute threshold is too low, we should not unroll:
Index: lib/Transforms/Scalar/LoopUnrollPass.cpp
===================================================================
--- lib/Transforms/Scalar/LoopUnrollPass.cpp
+++ lib/Transforms/Scalar/LoopUnrollPass.cpp
@@ -661,13 +661,20 @@
 // be beneficial to fully unroll the loop even if unrolledcost is large. We
 // use (RolledDynamicCost / UnrolledCost) to model the unroll benefits to adjust
 // the unroll threshold.
+// Another side benefit of full unrolling is that it removes all branches and
+// puts all instructions into a single basic block, which widens the
+// optimization window. To model this, we use a larger threshold when the loop
+// trip count is small, because the relative code size increase (compared with
+// the original loop) is small when the trip count is small.
 static unsigned getFullUnrollBoostingFactor(const EstimatedUnrollCost &Cost,
+                                            unsigned TripCount,
                                             unsigned MaxPercentThresholdBoost) {
+  assert(TripCount != 0 && "TripCount should not be 0.");
   if (Cost.RolledDynamicCost >= UINT_MAX / 100)
     return 100;
   else if (Cost.UnrolledCost != 0)
     // The boosting factor is RolledDynamicCost / UnrolledCost
-    return std::min(100 * Cost.RolledDynamicCost / Cost.UnrolledCost,
+    return std::min(100 * Cost.RolledDynamicCost / Cost.UnrolledCost + 400 / TripCount,
                     MaxPercentThresholdBoost);
   else
     return MaxPercentThresholdBoost;
@@ -759,7 +766,7 @@
               L, FullUnrollTripCount, DT, *SE, TTI,
               UP.Threshold * UP.MaxPercentThresholdBoost / 100)) {
         unsigned Boost =
-            getFullUnrollBoostingFactor(*Cost, UP.MaxPercentThresholdBoost);
+            getFullUnrollBoostingFactor(*Cost, UP.Count, UP.MaxPercentThresholdBoost);
         if (Cost->UnrolledCost < UP.Threshold * Boost / 100) {
           UseUpperBound = (MaxTripCount == FullUnrollTripCount);
           TripCount = FullUnrollTripCount;
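
For readers who want to experiment with the heuristic outside of LLVM, the following is a standalone sketch of the patched boosting factor and the threshold comparison that consumes it. The struct and function names ending in "Sketch" are assumptions made for illustration; only the arithmetic is taken from the diff above. The earlier analyzeLoopUnrollCost filter is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Illustrative stand-in for the pass's EstimatedUnrollCost, reduced to the
// two fields the boosting factor reads.
struct EstimatedUnrollCostSketch {
  unsigned UnrolledCost;
  unsigned RolledDynamicCost;
};

// Mirrors the patched getFullUnrollBoostingFactor: the cost ratio in percent
// plus 400 / TripCount, capped at MaxPercentThresholdBoost.
static unsigned boostingFactorSketch(const EstimatedUnrollCostSketch &Cost,
                                     unsigned TripCount,
                                     unsigned MaxPercentThresholdBoost) {
  assert(TripCount != 0 && "TripCount should not be 0.");
  // Guard against overflow in 100 * RolledDynamicCost.
  if (Cost.RolledDynamicCost >= UINT_MAX / 100)
    return 100;
  if (Cost.UnrolledCost != 0)
    return std::min(100 * Cost.RolledDynamicCost / Cost.UnrolledCost +
                        400 / TripCount,
                    MaxPercentThresholdBoost);
  return MaxPercentThresholdBoost;
}

// Mirrors the comparison in the second hunk:
// UnrolledCost < Threshold * Boost / 100.
static bool wouldFullyUnrollSketch(const EstimatedUnrollCostSketch &Cost,
                                   unsigned TripCount, unsigned Threshold,
                                   unsigned MaxPercentThresholdBoost) {
  unsigned Boost =
      boostingFactorSketch(Cost, TripCount, MaxPercentThresholdBoost);
  return Cost.UnrolledCost < Threshold * Boost / 100;
}
```

With made-up costs UnrolledCost = 60 and RolledDynamicCost = 90, a threshold of 30, and a maximum boost of 400, a trip count of 4 gives a boost of 150 + 100 = 250 and an effective threshold of 75, so the loop would be unrolled. A trip count of 100 gives 150 + 4 = 154 and an effective threshold of 46, so it would not.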

