[PATCH] D25963: [LoopUnroll] Implement profile-based loop peeling

Adam Nemet via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 15 14:28:38 PST 2016


anemet added a comment.

> The basic idea is that when the average dynamic trip-count of a loop is known, based on PGO, to be low, we can expect a performance win by peeling off the first several iterations of that loop. Unlike unrolling based on a known trip count, or a trip count multiple, this doesn't save us the conditional check and branch on each iteration. However, it does allow us to simplify the straight-line code we get (constant-folding, etc.), which is important given that we know that we will usually only hit this code, and not the actual loop.

The resulting code should also be more branch-predictor friendly; for small-trip count loops the loop-exiting misprediction can be significant.



================
Comment at: include/llvm/Transforms/Utils/UnrollLoop.h:19
 
+// Needed because we can't forward-declared the nested struct
+// TargetTransformInfo::UnrollingPreferences
----------------
forward-declare


================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:1181-1183
+  bool Changed = tryToUnrollLoop(&L, *DT, LI, SE, *TTI, *AC, *ORE,
+                                 /*PreserveLCSSA*/ true, ProvidedCount,
+                                 ProvidedThreshold, ProvidedAllowPartial,
----------------
If this is a formatting-only change please remove it from this patch.


================
Comment at: lib/Transforms/Utils/LoopUnroll.cpp:356
+                                 L->getHeader())
+              << " peeling loop %" << Header->getName() << " by "
+              << NV("PeelCount", PeelCount) << " iterations.\n");
----------------
We don't want to print internal IR names like this.  The remark will already point at the source location of the loop.


================
Comment at: lib/Transforms/Utils/LoopUnrollPeel.cpp:63
+unsigned llvm::getPeelCount(Loop *L, unsigned LoopSize,
+                            TargetTransformInfo::UnrollingPreferences *UP) {
+  if (!canPeel(L))
----------------
mkuper wrote:
> davidxl wrote:
> > mkuper wrote:
> > > davidxl wrote:
> > > > Is UP parameter intended to be used?
> > > Sorry, I don't quite understand what you mean here. It is used (in line 84 or below). Could you explain?
> > Sorry I misread the early return as the regular return. Anyway, if UP is used here, there seems redundant to set UP.PeelCount in the caller of this function.
> I think it'd be better for it to be set explicitly in the caller, rather than hidden away in here. (This currently also sets it, but that was unintentional, thanks for noticing).
> I'd prefer to pass UP here by const reference, and set UP.PeelCount in the caller.
I agree with David that this interface is confusing.  This function either knows about UP or doesn't.  If it does it should handle both read-and-write.


================
Comment at: lib/Transforms/Utils/LoopUnrollPeel.cpp:12
+// with dynamically inferred (from PGO) trip counts. See LoopUnroll.cpp for
+// unrolling loops with compile-time trip counts.
+//
----------------
compile-time constant trip count


================
Comment at: lib/Transforms/Utils/LoopUnrollPeel.cpp:107-118
+// Clones the body of the loop L, putting it between InsertTop and InsertBot.
+// IterNumber is the serial number of the iteration currently being peeled off.
+// AvgIters is the average number of iterations we expect this loop to have.
+// PreHeader and Exit are the preheader and exit block of the original loop.
+// PeeledHeaderWeight is the number of *dynamic* loop entries still
+// unaccounted for - that is, it is the number of times we expect to enter
+// the header of the currently-peeled loop iteration.
----------------
All these function comments should be doxygen comments.


================
Comment at: lib/Transforms/Utils/LoopUnrollPeel.cpp:111
+// PreHeader and Exit are the preheader and exit block of the original loop.
+// PeeledHeaderWeight is the number of *dynamic* loop entries still
+// unaccounted for - that is, it is the number of times we expect to enter
----------------
"loop entries" is a bit confusing here; we're not entering the loop.


================
Comment at: lib/Transforms/Utils/LoopUnrollPeel.cpp:233-260
+  // Set up all the necessary basic blocks. It is convenient to split the
+  // preheader into 3 parts - two blocks to anchor the peeled copy of the loop
+  // body, and a new preheader for the "real" loop.
+
+  // Peeling the first iteration transforms.
+  //
+  // PreHeader:
----------------
I am not sure I don't understand this comment.  Perhaps having the pseudo code when one more iteration is peeled may clear things up.  It's unclear to me in this last paragraph for example how the conditional branch is generated in the split bottom anchor block.


================
Comment at: lib/Transforms/Utils/LoopUtils.cpp:1084-1093
+  // The branch weights give us almost what we want, since they were adjusted
+  // from the raw counts to provide a better probability estimate. Remove
+  // the adjustment by subtracting 1 from both weights.
+  uint64_t TrueVal, FalseVal;
+  if (!LatchBR->extractProfMetadata(TrueVal, FalseVal) || (TrueVal <= 1) ||
+      (FalseVal <= 1))
+    return 0;
----------------
Do we have a precedence adjusting weight like this back?   Hopefully the underlying issue will be fixed soon so we shouldn't add band-aids like this.


================
Comment at: test/Transforms/LoopUnroll/peel-loop.ll:1
+; RUN: opt < %s -S -loop-unroll -unroll-force-peel-count=3 -simplifycfg -instcombine | FileCheck %s
+
----------------
It would be good to explain what each of these tests.


https://reviews.llvm.org/D25963





More information about the llvm-commits mailing list