[PATCH] D30243: [LoopUnrolling] Re-prioritize Peeling and Partial unrolling

Wed Feb 22 10:22:49 PST 2017

mkuper added inline comments.

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:789
+  computePeelCount(L, LoopSize, UP);
+  if (UP.PeelCount) {
+    UP.Runtime = false;
----------------
mkazantsev wrote:
> mkuper wrote:
> > The current heuristic is supposed to be that we only peel when we don't know the trip count, but have an approximate one due to profile information. Partial unrolling requires a known trip count. So in the old code, they don't actually compete for priority, and this change looks like it should be NFC.
> > 
> > I'm not sure this is completely NFC, though - I don't remember if computePeelCount actually enforces an unknown trip count, or just assumes it's unknown, because otherwise we'd quit at line 838. So you probably want this to be "if (!TripCount && UP.PeelCount)", or change computePeelCount() to bail on a known trip count. Otherwise, what you may in effect be doing is enabling peeling for loops with a constant low trip count that we chose not to unroll. Which is probably not something you want to do.
> In fact, even if trip count is known, peeling can still be profitable. One of such cases is shown at https://reviews.llvm.org/D30161, where Phi's back edge input is a constant. Partial unrolling here does not give any benefits, while peeling does. My understanding is following: Peeling should not set peel count unless it has proved that peeling IS profitable. In this case nothing bad happens with small loops that we don't unroll: they will only be peeled if it is good. Maybe it is reasonable to add the following restriction to peeling: TripCount should be either unknown or above some threshold (for example, partial unrolling threshold). Does it make sense for you?
Sorry, I wasn't clear.
I understand that there are cases where peeling is profitable even when the trip count is known - D30161 is a really nice example.

But right now, when we have profile information, we try to peel by the estimated number of iterations. That specific kind of peeling, combined with this patch,  has two issues:

1) getLoopEstimatedTripCount() returns the estimated (based on the profile) trip count. This may be imprecise, and at this point, somewhat different from the actual trip count, especially for sampling-based FDO.  So you may end up peeling by the "estimated" trip-count even though you know the actual tripcount. That sounds wrong.

2) Even if we fix getLoopEstimatedTripCount() to return the real trip count, when available, you still basically get "let's peel a loop by its real tripcount", which is equivalent to full unrolling. In theory, it shouldn't happen, because full unrolling and peeling should be using the same thresholds, so if we decided not to fully unroll, we'll decide not to "peel" either. But I'm not sure this actually holds.

I'm not saying we should disable any kind of peeling when the real trip count is known. Only that we should disable the "peel by the estimated trip count" heuristic.

https://reviews.llvm.org/D30243