[PATCH] D25963: [LoopUnroll] Implement profile-based loop peeling
Michael Kuperstein via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 25 13:30:14 PDT 2016
mkuper created this revision.
mkuper added reviewers: mzolotukhin, haicheng, davidxl, danielcdh.
mkuper added a subscriber: llvm-commits.
Herald added subscribers: modocache, mgorny, beanz, sanjoy.
This implements profile-based loop peeling.
The basic idea is that when the average dynamic trip-count of a loop is known, based on PGO, to be low, we can expect a performance win by peeling off the first several iterations of that loop. Unlike unrolling based on a known trip count, or a trip count multiple, this doesn't save us the conditional check and branch on each iteration. However, it does allow us to simplify the straight-line code we get (constant-folding, etc.), which is important given that we know that we will usually only hit this code, and not the actual loop.
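To make the transformation concrete, here is a hand-written source-level sketch of what peeling two iterations does. It is not taken from the patch: the function names (sumOriginal, sumPeeled, checkEquivalent) and the peel count of two are made up for illustration, and the real pass works on LLVM IR rather than on C++ source. Note that the exit checks remain in the peeled copies, but the per-iteration test on `first` constant-folds away.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original loop (hypothetical example): `first` is true only on iteration 0,
// yet the select on it is evaluated on every iteration.
int sumOriginal(const std::vector<int> &v, int bias) {
  int sum = 0;
  bool first = true;
  for (std::size_t i = 0; i < v.size(); ++i) {
    sum += first ? v[i] + bias : v[i];
    first = false;
  }
  return sum;
}

// The same loop with the first two iterations peeled by hand. Each peeled
// copy keeps its exit check (peeling does not remove the compare and branch),
// but `first` and the initial `sum` are now known constants, so the
// straight-line code simplifies.
int sumPeeled(const std::vector<int> &v, int bias) {
  const std::size_t n = v.size();

  // Peeled iteration 0: first == true and sum == 0 are known here.
  if (n == 0)
    return 0;
  int sum = v[0] + bias;

  // Peeled iteration 1: first == false is known here.
  if (n == 1)
    return sum;
  sum += v[1];

  // Remaining loop: only reached when the trip count exceeds the peel count.
  for (std::size_t i = 2; i < n; ++i)
    sum += v[i];
  return sum;
}

// Sanity helper: peeling must not change the observable result.
void checkEquivalent(const std::vector<int> &v, int bias) {
  assert(sumOriginal(v, bias) == sumPeeled(v, bias));
}
```

If the profile says the trip count is usually one or two, most executions never reach the residual loop and only run the simplified straight-line code.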
The code is somewhat similar to (and is based on the original version of) the runtime unrolling code, but I think they're sufficiently different that trying to share the implementation isn't a good idea. Since the current runtime unrolling implementation already has two different prolog/epilog cases, making it do peeling as well would make it rather unreadable.
I'm planning on committing this as disabled-by-default, until I have a bit more confidence in the performance - some more tuning may be required.