[llvm] r228434 - Use estimated number of optimized insns in unroll-threshold computation.
Michael Zolotukhin
mzolotukhin at apple.com
Mon Feb 9 14:50:24 PST 2015
Hi Eric,
We do indeed have a swarm of different options, and I’d be happy to simplify them. However, most of them (actually, all of them, I think) are there for a reason, so trying to remove them would probably be painful.
Let’s look at what we had there:
* threshold for unrolling in presence of pragma
* threshold for unrolling under -Os
* threshold for unrolling in other cases
* flag 'is partial unrolling allowed'
* flag 'is runtime unrolling allowed'
Now I've added two more options:
* ‘absolute’ threshold
* minimum percent of optimized instructions for complete unroll
I didn’t list the option 'unroll-max-iteration-count-to-analyze' here, because I don’t think anyone needs to tune it at all - it’s more for guarding the algorithm against doing overly expensive analysis.
I do like the idea of simplifying this, but it looks to me like we’ll lose some cases if we just remove one of these thresholds - they cover very different areas. E.g. we can’t properly derive a value for the OptSize threshold from the other thresholds. Similarly, it’s hard to get a value for the ‘absolute’ threshold from the ‘usual’ threshold - the latter deals more with tiny loops, while the former is for unrolling big loops, where we can gain a lot from subsequent constant folding. We could choose a ‘one-size-fits-all’ value for e.g. the percent and remove the corresponding field from TTI, but I doubt that’ll help much here (we’ll still have the parameter in our model, it’ll just become hidden).
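To make the interplay concrete, here is a rough standalone sketch of how the three values combine for complete unrolling. It is not the pass code: `effectiveThreshold` is a made-up name, the struct only mirrors the three relevant fields from the patch, and the OptSize and pragma adjustments are left out.

```cpp
#include <algorithm>

// Hypothetical stand-in for the three fields discussed above. The field
// names follow the patch; everything else here is for illustration only.
struct UnrollingPreferences {
  unsigned Threshold;             // 'usual' threshold, aimed at tiny loops
  unsigned MinPercentOfOptimized; // min % of instructions that must go away
  unsigned AbsoluteThreshold;     // hard cap that is never exceeded
};

// If unrolling is allowed when M% of instructions can be removed, and we
// expect to remove N instructions, the unrolled size may grow to N*100/M,
// but never past the absolute threshold.
constexpr unsigned effectiveThreshold(const UnrollingPreferences &UP,
                                      unsigned NumberOfOptimizedInstructions) {
  unsigned Threshold = UP.Threshold;
  if (UP.MinPercentOfOptimized) // 0 disables the percent-based extension
    Threshold = std::max<unsigned>(
        Threshold,
        NumberOfOptimizedInstructions * 100 / UP.MinPercentOfOptimized);
  return std::min<unsigned>(Threshold, UP.AbsoluteThreshold);
}
```

Assuming a ‘usual’ threshold of 150 (my assumption for this example) together with the patch defaults of 20 percent and 2000: a loop where nothing is expected to be optimized away keeps the threshold at 150; if 100 instructions are expected to go away, the threshold grows to 500; if 1000 are, the computed 5000 is clamped to 2000. Dropping any one of the three fields changes at least one of these outcomes.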
Having said that, I think the names I chose for the new ones are not that good and might be confusing (but I currently can’t come up with better ones). If you have a better idea for their names, or if you can suggest how we can simplify the overall scheme, I’d be happy to address it :)
Thanks,
Michael
> On Feb 9, 2015, at 1:02 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> Drive by review questions:
>
> 1) There are some uncomfortable tuning parameters here, can we figure out some better ideas using science?
> 2) There are also a huge number of tuning parameters for the pass and this is just adding more, what's up here?
>
> In other words the pass is starting to look like a sea of options + TTI hell. :)
>
> -eric
>
> On Mon Feb 09 2015 at 12:53:59 PM Michael Zolotukhin <mzolotukhin at apple.com> wrote:
> Hi Hal,
>
> Could you please take a look at the attached test? Does it cover the new features enough, or did I miss anything?
>
>
>
>
> Thanks,
> Michael
>
> > On Feb 6, 2015, at 12:31 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> >
> > As you stated in your follow-up e-mail ;) -- this needs a test case.
> >
> > -Hal
> >
> > ----- Original Message -----
> >> From: "Michael Zolotukhin" <mzolotukhin at apple.com>
> >> To: llvm-commits at cs.uiuc.edu
> >> Sent: Friday, February 6, 2015 2:20:40 PM
> >> Subject: [llvm] r228434 - Use estimated number of optimized insns in unroll-threshold computation.
> >>
> >> Author: mzolotukhin
> >> Date: Fri Feb 6 14:20:40 2015
> >> New Revision: 228434
> >>
> >> URL: http://llvm.org/viewvc/llvm-project?rev=228434&view=rev
> >> Log:
> >> Use estimated number of optimized insns in unroll-threshold
> >> computation.
> >>
> >> If complete-unroll could help us to optimize away N% of instructions,
> >> we might want to do this even if the final size would exceed loop-unroll
> >> threshold. However, we don't want to unroll huge loops, so we add
> >> AbsoluteThreshold to avoid that - this threshold will never be
> >> crossed, even if we expect to optimize 99% of instructions after that.
> >>
> >> Modified:
> >> llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
> >> llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp
> >>
> >> Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
> >> URL:
> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=228434&r1=228433&r2=228434&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original)
> >> +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Fri Feb 6
> >> 14:20:40 2015
> >> @@ -217,6 +217,13 @@ public:
> >> /// exceed this cost. Set this to UINT_MAX to disable the loop
> >> body cost
> >> /// restriction.
> >> unsigned Threshold;
> >> + /// If complete unrolling could help other optimizations (e.g.
> >> InstSimplify)
> >> + /// to remove N% of instructions, then we can go beyond unroll
> >> threshold.
> >> + /// This value set the minimal percent for allowing that.
> >> + unsigned MinPercentOfOptimized;
> >> + /// The absolute cost threshold. We won't go beyond this even if
> >> complete
> >> + /// unrolling could result in optimizing out 90% of
> >> instructions.
> >> + unsigned AbsoluteThreshold;
> >> /// The cost threshold for the unrolled loop when optimizing for
> >> size (set
> >> /// to UINT_MAX to disable).
> >> unsigned OptSizeThreshold;
> >>
> >> Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp
> >> URL:
> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp?rev=228434&r1=228433&r2=228434&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp (original)
> >> +++ llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp Fri Feb 6
> >> 14:20:40 2015
> >> @@ -45,6 +45,17 @@ static cl::opt<unsigned> UnrollMaxIterat
> >> cl::desc("Don't allow loop unrolling to simulate more than this
> >> number of"
> >> "iterations when checking full unroll profitability"));
> >>
> >> +static cl::opt<unsigned> UnrollMinPercentOfOptimized(
> >> + "unroll-percent-of-optimized-for-complete-unroll", cl::init(20),
> >> cl::Hidden,
> >> + cl::desc("If complete unrolling could trigger further
> >> optimizations, and, "
> >> + "by that, remove the given percent of instructions,
> >> perform the "
> >> + "complete unroll even if it's beyond the threshold"));
> >> +
> >> +static cl::opt<unsigned> UnrollAbsoluteThreshold(
> >> + "unroll-absolute-threshold", cl::init(2000), cl::Hidden,
> >> + cl::desc("Don't unroll if the unrolled size is bigger than this
> >> threshold,"
> >> + " even if we can remove big portion of instructions
> >> later."));
> >> +
> >> static cl::opt<unsigned>
> >> UnrollCount("unroll-count", cl::init(0), cl::Hidden,
> >> cl::desc("Use this unroll count for all loops including those with
> >> "
> >> @@ -70,11 +81,16 @@ namespace {
> >> static char ID; // Pass ID, replacement for typeid
> >> LoopUnroll(int T = -1, int C = -1, int P = -1, int R = -1) :
> >> LoopPass(ID) {
> >> CurrentThreshold = (T == -1) ? UnrollThreshold : unsigned(T);
> >> + CurrentAbsoluteThreshold = UnrollAbsoluteThreshold;
> >> + CurrentMinPercentOfOptimized = UnrollMinPercentOfOptimized;
> >> CurrentCount = (C == -1) ? UnrollCount : unsigned(C);
> >> CurrentAllowPartial = (P == -1) ? UnrollAllowPartial :
> >> (bool)P;
> >> CurrentRuntime = (R == -1) ? UnrollRuntime : (bool)R;
> >>
> >> UserThreshold = (T != -1) ||
> >> (UnrollThreshold.getNumOccurrences() > 0);
> >> + UserAbsoluteThreshold =
> >> (UnrollAbsoluteThreshold.getNumOccurrences() > 0);
> >> + UserPercentOfOptimized =
> >> + (UnrollMinPercentOfOptimized.getNumOccurrences() > 0);
> >> UserAllowPartial = (P != -1) ||
> >> (UnrollAllowPartial.getNumOccurrences() >
> >> 0);
> >> UserRuntime = (R != -1) || (UnrollRuntime.getNumOccurrences() > 0);
> >> @@ -98,10 +114,16 @@ namespace {
> >>
> >> unsigned CurrentCount;
> >> unsigned CurrentThreshold;
> >> + unsigned CurrentAbsoluteThreshold;
> >> + unsigned CurrentMinPercentOfOptimized;
> >> bool CurrentAllowPartial;
> >> bool CurrentRuntime;
> >> bool UserCount; // CurrentCount is
> >> user-specified.
> >> bool UserThreshold; // CurrentThreshold is
> >> user-specified.
> >> + bool UserAbsoluteThreshold; // CurrentAbsoluteThreshold is
> >> + // user-specified.
> >> + bool UserPercentOfOptimized; // CurrentMinPercentOfOptimized
> >> is
> >> + // user-specified.
> >> bool UserAllowPartial; // CurrentAllowPartial is
> >> user-specified.
> >> bool UserRuntime; // CurrentRuntime is
> >> user-specified.
> >>
> >> @@ -133,6 +155,8 @@ namespace {
> >> void getUnrollingPreferences(Loop *L, const TargetTransformInfo
> >> &TTI,
> >> TargetTransformInfo::UnrollingPreferences
> >> &UP) {
> >> UP.Threshold = CurrentThreshold;
> >> + UP.AbsoluteThreshold = CurrentAbsoluteThreshold;
> >> + UP.MinPercentOfOptimized = CurrentMinPercentOfOptimized;
> >> UP.OptSizeThreshold = OptSizeUnrollThreshold;
> >> UP.PartialThreshold = CurrentThreshold;
> >> UP.PartialOptSizeThreshold = OptSizeUnrollThreshold;
> >> @@ -160,13 +184,32 @@ namespace {
> >> void selectThresholds(const Loop *L, bool HasPragma,
> >> const
> >> TargetTransformInfo::UnrollingPreferences
> >> &UP,
> >> unsigned &Threshold, unsigned
> >> &PartialThreshold,
> >> - unsigned NumberOfSimplifiedInstructions) {
> >> + unsigned NumberOfOptimizedInstructions) {
> >> // Determine the current unrolling threshold. While this is
> >> // normally set from UnrollThreshold, it is overridden to a
> >> // smaller value if the current function is marked as
> >> // optimize-for-size, and the unroll threshold was not user
> >> // specified.
> >> Threshold = UserThreshold ? CurrentThreshold : UP.Threshold;
> >> +
> >> + // If we are allowed to completely unroll if we can remove M%
> >> of
> >> + // instructions, and we know that with complete unrolling
> >> we'll be able
> >> + // to kill N instructions, then we can afford to completely
> >> unroll loops
> >> + // with unrolled size up to N*100/M.
> >> + // Adjust the threshold according to that:
> >> + unsigned PercentOfOptimizedForCompleteUnroll =
> >> + UserPercentOfOptimized ? CurrentMinPercentOfOptimized
> >> + : UP.MinPercentOfOptimized;
> >> + unsigned AbsoluteThreshold = UserAbsoluteThreshold
> >> + ? CurrentAbsoluteThreshold
> >> + : UP.AbsoluteThreshold;
> >> + if (PercentOfOptimizedForCompleteUnroll)
> >> + Threshold = std::max<unsigned>(Threshold,
> >> + NumberOfOptimizedInstructions
> >> * 100 /
> >> +
> >> PercentOfOptimizedForCompleteUnroll);
> >> + // But don't allow unrolling loops bigger than absolute
> >> threshold.
> >> + Threshold = std::min<unsigned>(Threshold, AbsoluteThreshold);
> >> +
> >> PartialThreshold = UserThreshold ? CurrentThreshold :
> >> UP.PartialThreshold;
> >> if (!UserThreshold &&
> >> L->getHeader()->getParent()->getAttributes().
> >> @@ -186,7 +229,6 @@ namespace {
> >> PartialThreshold =
> >> std::max<unsigned>(PartialThreshold,
> >> PragmaUnrollThreshold);
> >> }
> >> - Threshold += NumberOfSimplifiedInstructions;
> >> }
> >> };
> >> }
> >>
> >>
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
>