[llvm] r228434 - Use estimated number of optimized insns in unroll-threshold computation.

Michael Zolotukhin mzolotukhin at apple.com
Mon Feb 9 14:50:24 PST 2015


Hi Eric,

We do indeed have a swarm of different options, and I’d be happy to simplify them. However, most of them (actually, all of them, I think) are there for a reason, so removing them would probably be painful.

Let’s look at what we had there:
* threshold for unrolling in presence of pragma
* threshold for unrolling under -Os
* threshold for unrolling in other cases
* flag ‘is partial unrolling allowed’
* flag ‘is runtime unrolling allowed’

Now I've added two more options:
* ‘absolute’ threshold
* minimal percent of instructions that must be optimized away to allow a complete unroll beyond the usual threshold
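
As a rough illustration of how these two new options interact with the usual threshold, here is a minimal C++ sketch. It is not the code from the patch: the function name `adjustedThreshold` and its parameter names are made up for this example, and the base threshold of 150 used in the note below is only an illustrative value. The 20 and 2000 are the `cl::init` defaults from the patch.

```cpp
#include <algorithm>

// Hypothetical helper (not part of LLVM): computes the complete-unroll
// threshold from the usual threshold, the estimated number of instructions
// that would be optimized away, the minimal percent, and the absolute cap.
unsigned adjustedThreshold(unsigned baseThreshold, unsigned optimizedInsns,
                           unsigned minPercent, unsigned absoluteThreshold) {
  unsigned threshold = baseThreshold;
  // If removing M% of instructions justifies a complete unroll and we expect
  // to remove N instructions, unrolled sizes up to N * 100 / M are acceptable.
  // A percent of 0 disables the adjustment (and avoids dividing by zero).
  if (minPercent != 0)
    threshold = std::max(threshold, optimizedInsns * 100 / minPercent);
  // The absolute threshold is never crossed, whatever the estimate says.
  return std::min(threshold, absoluteThreshold);
}
```

With a base threshold of 150, `adjustedThreshold(150, 100, 20, 2000)` gives 500, `adjustedThreshold(150, 1000, 20, 2000)` is capped at 2000, and a small estimate such as `adjustedThreshold(150, 10, 20, 2000)` leaves the usual 150 in place.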

I didn’t list the option 'unroll-max-iteration-count-to-analyze' here, because I don’t think anyone needs to tune it at all; it’s there to guard the algorithm against doing overly expensive analysis.

I do like the idea of simplifying this, but it looks to me like we’ll lose some cases if we just remove one of these thresholds, since they cover very different areas. E.g. we can’t properly derive a value for the OptSize threshold from the other thresholds. Similarly, it’s hard to get a value for the ‘absolute’ threshold from the ‘usual’ threshold: the latter deals more with tiny loops, while the former is for unrolling big loops, where we can gain a lot from the subsequent constant folding. We could choose a ‘one-size-fits-all’ value for, e.g., the percent and remove the corresponding field from TTI, but I doubt that would help much here (we’d still have the parameter in our model; it would just become hidden).
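
To make the point about hidden parameters a bit more concrete, here is a small C++ sketch of how a value for one of these knobs gets picked: an explicit command-line value wins, otherwise the target's preference is used. The types `Prefs` and `UserOverride` and the function `pickValue` are invented for illustration and only loosely mirror the `UserX ? CurrentX : UP.X` pattern in the patch.

```cpp
// Hypothetical stand-in for the per-target unrolling preferences.
struct Prefs {
  unsigned MinPercentOfOptimized;
  unsigned AbsoluteThreshold;
};

// Hypothetical record of a command-line option: its value and whether the
// user actually passed it.
struct UserOverride {
  bool isSet;
  unsigned value;
};

// A user-specified value takes precedence; otherwise fall back to the target.
unsigned pickValue(const UserOverride &flag, unsigned targetValue) {
  return flag.isSet ? flag.value : targetValue;
}
```

Dropping a field from `Prefs` would not remove the parameter; `targetValue` would simply turn into a constant baked into the pass.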

Having said that, I think the names I chose for the new ones are not that good and might be confusing (but I currently can’t come up with better ones). If you have a better idea for their names, or if you can suggest how we can simplify the overall scheme, I’d be happy to address it :)

Thanks,
Michael


> On Feb 9, 2015, at 1:02 PM, Eric Christopher <echristo at gmail.com> wrote:
> 
> Drive by review questions:
> 
> 1) There are some uncomfortable tuning parameters here, can we figure out some better ideas using science?
> 2) There are also a huge number of tuning parameters for the pass and this is just adding more, what's up here?
> 
> In other words the pass is starting to look like a sea of options + TTI hell. :)
> 
> -eric
> 
> On Mon Feb 09 2015 at 12:53:59 PM Michael Zolotukhin <mzolotukhin at apple.com> wrote:
> Hi Hal,
> 
> Could you please take a look at the attached test? Does it cover the new features enough, or did I miss anything?
> 
> 
> 
> 
> Thanks,
> Michael
> 
> > On Feb 6, 2015, at 12:31 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> >
> > As you stated in your follow-up e-mail ;) -- this needs a test case.
> >
> > -Hal
> >
> > ----- Original Message -----
> >> From: "Michael Zolotukhin" <mzolotukhin at apple.com>
> >> To: llvm-commits at cs.uiuc.edu
> >> Sent: Friday, February 6, 2015 2:20:40 PM
> >> Subject: [llvm] r228434 - Use estimated number of optimized insns in unroll-threshold computation.
> >>
> >> Author: mzolotukhin
> >> Date: Fri Feb  6 14:20:40 2015
> >> New Revision: 228434
> >>
> >> URL: http://llvm.org/viewvc/llvm-project?rev=228434&view=rev
> >> Log:
> >> Use estimated number of optimized insns in unroll-threshold
> >> computation.
> >>
> >> If complete-unroll could help us to optimize away N% of instructions, we
> >> might want to do this even if the final size would exceed loop-unroll
> >> threshold. However, we don't want to unroll huge loops, and we add
> >> AbsoluteThreshold to avoid that - this threshold will never be crossed,
> >> even if we expect to optimize 99% of instructions after that.
> >>
> >> Modified:
> >>    llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
> >>    llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp
> >>
> >> Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=228434&r1=228433&r2=228434&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original)
> >> +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Fri Feb  6 14:20:40 2015
> >> @@ -217,6 +217,13 @@ public:
> >>     /// exceed this cost. Set this to UINT_MAX to disable the loop body cost
> >>     /// restriction.
> >>     unsigned Threshold;
> >> +    /// If complete unrolling could help other optimizations (e.g. InstSimplify)
> >> +    /// to remove N% of instructions, then we can go beyond unroll threshold.
> >> +    /// This value set the minimal percent for allowing that.
> >> +    unsigned MinPercentOfOptimized;
> >> +    /// The absolute cost threshold. We won't go beyond this even if complete
> >> +    /// unrolling could result in optimizing out 90% of instructions.
> >> +    unsigned AbsoluteThreshold;
> >>     /// The cost threshold for the unrolled loop when optimizing for size (set
> >>     /// to UINT_MAX to disable).
> >>     unsigned OptSizeThreshold;
> >>
> >> Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp
> >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp?rev=228434&r1=228433&r2=228434&view=diff
> >> ==============================================================================
> >> --- llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp (original)
> >> +++ llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp Fri Feb  6 14:20:40 2015
> >> @@ -45,6 +45,17 @@ static cl::opt<unsigned> UnrollMaxIterat
> >>     cl::desc("Don't allow loop unrolling to simulate more than this number of"
> >>              "iterations when checking full unroll profitability"));
> >>
> >> +static cl::opt<unsigned> UnrollMinPercentOfOptimized(
> >> +    "unroll-percent-of-optimized-for-complete-unroll", cl::init(20), cl::Hidden,
> >> +    cl::desc("If complete unrolling could trigger further optimizations, and, "
> >> +             "by that, remove the given percent of instructions, perform the "
> >> +             "complete unroll even if it's beyond the threshold"));
> >> +
> >> +static cl::opt<unsigned> UnrollAbsoluteThreshold(
> >> +    "unroll-absolute-threshold", cl::init(2000), cl::Hidden,
> >> +    cl::desc("Don't unroll if the unrolled size is bigger than this threshold,"
> >> +             " even if we can remove big portion of instructions later."));
> >> +
> >> static cl::opt<unsigned>
> >> UnrollCount("unroll-count", cl::init(0), cl::Hidden,
> >>   cl::desc("Use this unroll count for all loops including those with "
> >> @@ -70,11 +81,16 @@ namespace {
> >>     static char ID; // Pass ID, replacement for typeid
> >>     LoopUnroll(int T = -1, int C = -1, int P = -1, int R = -1) : LoopPass(ID) {
> >>       CurrentThreshold = (T == -1) ? UnrollThreshold : unsigned(T);
> >> +      CurrentAbsoluteThreshold = UnrollAbsoluteThreshold;
> >> +      CurrentMinPercentOfOptimized = UnrollMinPercentOfOptimized;
> >>       CurrentCount = (C == -1) ? UnrollCount : unsigned(C);
> >>       CurrentAllowPartial = (P == -1) ? UnrollAllowPartial : (bool)P;
> >>       CurrentRuntime = (R == -1) ? UnrollRuntime : (bool)R;
> >>
> >>       UserThreshold = (T != -1) || (UnrollThreshold.getNumOccurrences() > 0);
> >> +      UserAbsoluteThreshold = (UnrollAbsoluteThreshold.getNumOccurrences() > 0);
> >> +      UserPercentOfOptimized =
> >> +          (UnrollMinPercentOfOptimized.getNumOccurrences() > 0);
> >>       UserAllowPartial = (P != -1) ||
> >>                          (UnrollAllowPartial.getNumOccurrences() > 0);
> >>       UserRuntime = (R != -1) || (UnrollRuntime.getNumOccurrences() > 0);
> >> @@ -98,10 +114,16 @@ namespace {
> >>
> >>     unsigned CurrentCount;
> >>     unsigned CurrentThreshold;
> >> +    unsigned CurrentAbsoluteThreshold;
> >> +    unsigned CurrentMinPercentOfOptimized;
> >>     bool     CurrentAllowPartial;
> >>     bool     CurrentRuntime;
> >>     bool     UserCount;            // CurrentCount is user-specified.
> >>     bool     UserThreshold;        // CurrentThreshold is user-specified.
> >> +    bool UserAbsoluteThreshold;    // CurrentAbsoluteThreshold is
> >> +                                   // user-specified.
> >> +    bool UserPercentOfOptimized;   // CurrentMinPercentOfOptimized is
> >> +                                   // user-specified.
> >>     bool     UserAllowPartial;     // CurrentAllowPartial is user-specified.
> >>     bool     UserRuntime;          // CurrentRuntime is user-specified.
> >>
> >> @@ -133,6 +155,8 @@ namespace {
> >>     void getUnrollingPreferences(Loop *L, const TargetTransformInfo &TTI,
> >>                                  TargetTransformInfo::UnrollingPreferences &UP) {
> >>       UP.Threshold = CurrentThreshold;
> >> +      UP.AbsoluteThreshold = CurrentAbsoluteThreshold;
> >> +      UP.MinPercentOfOptimized = CurrentMinPercentOfOptimized;
> >>       UP.OptSizeThreshold = OptSizeUnrollThreshold;
> >>       UP.PartialThreshold = CurrentThreshold;
> >>       UP.PartialOptSizeThreshold = OptSizeUnrollThreshold;
> >> @@ -160,13 +184,32 @@ namespace {
> >>     void selectThresholds(const Loop *L, bool HasPragma,
> >>                           const TargetTransformInfo::UnrollingPreferences &UP,
> >>                           unsigned &Threshold, unsigned &PartialThreshold,
> >> -                          unsigned NumberOfSimplifiedInstructions) {
> >> +                          unsigned NumberOfOptimizedInstructions) {
> >>       // Determine the current unrolling threshold.  While this is
> >>       // normally set from UnrollThreshold, it is overridden to a
> >>       // smaller value if the current function is marked as
> >>       // optimize-for-size, and the unroll threshold was not user
> >>       // specified.
> >>       Threshold = UserThreshold ? CurrentThreshold : UP.Threshold;
> >> +
> >> +      // If we are allowed to completely unroll if we can remove M% of
> >> +      // instructions, and we know that with complete unrolling we'll be able
> >> +      // to kill N instructions, then we can afford to completely unroll loops
> >> +      // with unrolled size up to N*100/M.
> >> +      // Adjust the threshold according to that:
> >> +      unsigned PercentOfOptimizedForCompleteUnroll =
> >> +          UserPercentOfOptimized ? CurrentMinPercentOfOptimized
> >> +                                 : UP.MinPercentOfOptimized;
> >> +      unsigned AbsoluteThreshold = UserAbsoluteThreshold
> >> +                                       ? CurrentAbsoluteThreshold
> >> +                                       : UP.AbsoluteThreshold;
> >> +      if (PercentOfOptimizedForCompleteUnroll)
> >> +        Threshold = std::max<unsigned>(Threshold,
> >> +                                       NumberOfOptimizedInstructions * 100 /
> >> +                                           PercentOfOptimizedForCompleteUnroll);
> >> +      // But don't allow unrolling loops bigger than absolute threshold.
> >> +      Threshold = std::min<unsigned>(Threshold, AbsoluteThreshold);
> >> +
> >>       PartialThreshold = UserThreshold ? CurrentThreshold : UP.PartialThreshold;
> >>       if (!UserThreshold &&
> >>           L->getHeader()->getParent()->getAttributes().
> >> @@ -186,7 +229,6 @@ namespace {
> >>           PartialThreshold =
> >>               std::max<unsigned>(PartialThreshold, PragmaUnrollThreshold);
> >>       }
> >> -      Threshold += NumberOfSimplifiedInstructions;
> >>     }
> >>   };
> >> }
> >>
> >>
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> 
