[llvm] r228434 - Use estimated number of optimized insns in unroll-threshold computation.

Tue Feb 10 13:24:55 PST 2015

On Mon Feb 09 2015 at 2:50:26 PM Michael Zolotukhin <mzolotukhin at apple.com>
wrote:

> Hi Eric,
>
> We indeed have a swarm of different options, and I’d be happy to simplify
> them. However, most of them (actually, all of them, I think) are there for a
> reason, so trying to remove them would probably be painful.
>
>
Agreed.

> Let’s look at what we had there:
> * threshold for unrolling in presence of pragma
> * threshold for unrolling under -Os
> * threshold for unrolling in other cases
> * flag ‘is partial unrolling allowed’
> * flag ‘is runtime unrolling allowed’
>
>
These seem to make sense (and are dependent, effectively, on IR).

> Now I've added two more options:
> * ‘absolute’ threshold
> * minimum percent of optimized instructions required for complete unroll
>
> I didn’t list the option 'unroll-max-iteration-count-to-analyze' here,
> because I don’t think anyone needs to tune it at all - it’s more of a
> guard that keeps the algorithm from doing overly expensive analysis.
>
>
Might be nice to figure out how to move these into the thresholds above if
we think it's the right way of doing it?

-eric

> I do like the idea of simplifying this, but it looks to me like we’ll lose
> some cases if we just remove one of these thresholds - they cover very
> different areas. E.g. we can’t properly derive a value for the OptSize
> threshold from the other thresholds. Similarly, it’s hard to get a value for
> the ‘absolute’ threshold from the ‘usual’ threshold - the latter deals more
> with tiny loops, while the former is for unrolling big loops, where we can
> gain a lot from subsequent constant folding. We can choose a
> ‘one-size-fits-all’ value for e.g. the percent and remove the corresponding
> field from TTI, but I doubt that’ll help much here (we’ll still have the
> parameter in our model, it’ll just become hidden).
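
To make the interaction between the 'usual', percent, and 'absolute'
thresholds concrete, here is a minimal stand-alone sketch of the adjustment
the patch performs in selectThresholds. The function name adjustedThreshold
and its parameter names are made up for illustration, and the assert on the
percent range is an added assumption that the patch itself does not enforce.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper mirroring the arithmetic added to selectThresholds.
// All names here are illustrative; they do not exist in LLVM.
unsigned adjustedThreshold(unsigned baseThreshold,
                           unsigned numOptimizedInsns,
                           unsigned minPercentOfOptimized,
                           unsigned absoluteThreshold) {
  // Assumption for this sketch only: the percent is in [0, 100].
  assert(minPercentOfOptimized <= 100 && "percent expected in [0, 100]");

  unsigned threshold = baseThreshold;
  // If N instructions can be removed and at least M% must be removable,
  // loops of unrolled size up to N * 100 / M qualify. A percent of 0
  // disables this relaxation, as in the patch.
  if (minPercentOfOptimized != 0)
    threshold = std::max(threshold,
                         numOptimizedInsns * 100 / minPercentOfOptimized);

  // The absolute threshold is a hard cap and is applied last.
  return std::min(threshold, absoluteThreshold);
}
```

With the defaults from the patch (20 percent, absolute threshold 2000) and a
base threshold of 150 (a value picked for this example, not taken from this
thread), a loop where 100 instructions are expected to be optimized away gets
a limit of 100*100/20 = 500, while 1000 such instructions would give 5000 and
are capped at 2000.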
>
> Having said that, I think that the names I chose for the new ones are not
> that good, and might be confusing (but I currently can’t come up with
> better ones). If you have a better idea for their names, or if you can
> suggest how we can simplify the overall scheme, I’d be happy to address it :)
>
> Thanks,
> Michael
>
>
> On Feb 9, 2015, at 1:02 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> Drive by review questions:
>
> 1) There are some uncomfortable tuning parameters here, can we figure out
> some better ideas using science?
> 2) There are also a huge number of tuning parameters for the pass and this
> is just adding more, what's up here?
>
> In other words the pass is starting to look like a sea of options + TTI
> hell. :)
>
> -eric
>
> On Mon Feb 09 2015 at 12:53:59 PM Michael Zolotukhin <
> mzolotukhin at apple.com> wrote:
>
>> Hi Hal,
>>
>> Could you please take a look at the attached test? Does it cover the new
>> features enough, or did I miss anything?
>>
>>
>>
>>
>> Thanks,
>> Michael
>>
>> > On Feb 6, 2015, at 12:31 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> >
>> > As you stated in your follow-up e-mail ;) -- this needs a test case.
>> >
>> > -Hal
>> >
>> > ----- Original Message -----
>> >> From: "Michael Zolotukhin" <mzolotukhin at apple.com>
>> >> To: llvm-commits at cs.uiuc.edu
>> >> Sent: Friday, February 6, 2015 2:20:40 PM
>> >> Subject: [llvm] r228434 - Use estimated number of optimized insns in unroll-threshold computation.
>> >>
>> >> Author: mzolotukhin
>> >> Date: Fri Feb  6 14:20:40 2015
>> >> New Revision: 228434
>> >>
>> >> URL: http://llvm.org/viewvc/llvm-project?rev=228434&view=rev
>> >> Log:
>> >> Use estimated number of optimized insns in unroll-threshold
>> >> computation.
>> >>
>> >> If complete-unroll could help us to optimize away N% of instructions,
>> >> we might want to do this even if the final size would exceed the
>> >> loop-unroll threshold. However, we don't want to unroll huge loops, so
>> >> we add AbsoluteThreshold to avoid that - this threshold will never be
>> >> crossed, even if we expect to optimize away 99% of instructions after
>> >> that.
>> >>
>> >> Modified:
>> >>    llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
>> >>    llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp
>> >>
>> >> Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h
>> >> URL:
>> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=228434&r1=228433&r2=228434&view=diff
>> >> ==============================================================================
>> >> --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original)
>> >> +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Fri Feb  6 14:20:40 2015
>> >> @@ -217,6 +217,13 @@ public:
>> >>     /// exceed this cost. Set this to UINT_MAX to disable the loop body cost
>> >>     /// restriction.
>> >>     unsigned Threshold;
>> >> +    /// If complete unrolling could help other optimizations (e.g. InstSimplify)
>> >> +    /// to remove N% of instructions, then we can go beyond unroll threshold.
>> >> +    /// This value set the minimal percent for allowing that.
>> >> +    unsigned MinPercentOfOptimized;
>> >> +    /// The absolute cost threshold. We won't go beyond this even if complete
>> >> +    /// unrolling could result in optimizing out 90% of instructions.
>> >> +    unsigned AbsoluteThreshold;
>> >>     /// The cost threshold for the unrolled loop when optimizing for size (set
>> >>     /// to UINT_MAX to disable).
>> >>     unsigned OptSizeThreshold;
>> >>
>> >> Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp
>> >> URL:
>> >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp?rev=228434&r1=228433&r2=228434&view=diff
>> >> ==============================================================================
>> >> --- llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp (original)
>> >> +++ llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp Fri Feb  6 14:20:40 2015
>> >> @@ -45,6 +45,17 @@ static cl::opt<unsigned> UnrollMaxIterat
>> >>     cl::desc("Don't allow loop unrolling to simulate more than this number of"
>> >>              "iterations when checking full unroll profitability"));
>> >>
>> >> +static cl::opt<unsigned> UnrollMinPercentOfOptimized(
>> >> +    "unroll-percent-of-optimized-for-complete-unroll", cl::init(20), cl::Hidden,
>> >> +    cl::desc("If complete unrolling could trigger further optimizations, and, "
>> >> +             "by that, remove the given percent of instructions, perform the "
>> >> +             "complete unroll even if it's beyond the threshold"));
>> >> +
>> >> +static cl::opt<unsigned> UnrollAbsoluteThreshold(
>> >> +    "unroll-absolute-threshold", cl::init(2000), cl::Hidden,
>> >> +    cl::desc("Don't unroll if the unrolled size is bigger than this threshold,"
>> >> +             " even if we can remove big portion of instructions later."));
>> >> +
>> >> static cl::opt<unsigned>
>> >> UnrollCount("unroll-count", cl::init(0), cl::Hidden,
>> >>   cl::desc("Use this unroll count for all loops including those with "
>> >> @@ -70,11 +81,16 @@ namespace {
>> >>     static char ID; // Pass ID, replacement for typeid
>> >>     LoopUnroll(int T = -1, int C = -1, int P = -1, int R = -1) : LoopPass(ID) {
>> >>       CurrentThreshold = (T == -1) ? UnrollThreshold : unsigned(T);
>> >> +      CurrentAbsoluteThreshold = UnrollAbsoluteThreshold;
>> >> +      CurrentMinPercentOfOptimized = UnrollMinPercentOfOptimized;
>> >>       CurrentCount = (C == -1) ? UnrollCount : unsigned(C);
>> >>       CurrentAllowPartial = (P == -1) ? UnrollAllowPartial : (bool)P;
>> >>       CurrentRuntime = (R == -1) ? UnrollRuntime : (bool)R;
>> >>
>> >>       UserThreshold = (T != -1) || (UnrollThreshold.getNumOccurrences() > 0);
>> >> +      UserAbsoluteThreshold = (UnrollAbsoluteThreshold.getNumOccurrences() > 0);
>> >> +      UserPercentOfOptimized =
>> >> +          (UnrollMinPercentOfOptimized.getNumOccurrences() > 0);
>> >>       UserAllowPartial = (P != -1) ||
>> >>                          (UnrollAllowPartial.getNumOccurrences() > 0);
>> >>       UserRuntime = (R != -1) || (UnrollRuntime.getNumOccurrences() > 0);
>> >> @@ -98,10 +114,16 @@ namespace {
>> >>
>> >>     unsigned CurrentCount;
>> >>     unsigned CurrentThreshold;
>> >> +    unsigned CurrentAbsoluteThreshold;
>> >> +    unsigned CurrentMinPercentOfOptimized;
>> >>     bool     CurrentAllowPartial;
>> >>     bool     CurrentRuntime;
>> >>     bool     UserCount;            // CurrentCount is user-specified.
>> >>     bool     UserThreshold;        // CurrentThreshold is user-specified.
>> >> +    bool UserAbsoluteThreshold;    // CurrentAbsoluteThreshold is
>> >> +                                   // user-specified.
>> >> +    bool UserPercentOfOptimized;   // CurrentMinPercentOfOptimized is
>> >> +                                   // user-specified.
>> >>     bool     UserAllowPartial;     // CurrentAllowPartial is user-specified.
>> >>     bool     UserRuntime;          // CurrentRuntime is user-specified.
>> >>
>> >> @@ -133,6 +155,8 @@ namespace {
>> >>     void getUnrollingPreferences(Loop *L, const TargetTransformInfo &TTI,
>> >>                                  TargetTransformInfo::UnrollingPreferences &UP) {
>> >>       UP.Threshold = CurrentThreshold;
>> >> +      UP.AbsoluteThreshold = CurrentAbsoluteThreshold;
>> >> +      UP.MinPercentOfOptimized = CurrentMinPercentOfOptimized;
>> >>       UP.OptSizeThreshold = OptSizeUnrollThreshold;
>> >>       UP.PartialThreshold = CurrentThreshold;
>> >>       UP.PartialOptSizeThreshold = OptSizeUnrollThreshold;
>> >> @@ -160,13 +184,32 @@ namespace {
>> >>     void selectThresholds(const Loop *L, bool HasPragma,
>> >>                           const TargetTransformInfo::UnrollingPreferences &UP,
>> >>                           unsigned &Threshold, unsigned &PartialThreshold,
>> >> -                          unsigned NumberOfSimplifiedInstructions) {
>> >> +                          unsigned NumberOfOptimizedInstructions) {
>> >>       // Determine the current unrolling threshold.  While this is
>> >>       // normally set from UnrollThreshold, it is overridden to a
>> >>       // smaller value if the current function is marked as
>> >>       // optimize-for-size, and the unroll threshold was not user
>> >>       // specified.
>> >>       Threshold = UserThreshold ? CurrentThreshold : UP.Threshold;
>> >> +
>> >> +      // If we are allowed to completely unroll if we can remove M% of
>> >> +      // instructions, and we know that with complete unrolling we'll be able
>> >> +      // to kill N instructions, then we can afford to completely unroll loops
>> >> +      // with unrolled size up to N*100/M.
>> >> +      // Adjust the threshold according to that:
>> >> +      unsigned PercentOfOptimizedForCompleteUnroll =
>> >> +          UserPercentOfOptimized ? CurrentMinPercentOfOptimized
>> >> +                                 : UP.MinPercentOfOptimized;
>> >> +      unsigned AbsoluteThreshold = UserAbsoluteThreshold
>> >> +                                       ? CurrentAbsoluteThreshold
>> >> +                                       : UP.AbsoluteThreshold;
>> >> +      if (PercentOfOptimizedForCompleteUnroll)
>> >> +        Threshold = std::max<unsigned>(Threshold,
>> >> +                                       NumberOfOptimizedInstructions * 100 /
>> >> +                                           PercentOfOptimizedForCompleteUnroll);
>> >> +      // But don't allow unrolling loops bigger than absolute threshold.
>> >> +      Threshold = std::min<unsigned>(Threshold, AbsoluteThreshold);
>> >> +
>> >>       PartialThreshold = UserThreshold ? CurrentThreshold : UP.PartialThreshold;
>> >>       if (!UserThreshold &&
>> >>           L->getHeader()->getParent()->getAttributes().
>> >> @@ -186,7 +229,6 @@ namespace {
>> >>           PartialThreshold =
>> >>               std::max<unsigned>(PartialThreshold, PragmaUnrollThreshold);
>> >>       }
>> >> -      Threshold += NumberOfSimplifiedInstructions;
>> >>     }
>> >>   };
>> >> }
>> >>
>> >>
>> >> _______________________________________________
>> >> llvm-commits mailing list
>> >> llvm-commits at cs.uiuc.edu
>> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> >>
>> >
>> > --
>> > Hal Finkel
>> > Assistant Computational Scientist
>> > Leadership Computing Facility
>> > Argonne National Laboratory
>>
>
>
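
For reference, the quoted diff resolves each new setting the same way: if the
flag was given on the command line (getNumOccurrences() > 0), the command-line
value wins, otherwise the value carried in UnrollingPreferences is used. A
minimal sketch of that precedence rule follows; the Override struct and the
resolve function are hypothetical names for illustration and are not part of
LLVM.

```cpp
#include <cassert>

// Hypothetical pairing of a value with a "was it user-specified" flag,
// mirroring pairs such as CurrentAbsoluteThreshold / UserAbsoluteThreshold
// in the quoted diff.
struct Override {
  bool userSpecified = false; // true if the flag appeared on the command line
  unsigned value = 0;         // the command-line value, when userSpecified
};

// Returns the command-line value if the user set one, else the value that
// came from the preferences struct (UP.* in the patch).
unsigned resolve(const Override &cmdLine, unsigned preferenceValue) {
  return cmdLine.userSpecified ? cmdLine.value : preferenceValue;
}
```

For example, resolve(Override{}, 2000) yields 2000, while
resolve(Override{true, 500}, 2000) yields 500.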