<div dir="ltr"><br><br><div class="gmail_quote">On Mon Feb 09 2015 at 2:50:26 PM Michael Zolotukhin <<a href="mailto:mzolotukhin@apple.com">mzolotukhin@apple.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">Hi Eric,<div><br></div><div>We indeed have a swarm of different options, and I’d be happy to simplify it. However, most of them (actually, all of them I think) are there for a reason, so trying to remove them would probably be painful.</div><div><br></div></div></blockquote><div><br></div><div>Agreed.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div></div><div>Let’s look at what we had there:</div><div>* threshold for unrolling in presence of pragma</div><div>* threshold for unrolling under -Os</div><div>* threshold for unrolling in other cases</div><div>* flag 'is partial unrolling allowed’</div><div><div>* flag 'is runtime unrolling allowed’</div></div><div><br></div></div></blockquote><div><br></div><div>These seem to make sense (and are dependent, effectively, on IR).</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div></div><div>Now I've added two more options:</div><div>* ‘absolute’ threshold</div><div>* percent of optimized for complete unroll</div><div><br></div><div>I didn’t list an option 'unroll-max-iteration-count-to-analyze' here, because I don’t think anyone needs to tune it at all - it’s more for guarding the algorithm from doing too expensive analysis.</div><div><br></div></div></blockquote><div><br></div><div>Might be nice to figure out how to move these into the thresholds above if we think it's the right way of doing it?</div><div><br></div><div>-eric</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 
.8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div></div><div>I do like the idea of simplifying this, but it looks to me like we’ll lose some cases if we just remove one of these thresholds - they cover very different areas. E.g. we can’t properly derive a value for the OptSize threshold from the other thresholds. Similarly, it’s hard to get a value for the ‘absolute’ threshold from the ‘usual’ threshold - the latter deals more with tiny loops, while the former is for unrolling big loops, where we can gain a lot from subsequent constant-folding. We could choose a ‘one-size-fits-all’ value for e.g. the percent and remove the corresponding field from TTI, but I doubt that’ll help much here (we’ll still have the parameter in our model, it’ll just become hidden).</div><div><br></div><div>Having said that, I think that the names I chose for the new ones are not that good, and might be confusing (but I currently can’t come up with better ones). If you have a better idea for their names, or if you can suggest how we can simplify the overall scheme, I’d be happy to address it :)</div><div><br></div><div>Thanks,</div><div>Michael</div></div><div style="word-wrap:break-word"><div><br></div><div><br></div><div><div><blockquote type="cite"><div>On Feb 9, 2015, at 1:02 PM, Eric Christopher <<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>> wrote:</div><br><div><div dir="ltr">Drive by review questions:<br><br>1) There are some uncomfortable tuning parameters here, can we figure out some better ideas using science?<div>2) There are also a huge number of tuning parameters for the pass and this is just adding more, what's up here?<div><br></div><div>In other words the pass is starting to look like a sea of options + TTI hell. 
:)</div><div><br></div><div>-eric</div><div><br></div><div class="gmail_quote">On Mon Feb 09 2015 at 12:53:59 PM Michael Zolotukhin <<a href="mailto:mzolotukhin@apple.com" target="_blank">mzolotukhin@apple.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Hal,<br>
<br>
Could you please take a look at the attached test? Does it cover the new features enough, or did I miss anything?<br>
<br>
<br>
<br>
<br>
Thanks,<br>
Michael<br>
<br>
> On Feb 6, 2015, at 12:31 PM, Hal Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>> wrote:<br>
><br>
> As you stated in your follow-up e-mail ;) -- this needs a test case.<br>
><br>
> -Hal<br>
><br>
> ----- Original Message -----<br>
>> From: "Michael Zolotukhin" <<a href="mailto:mzolotukhin@apple.com" target="_blank">mzolotukhin@apple.com</a>><br>
>> To: <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>
>> Sent: Friday, February 6, 2015 2:20:40 PM<br>
>> Subject: [llvm] r228434 - Use estimated number of optimized insns in unroll-threshold computation.<br>
>><br>
>> Author: mzolotukhin<br>
>> Date: Fri Feb  6 14:20:40 2015<br>
>> New Revision: 228434<br>
>><br>
>> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=228434&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=228434&view=rev</a><br>
>> Log:<br>
>> Use estimated number of optimized insns in unroll-threshold<br>
>> computation.<br>
>><br>
>> If complete-unroll could help us to optimize away N% of instructions,<br>
>> we might want to do this even if the final size would exceed the<br>
>> loop-unroll threshold. However, we don't want to unroll huge loops, so<br>
>> we add AbsoluteThreshold to avoid that - this threshold will never be<br>
>> crossed, even if we expect to optimize away 99% of instructions after<br>
>> that.<br>
>><br>
>> Modified:<br>
>>    llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h<br>
>>    llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp<br>
>><br>
>> Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h<br>
>> URL:<br>
>> <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=228434&r1=228433&r2=228434&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=228434&r1=228433&r2=228434&view=diff</a><br>
>> ==============================================================================<br>
>> --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original)<br>
>> +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Fri Feb  6<br>
>> 14:20:40 2015<br>
>> @@ -217,6 +217,13 @@ public:<br>
>>     /// exceed this cost. Set this to UINT_MAX to disable the loop<br>
>>     body cost<br>
>>     /// restriction.<br>
>>     unsigned Threshold;<br>
>> +    /// If complete unrolling could help other optimizations (e.g.<br>
>> InstSimplify)<br>
>> +    /// to remove N% of instructions, then we can go beyond unroll<br>
>> threshold.<br>
>> +    /// This value sets the minimal percent for allowing that.<br>
>> +    unsigned MinPercentOfOptimized;<br>
>> +    /// The absolute cost threshold. We won't go beyond this even if<br>
>> complete<br>
>> +    /// unrolling could result in optimizing out 90% of<br>
>> instructions.<br>
>> +    unsigned AbsoluteThreshold;<br>
>>     /// The cost threshold for the unrolled loop when optimizing for<br>
>>     size (set<br>
>>     /// to UINT_MAX to disable).<br>
>>     unsigned OptSizeThreshold;<br>
>><br>
>> Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp<br>
>> URL:<br>
>> <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp?rev=228434&r1=228433&r2=228434&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp?rev=228434&r1=228433&r2=228434&view=diff</a><br>
>> ==============================================================================<br>
>> --- llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp (original)<br>
>> +++ llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp Fri Feb  6<br>
>> 14:20:40 2015<br>
>> @@ -45,6 +45,17 @@ static cl::opt<unsigned> UnrollMaxIterat<br>
>>     cl::desc("Don't allow loop unrolling to simulate more than this<br>
>>     number of"<br>
>>              "iterations when checking full unroll profitability"));<br>
>><br>
>> +static cl::opt<unsigned> UnrollMinPercentOfOptimized(<br>
>> +    "unroll-percent-of-optimized-for-complete-unroll", cl::init(20),<br>
>> cl::Hidden,<br>
>> +    cl::desc("If complete unrolling could trigger further<br>
>> optimizations, and, "<br>
>> +             "by that, remove the given percent of instructions,<br>
>> perform the "<br>
>> +             "complete unroll even if it's beyond the threshold"));<br>
>> +<br>
>> +static cl::opt<unsigned> UnrollAbsoluteThreshold(<br>
>> +    "unroll-absolute-threshold", cl::init(2000), cl::Hidden,<br>
>> +    cl::desc("Don't unroll if the unrolled size is bigger than this<br>
>> threshold,"<br>
>> +             " even if we can remove big portion of instructions<br>
>> later."));<br>
>> +<br>
>> static cl::opt<unsigned><br>
>> UnrollCount("unroll-count", cl::init(0), cl::Hidden,<br>
>>   cl::desc("Use this unroll count for all loops including those with<br>
>>   "<br>
>> @@ -70,11 +81,16 @@ namespace {<br>
>>     static char ID; // Pass ID, replacement for typeid<br>
>>     LoopUnroll(int T = -1, int C = -1, int P = -1, int R = -1) :<br>
>>     LoopPass(ID) {<br>
>>       CurrentThreshold = (T == -1) ? UnrollThreshold : unsigned(T);<br>
>> +      CurrentAbsoluteThreshold = UnrollAbsoluteThreshold;<br>
>> +      CurrentMinPercentOfOptimized = UnrollMinPercentOfOptimized;<br>
>>       CurrentCount = (C == -1) ? UnrollCount : unsigned(C);<br>
>>       CurrentAllowPartial = (P == -1) ? UnrollAllowPartial :<br>
>>       (bool)P;<br>
>>       CurrentRuntime = (R == -1) ? UnrollRuntime : (bool)R;<br>
>><br>
>>       UserThreshold = (T != -1) ||<br>
>>       (UnrollThreshold.getNumOccurrences() > 0);<br>
>> +      UserAbsoluteThreshold =<br>
>> (UnrollAbsoluteThreshold.getNumOccurrences() > 0);<br>
>> +      UserPercentOfOptimized =<br>
>> +          (UnrollMinPercentOfOptimized.getNumOccurrences() > 0);<br>
>>       UserAllowPartial = (P != -1) ||<br>
>>                          (UnrollAllowPartial.getNumOccurrences() ><br>
>>                          0);<br>
>>       UserRuntime = (R != -1) || (UnrollRuntime.getNumOccurrences()<br>
>>> 0);<br>
>> @@ -98,10 +114,16 @@ namespace {<br>
>><br>
>>     unsigned CurrentCount;<br>
>>     unsigned CurrentThreshold;<br>
>> +    unsigned CurrentAbsoluteThreshold;<br>
>> +    unsigned CurrentMinPercentOfOptimized;<br>
>>     bool     CurrentAllowPartial;<br>
>>     bool     CurrentRuntime;<br>
>>     bool     UserCount;            // CurrentCount is<br>
>>     user-specified.<br>
>>     bool     UserThreshold;        // CurrentThreshold is<br>
>>     user-specified.<br>
>> +    bool UserAbsoluteThreshold;    // CurrentAbsoluteThreshold is<br>
>> +                                   // user-specified.<br>
>> +    bool UserPercentOfOptimized;   // CurrentMinPercentOfOptimized<br>
>> is<br>
>> +                                   // user-specified.<br>
>>     bool     UserAllowPartial;     // CurrentAllowPartial is<br>
>>     user-specified.<br>
>>     bool     UserRuntime;          // CurrentRuntime is<br>
>>     user-specified.<br>
>><br>
>> @@ -133,6 +155,8 @@ namespace {<br>
>>     void getUnrollingPreferences(Loop *L, const TargetTransformInfo<br>
>>     &TTI,<br>
>>                                  TargetTransformInfo::UnrollingPreferences<br>
>>                                  &UP) {<br>
>>       UP.Threshold = CurrentThreshold;<br>
>> +      UP.AbsoluteThreshold = CurrentAbsoluteThreshold;<br>
>> +      UP.MinPercentOfOptimized = CurrentMinPercentOfOptimized;<br>
>>       UP.OptSizeThreshold = OptSizeUnrollThreshold;<br>
>>       UP.PartialThreshold = CurrentThreshold;<br>
>>       UP.PartialOptSizeThreshold = OptSizeUnrollThreshold;<br>
>> @@ -160,13 +184,32 @@ namespace {<br>
>>     void selectThresholds(const Loop *L, bool HasPragma,<br>
>>                           const<br>
>>                           TargetTransformInfo::UnrollingPreferences<br>
>>                           &UP,<br>
>>                           unsigned &Threshold, unsigned<br>
>>                           &PartialThreshold,<br>
>> -                          unsigned NumberOfSimplifiedInstructions) {<br>
>> +                          unsigned NumberOfOptimizedInstructions) {<br>
>>       // Determine the current unrolling threshold.  While this is<br>
>>       // normally set from UnrollThreshold, it is overridden to a<br>
>>       // smaller value if the current function is marked as<br>
>>       // optimize-for-size, and the unroll threshold was not user<br>
>>       // specified.<br>
>>       Threshold = UserThreshold ? CurrentThreshold : UP.Threshold;<br>
>> +<br>
>> +      // If we are allowed to completely unroll if we can remove M%<br>
>> of<br>
>> +      // instructions, and we know that with complete unrolling<br>
>> we'll be able<br>
>> +      // to kill N instructions, then we can afford to completely<br>
>> unroll loops<br>
>> +      // with unrolled size up to N*100/M.<br>
>> +      // Adjust the threshold according to that:<br>
>> +      unsigned PercentOfOptimizedForCompleteUnroll =<br>
>> +          UserPercentOfOptimized ? CurrentMinPercentOfOptimized<br>
>> +                                 : UP.MinPercentOfOptimized;<br>
>> +      unsigned AbsoluteThreshold = UserAbsoluteThreshold<br>
>> +                                       ? CurrentAbsoluteThreshold<br>
>> +                                       : UP.AbsoluteThreshold;<br>
>> +      if (PercentOfOptimizedForCompleteUnroll)<br>
>> +        Threshold = std::max<unsigned>(Threshold,<br>
>> +                                       NumberOfOptimizedInstructions<br>
>> * 100 /<br>
>> +<br>
>>                                          PercentOfOptimizedForCompleteUnroll);<br>
>> +      // But don't allow unrolling loops bigger than absolute<br>
>> threshold.<br>
>> +      Threshold = std::min<unsigned>(Threshold, AbsoluteThreshold);<br>
>> +<br>
>>       PartialThreshold = UserThreshold ? CurrentThreshold :<br>
>>       UP.PartialThreshold;<br>
>>       if (!UserThreshold &&<br>
>>           L->getHeader()->getParent()->getAttributes().<br>
>> @@ -186,7 +229,6 @@ namespace {<br>
>>           PartialThreshold =<br>
>>               std::max<unsigned>(PartialThreshold,<br>
>>               PragmaUnrollThreshold);<br>
>>       }<br>
>> -      Threshold += NumberOfSimplifiedInstructions;<br>
>>     }<br>
>>   };<br>
>> }<br>
>><br>
>><br>
>> _______________________________________________<br>
>> llvm-commits mailing list<br>
>> <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>
>> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
>><br>
><br>
> --<br>
> Hal Finkel<br>
> Assistant Computational Scientist<br>
> Leadership Computing Facility<br>
> Argonne National Laboratory<br>
<br>
</blockquote></div></div></div>
</div></blockquote></div><br></div></div></blockquote></div></div>
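<div>For readers following the thread, the threshold adjustment in the quoted patch boils down to a small piece of arithmetic: raise the threshold to N*100/M when N instructions are expected to be optimized away and M is the minimal percent, then clamp to the absolute threshold. Below is a minimal standalone sketch of that calculation. The function name effectiveUnrollThreshold and its flat parameter list are hypothetical and only for illustration; the patch itself does this inside selectThresholds using the UnrollingPreferences fields. The values 20 and 2000 are the cl::init defaults from the patch, while the base threshold of 150 in the comments is just an assumed example value.</div>

```cpp
#include <algorithm>

// Hypothetical helper mirroring the arithmetic in the quoted patch.
// Not part of LLVM; parameter names are illustrative only.
unsigned effectiveUnrollThreshold(unsigned Threshold,
                                  unsigned NumOptimizedInsns,
                                  unsigned MinPercentOfOptimized,
                                  unsigned AbsoluteThreshold) {
  // If removing M% of instructions justifies a complete unroll and we expect
  // to remove N instructions, unrolled sizes up to N * 100 / M are acceptable.
  // A percent of 0 disables the adjustment (and avoids dividing by zero).
  // Note: as in the patch, N * 100 is computed in unsigned arithmetic and
  // could wrap for very large N.
  if (MinPercentOfOptimized)
    Threshold = std::max<unsigned>(
        Threshold, NumOptimizedInsns * 100 / MinPercentOfOptimized);

  // Never go beyond the absolute threshold, however much could be removed.
  return std::min<unsigned>(Threshold, AbsoluteThreshold);
}

// Example values, assuming a base threshold of 150:
//   effectiveUnrollThreshold(150, 100, 20, 2000)  -> 500  (100 * 100 / 20)
//   effectiveUnrollThreshold(150, 1000, 20, 2000) -> 2000 (5000 clamped)
//   effectiveUnrollThreshold(150, 10, 20, 2000)   -> 150  (50 is below base)
```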