<div dir="ltr"><br><br><div class="gmail_quote">On Mon Feb 09 2015 at 2:50:26 PM Michael Zolotukhin <<a href="mailto:mzolotukhin@apple.com">mzolotukhin@apple.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">Hi Eric,<div><br></div><div>We indeed have a swarm of different options, and I’d be happy to simplify it. However, most of them (actually, all of them I think) are there for a reason, so trying to remove them would probably be painful.</div><div><br></div></div></blockquote><div><br></div><div>Agreed.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div></div><div>Let’s look at what we had there:</div><div>* threshold for unrolling in presence of pragma</div><div>* threshold for unrolling under -Os</div><div>* threshold for unrolling in other cases</div><div>* flag 'is partial unrolling allowed’</div><div><div>* flag 'is runtime unrolling allowed’</div></div><div><br></div></div></blockquote><div><br></div><div>These seem to make sense (and are dependent, effectively, on IR).</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div></div><div>Now I've added two more options:</div><div>* ‘absolute’ threshold</div><div>* percent of optimized for complete unroll</div><div><br></div><div>I didn’t list an option 'unroll-max-iteration-count-to-analyze' here, because I don’t think anyone needs to tune it at all - it’s more for guarding the algorithm from doing too expensive analysis.</div><div><br></div></div></blockquote><div><br></div><div>Might be nice to figure out how to move these into the thresholds above if we think it's the right way of doing it?</div><div><br></div><div>-eric</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 
.8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div></div><div>I do like the idea of simplifying this, but it looks to me like we’ll lose some cases if we just remove one of these thresholds - they cover very different areas. E.g. we can’t properly derive a value for the OptSize threshold from the other thresholds. Similarly, it’s hard to get a value for the ‘absolute’ threshold from the ‘usual’ threshold - the latter deals more with tiny loops, while the former is for unrolling big loops, where we can get a lot from subsequent constant folding. We can choose a ‘one-size-fits-all’ value for e.g. the percent and remove the corresponding field from TTI, but I doubt that’ll help much here (we’ll still have the parameter in our model, it’ll just become hidden).</div><div><br></div><div>Having said that, I think that the names I chose for the new ones are not that good, and might be confusing (but I currently can’t come up with better ones). If you have a better idea for their names, or if you can suggest how we can simplify the overall scheme, I’d be happy to address it :)</div><div><br></div><div>Thanks,</div><div>Michael</div></div><div style="word-wrap:break-word"><div><br></div><div><br></div><div><div><blockquote type="cite"><div>On Feb 9, 2015, at 1:02 PM, Eric Christopher <<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>> wrote:</div><br><div><div dir="ltr">Drive-by review questions:<br><br>1) There are some uncomfortable tuning parameters here; can we figure out some better ideas using science?<div>2) There are also a huge number of tuning parameters for the pass, and this is just adding more. What's up here?<div><br></div><div>In other words, the pass is starting to look like a sea of options + TTI hell. 
:)</div><div><br></div><div>-eric</div><div><br></div><div class="gmail_quote">On Mon Feb 09 2015 at 12:53:59 PM Michael Zolotukhin <<a href="mailto:mzolotukhin@apple.com" target="_blank">mzolotukhin@apple.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Hal,<br>

<br>

Could you please take a look at the attached test? Does it cover the new features enough, or did I miss anything?<br>

<br>

<br>

<br>

<br>

Thanks,<br>

Michael<br>

<br>

> On Feb 6, 2015, at 12:31 PM, Hal Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>> wrote:<br>

><br>

> As you stated in your follow-up e-mail ;) -- this needs a test case.<br>

><br>

> -Hal<br>

><br>

> ----- Original Message -----<br>

>> From: "Michael Zolotukhin" <<a href="mailto:mzolotukhin@apple.com" target="_blank">mzolotukhin@apple.com</a>><br>

>> To: <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>

>> Sent: Friday, February 6, 2015 2:20:40 PM<br>

>> Subject: [llvm] r228434 - Use estimated number of optimized insns in unroll-threshold computation.<br>

>><br>

>> Author: mzolotukhin<br>

>> Date: Fri Feb  6 14:20:40 2015<br>

>> New Revision: 228434<br>

>><br>

>> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=228434&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=228434&view=rev</a><br>

>> Log:<br>

>> Use estimated number of optimized insns in unroll-threshold<br>

>> computation.<br>

>><br>

>> If complete-unroll could help us to optimize away N% of instructions,<br>

>> we<br>

>> might want to do this even if the final size would exceed loop-unroll<br>

>> threshold. However, we don't want to unroll huge loops, so we add<br>

>> AbsoluteThreshold to avoid that - this threshold will never be<br>

>> crossed,<br>

>> even if we expect to optimize away 99% of instructions after that.<br>

>><br>

>> Modified:<br>

>>    llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h<br>

>>    llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp<br>

>><br>

>> Modified: llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h<br>

>> URL:<br>

>> <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=228434&r1=228433&r2=228434&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h?rev=228434&r1=228433&r2=228434&view=diff</a><br>

>> ==============================================================================<br>

>> --- llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h (original)<br>

>> +++ llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h Fri Feb  6<br>

>> 14:20:40 2015<br>

>> @@ -217,6 +217,13 @@ public:<br>

>>     /// exceed this cost. Set this to UINT_MAX to disable the loop<br>

>>     body cost<br>

>>     /// restriction.<br>

>>     unsigned Threshold;<br>

>> +    /// If complete unrolling could help other optimizations (e.g.<br>

>> InstSimplify)<br>

>> +    /// to remove N% of instructions, then we can go beyond unroll<br>

>> threshold.<br>

>> +    /// This value set the minimal percent for allowing that.<br>

>> +    unsigned MinPercentOfOptimized;<br>

>> +    /// The absolute cost threshold. We won't go beyond this even if<br>

>> complete<br>

>> +    /// unrolling could result in optimizing out 90% of<br>

>> instructions.<br>

>> +    unsigned AbsoluteThreshold;<br>

>>     /// The cost threshold for the unrolled loop when optimizing for<br>

>>     size (set<br>

>>     /// to UINT_MAX to disable).<br>

>>     unsigned OptSizeThreshold;<br>

>><br>

>> Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp<br>

>> URL:<br>

>> <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp?rev=228434&r1=228433&r2=228434&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp?rev=228434&r1=228433&r2=228434&view=diff</a><br>

>> ==============================================================================<br>

>> --- llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp (original)<br>

>> +++ llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp Fri Feb  6<br>

>> 14:20:40 2015<br>

>> @@ -45,6 +45,17 @@ static cl::opt<unsigned> UnrollMaxIterat<br>

>>     cl::desc("Don't allow loop unrolling to simulate more than this<br>

>>     number of"<br>

>>              "iterations when checking full unroll profitability"));<br>

>><br>

>> +static cl::opt<unsigned> UnrollMinPercentOfOptimized(<br>

>> +    "unroll-percent-of-optimized-for-complete-unroll", cl::init(20),<br>

>> cl::Hidden,<br>

>> +    cl::desc("If complete unrolling could trigger further<br>

>> optimizations, and, "<br>

>> +             "by that, remove the given percent of instructions,<br>

>> perform the "<br>

>> +             "complete unroll even if it's beyond the threshold"));<br>

>> +<br>

>> +static cl::opt<unsigned> UnrollAbsoluteThreshold(<br>

>> +    "unroll-absolute-threshold", cl::init(2000), cl::Hidden,<br>

>> +    cl::desc("Don't unroll if the unrolled size is bigger than this<br>

>> threshold,"<br>

>> +             " even if we can remove big portion of instructions<br>

>> later."));<br>

>> +<br>

>> static cl::opt<unsigned><br>

>> UnrollCount("unroll-count", cl::init(0), cl::Hidden,<br>

>>   cl::desc("Use this unroll count for all loops including those with<br>

>>   "<br>

>> @@ -70,11 +81,16 @@ namespace {<br>

>>     static char ID; // Pass ID, replacement for typeid<br>

>>     LoopUnroll(int T = -1, int C = -1, int P = -1, int R = -1) :<br>

>>     LoopPass(ID) {<br>

>>       CurrentThreshold = (T == -1) ? UnrollThreshold : unsigned(T);<br>

>> +      CurrentAbsoluteThreshold = UnrollAbsoluteThreshold;<br>

>> +      CurrentMinPercentOfOptimized = UnrollMinPercentOfOptimized;<br>

>>       CurrentCount = (C == -1) ? UnrollCount : unsigned(C);<br>

>>       CurrentAllowPartial = (P == -1) ? UnrollAllowPartial :<br>

>>       (bool)P;<br>

>>       CurrentRuntime = (R == -1) ? UnrollRuntime : (bool)R;<br>

>><br>

>>       UserThreshold = (T != -1) ||<br>

>>       (UnrollThreshold.getNumOccurrences() > 0);<br>

>> +      UserAbsoluteThreshold =<br>

>> (UnrollAbsoluteThreshold.getNumOccurrences() > 0);<br>

>> +      UserPercentOfOptimized =<br>

>> +          (UnrollMinPercentOfOptimized.getNumOccurrences() > 0);<br>

>>       UserAllowPartial = (P != -1) ||<br>

>>                          (UnrollAllowPartial.getNumOccurrences() ><br>

>>                          0);<br>

>>       UserRuntime = (R != -1) || (UnrollRuntime.getNumOccurrences()<br>

>>> 0);<br>

>> @@ -98,10 +114,16 @@ namespace {<br>

>><br>

>>     unsigned CurrentCount;<br>

>>     unsigned CurrentThreshold;<br>

>> +    unsigned CurrentAbsoluteThreshold;<br>

>> +    unsigned CurrentMinPercentOfOptimized;<br>

>>     bool     CurrentAllowPartial;<br>

>>     bool     CurrentRuntime;<br>

>>     bool     UserCount;            // CurrentCount is<br>

>>     user-specified.<br>

>>     bool     UserThreshold;        // CurrentThreshold is<br>

>>     user-specified.<br>

>> +    bool UserAbsoluteThreshold;    // CurrentAbsoluteThreshold is<br>

>> +                                   // user-specified.<br>

>> +    bool UserPercentOfOptimized;   // CurrentMinPercentOfOptimized<br>

>> is<br>

>> +                                   // user-specified.<br>

>>     bool     UserAllowPartial;     // CurrentAllowPartial is<br>

>>     user-specified.<br>

>>     bool     UserRuntime;          // CurrentRuntime is<br>

>>     user-specified.<br>

>><br>

>> @@ -133,6 +155,8 @@ namespace {<br>

>>     void getUnrollingPreferences(Loop *L, const TargetTransformInfo<br>

>>     &TTI,<br>

>>                                  TargetTransformInfo::UnrollingPreferences<br>

>>                                  &UP) {<br>

>>       UP.Threshold = CurrentThreshold;<br>

>> +      UP.AbsoluteThreshold = CurrentAbsoluteThreshold;<br>

>> +      UP.MinPercentOfOptimized = CurrentMinPercentOfOptimized;<br>

>>       UP.OptSizeThreshold = OptSizeUnrollThreshold;<br>

>>       UP.PartialThreshold = CurrentThreshold;<br>

>>       UP.PartialOptSizeThreshold = OptSizeUnrollThreshold;<br>

>> @@ -160,13 +184,32 @@ namespace {<br>

>>     void selectThresholds(const Loop *L, bool HasPragma,<br>

>>                           const<br>

>>                           TargetTransformInfo::UnrollingPreferences<br>

>>                           &UP,<br>

>>                           unsigned &Threshold, unsigned<br>

>>                           &PartialThreshold,<br>

>> -                          unsigned NumberOfSimplifiedInstructions) {<br>

>> +                          unsigned NumberOfOptimizedInstructions) {<br>

>>       // Determine the current unrolling threshold.  While this is<br>

>>       // normally set from UnrollThreshold, it is overridden to a<br>

>>       // smaller value if the current function is marked as<br>

>>       // optimize-for-size, and the unroll threshold was not user<br>

>>       // specified.<br>

>>       Threshold = UserThreshold ? CurrentThreshold : UP.Threshold;<br>

>> +<br>

>> +      // If we are allowed to completely unroll if we can remove M%<br>

>> of<br>

>> +      // instructions, and we know that with complete unrolling<br>

>> we'll be able<br>

>> +      // to kill N instructions, then we can afford to completely<br>

>> unroll loops<br>

>> +      // with unrolled size up to N*100/M.<br>

>> +      // Adjust the threshold according to that:<br>

>> +      unsigned PercentOfOptimizedForCompleteUnroll =<br>

>> +          UserPercentOfOptimized ? CurrentMinPercentOfOptimized<br>

>> +                                 : UP.MinPercentOfOptimized;<br>

>> +      unsigned AbsoluteThreshold = UserAbsoluteThreshold<br>

>> +                                       ? CurrentAbsoluteThreshold<br>

>> +                                       : UP.AbsoluteThreshold;<br>

>> +      if (PercentOfOptimizedForCompleteUnroll)<br>

>> +        Threshold = std::max<unsigned>(Threshold,<br>

>> +                                       NumberOfOptimizedInstructions<br>

>> * 100 /<br>

>> +<br>

>>                                          PercentOfOptimizedForCompleteUnroll);<br>

>> +      // But don't allow unrolling loops bigger than absolute<br>

>> threshold.<br>

>> +      Threshold = std::min<unsigned>(Threshold, AbsoluteThreshold);<br>

>> +<br>

>>       PartialThreshold = UserThreshold ? CurrentThreshold :<br>

>>       UP.PartialThreshold;<br>

>>       if (!UserThreshold &&<br>

>>           L->getHeader()->getParent()->getAttributes().<br>

>> @@ -186,7 +229,6 @@ namespace {<br>

>>           PartialThreshold =<br>

>>               std::max<unsigned>(PartialThreshold,<br>

>>               PragmaUnrollThreshold);<br>

>>       }<br>

>> -      Threshold += NumberOfSimplifiedInstructions;<br>

>>     }<br>

>>   };<br>

>> }<br>
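For reference, the threshold adjustment made in selectThresholds by the diff above can be sketched as a standalone function. This is only a minimal sketch, not the code in LoopUnrollPass.cpp: the name adjustedUnrollThreshold is hypothetical, and the base threshold of 150 in the note below is an assumed value for illustration. The 20 percent and 2000 values are the cl::init defaults shown in the diff.

```cpp
#include <algorithm>

// Hypothetical standalone helper (not part of LLVM) mirroring the arithmetic
// that r228434 adds to selectThresholds.
//
// Threshold:                     the usual unroll threshold.
// NumberOfOptimizedInstructions: N, the instructions we expect to remove
//                                after a complete unroll.
// MinPercentOfOptimized:         M, the minimal percent of removed
//                                instructions that justifies going beyond
//                                the usual threshold (0 disables the bump).
// AbsoluteThreshold:             hard cap that is never crossed.
unsigned adjustedUnrollThreshold(unsigned Threshold,
                                 unsigned NumberOfOptimizedInstructions,
                                 unsigned MinPercentOfOptimized,
                                 unsigned AbsoluteThreshold) {
  // If removing N instructions must amount to at least M% of the unrolled
  // loop, the unrolled size may be at most N * 100 / M.
  if (MinPercentOfOptimized)
    Threshold = std::max(Threshold, NumberOfOptimizedInstructions * 100 /
                                        MinPercentOfOptimized);
  // But never allow unrolling beyond the absolute threshold.
  return std::min(Threshold, AbsoluteThreshold);
}
```

With an assumed base threshold of 150, M = 20 and an absolute threshold of 2000: N = 100 raises the threshold to 500, N = 1000 would allow 5000 but is clamped to 2000, and N = 10 leaves the threshold at 150.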

>><br>

>><br>

>> _______________________________________________<br>

>> llvm-commits mailing list<br>

>> <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>

>> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>

>><br>

><br>

> --<br>

> Hal Finkel<br>

> Assistant Computational Scientist<br>

> Leadership Computing Facility<br>

> Argonne National Laboratory<br>

<br>


</blockquote></div></div></div>

</div></blockquote></div><br></div></div></blockquote></div></div>