<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><br><div><div>On Aug 29, 2013, at 5:54 AM, Hal Finkel <<a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">----- Original Message -----<br><blockquote type="cite">----- Original Message -----<br><blockquote type="cite">On Wed, Aug 28, 2013 at 03:07:31PM -0500, Hal Finkel wrote:<br><blockquote type="cite">Nadav, et al.,<br><br>The attached patch adds the following interface to TTI:<br></blockquote></blockquote></blockquote><br>I've attached an updated patch. In this version, the caller always initializes the UP structure to the current target-independent defaults before calling getUnrollingPreferences. This simplifies the implementation, and also stabilizes the interface (target code will not need to be updated just because new fields are added).<br></div></blockquote><div><br></div><div>I misread this initially:</div><div><br></div><div>+ unsigned Count = !UserCount ? UP.Count : CurrentCount;</div><div><br></div><div>It should be:</div><div><br></div><div>> unsigned Count = UserCount ? 
CurrentCount : UP.Count;</div><div><br></div><div>-Andy</div><div><br></div><blockquote type="cite"><div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><br>/// Parameters that control the generic loop unrolling transformation.<br>struct UnrollingPreferences {<br> unsigned Threshold; ///< The cost threshold for the unrolled loop.<br> unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop<br> ///< when optimizing for size.<br> bool Partial; ///< Allow partial loop unrolling.<br> bool Runtime; ///< Perform runtime unrolling.<br>};<br><br>/// \brief Get target-customized preferences for the generic loop unrolling<br>/// transformation. Returns true if the UnrollingPreferences struct has been<br>/// initialized.<br>virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;<br><br>I'd like to use this in the PowerPC backend when targeting the A2 core. For this target, using more aggressive unrolling helps a lot because it is in-order with a deep pipeline. 
As a result, unrolling is important for hiding instruction latency and branch overhead (especially when combined with the -enable-aa-sched-mi functionality).<br><br>I discussed this briefly with Chandler on IRC, and he expressed the opinion that changing the unrolling factor does not really change the canonical form (and TTI is already used to calculate the loop body costs), and so this seems like an appropriate use of TTI.<br><br>Please review.<br><br>Thanks again,<br>Hal<br><br>--<br>Hal Finkel<br>Assistant Computational Scientist<br>Leadership Computing Facility<br>Argonne National Laboratory<br></blockquote><br><blockquote type="cite">diff --git a/include/llvm/Analysis/TargetTransformInfo.h b/include/llvm/Analysis/TargetTransformInfo.h<br>index 06810a7..908612c 100644<br>--- a/include/llvm/Analysis/TargetTransformInfo.h<br>+++ b/include/llvm/Analysis/TargetTransformInfo.h<br>@@ -191,6 +191,20 @@ public:<br> /// incurs significant execution cost.<br> virtual bool isLoweredToCall(const Function *F) const;<br><br>+ /// Parameters that control the generic loop unrolling transformation.<br>+ struct UnrollingPreferences {<br>+ unsigned Threshold; ///< The cost threshold for the unrolled loop.<br>+ unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop<br>+ ///< when optimizing for size.<br>+ bool Partial; ///< Allow partial loop unrolling.<br>+ bool Runtime; ///< Perform runtime unrolling.<br>+ };<br>+<br>+ /// \brief Get target-customized preferences for the generic loop unrolling<br>+ /// transformation. Returns true if the UnrollingPreferences struct has been<br>+ /// initialized.<br>+ virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;<br>+<br></blockquote><br>Would it be possible to pass a Loop object to this function? 
It would be useful for the R600 target at least to be able to inspect the loop to decide what threshold to use.<br><br>For example, it would probably be best for R600 not to unroll loops with barrier intrinsics, since there are several restrictions on the control flow constructs in which barriers can be used. On the other hand, with loops that index an array in the private address space, R600 may want to use a higher than normal threshold if it makes it possible for SROA to move the entire array to registers.<br></blockquote><br>I don't see why not ;) -- I've attached a revised patch with this feature (and also with the ability to override the default unrolling count). How's this?<br><br>-Hal<br><br><blockquote type="cite"><br>-Tom<br><br><br><blockquote type="cite"> /// @}<br><br> /// \name Scalar Target Information<br>diff --git a/lib/Analysis/TargetTransformInfo.cpp b/lib/Analysis/TargetTransformInfo.cpp<br>index 0a215aa..8a443cc 100644<br>--- a/lib/Analysis/TargetTransformInfo.cpp<br>+++ b/lib/Analysis/TargetTransformInfo.cpp<br>@@ -96,6 +96,11 @@ bool TargetTransformInfo::isLoweredToCall(const Function *F) const {<br> return PrevTTI->isLoweredToCall(F);<br>}<br><br>+bool TargetTransformInfo::getUnrollingPreferences(<br>+ UnrollingPreferences &UP) const {<br>+ return PrevTTI->getUnrollingPreferences(UP);<br>+}<br>+<br>bool TargetTransformInfo::isLegalAddImmediate(int64_t Imm) const {<br> return PrevTTI->isLegalAddImmediate(Imm);<br>}<br>@@ -461,6 +466,10 @@ struct NoTTI : ImmutablePass, TargetTransformInfo {<br> return true;<br> }<br><br>+ virtual bool getUnrollingPreferences(UnrollingPreferences &) const {<br>+ return false;<br>+ }<br>+<br> bool isLegalAddImmediate(int64_t Imm) const {<br> return false;<br> }<br>diff --git a/lib/CodeGen/BasicTargetTransformInfo.cpp b/lib/CodeGen/BasicTargetTransformInfo.cpp<br>index d5340e6..e1380b7 100644<br>--- 
a/lib/CodeGen/BasicTargetTransformInfo.cpp<br>+++ b/lib/CodeGen/BasicTargetTransformInfo.cpp<br>@@ -84,6 +84,7 @@ public:<br> virtual unsigned getJumpBufSize() const;<br> virtual bool shouldBuildLookupTables() const;<br> virtual bool haveFastSqrt(Type *Ty) const;<br>+ virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;<br><br> /// @}<br><br>@@ -189,6 +190,10 @@ bool BasicTTI::haveFastSqrt(Type *Ty) const {<br> return TLI->isTypeLegal(VT) &&<br> TLI->isOperationLegalOrCustom(ISD::FSQRT, VT);<br>}<br><br>+bool BasicTTI::getUnrollingPreferences(UnrollingPreferences &) const {<br>+ return false;<br>+}<br>+<br>//===----------------------------------------------------------------------===//<br>//<br>// Calls used by the vectorizers.<br>diff --git a/lib/Transforms/Scalar/LoopUnrollPass.cpp b/lib/Transforms/Scalar/LoopUnrollPass.cpp<br>index 80d060b..f8ff275 100644<br>--- a/lib/Transforms/Scalar/LoopUnrollPass.cpp<br>+++ b/lib/Transforms/Scalar/LoopUnrollPass.cpp<br>@@ -55,6 +55,8 @@ namespace {<br> CurrentAllowPartial = (P == -1) ? UnrollAllowPartial : (bool)P;<br><br> UserThreshold = (T != -1) || (UnrollThreshold.getNumOccurrences() > 0);<br>+ UserAllowPartial = (P != -1) ||<br>+ (UnrollAllowPartial.getNumOccurrences() > 0);<br><br> initializeLoopUnrollPass(*PassRegistry::getPassRegistry());<br> }<br>@@ -76,6 +78,7 @@ namespace {<br> unsigned CurrentThreshold;<br> bool CurrentAllowPartial;<br> bool UserThreshold; // CurrentThreshold is user-specified.<br>+ bool UserAllowPartial; // CurrentAllowPartial is user-specified.<br><br> bool runOnLoop(Loop *L, LPPassManager &LPM);<br><br>@@ -145,16 +148,20 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {<br> << "] Loop %" << Header->getName() << "\n");<br> (void)Header;<br><br>+ TargetTransformInfo::UnrollingPreferences UP;<br>+ bool HasUP = TTI.getUnrollingPreferences(UP);<br>+<br> // Determine the current unrolling threshold. 
While this is normally set<br> // from UnrollThreshold, it is overridden to a smaller value if the current<br> // function is marked as optimize-for-size, and the unroll threshold was<br> // not user specified.<br>- unsigned Threshold = CurrentThreshold;<br>+ unsigned Threshold = (HasUP && !UserThreshold) ? UP.Threshold :<br>+ CurrentThreshold;<br> if (!UserThreshold &&<br> Header->getParent()->getAttributes().<br> hasAttribute(AttributeSet::FunctionIndex,<br> Attribute::OptimizeForSize))<br>- Threshold = OptSizeUnrollThreshold;<br>+ Threshold = HasUP ? UP.OptSizeThreshold : OptSizeUnrollThreshold;<br><br> // Find trip count and trip multiple if count is not available<br> unsigned TripCount = 0;<br>@@ -184,6 +191,9 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {<br> Count = TripCount;<br> }<br><br>+ bool Runtime = (HasUP && UnrollRuntime.getNumOccurrences() == 0) ?<br>+ UP.Runtime : UnrollRuntime;<br>+<br> // Enforce the threshold.<br> if (Threshold != NoThreshold) {<br> unsigned NumInlineCandidates;<br>@@ -204,7 +214,9 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {<br> if (TripCount != 1 && Size > Threshold) {<br> DEBUG(dbgs() << " Too large to fully unroll with count: " << Count<br> << " because size: " << Size << ">" << Threshold << "\n");<br>- if (!CurrentAllowPartial && !(UnrollRuntime && TripCount == 0)) {<br>+ bool AllowPartial = (HasUP && !UserAllowPartial) ? UP.Partial :<br>+ CurrentAllowPartial;<br>+ if (!AllowPartial && !(Runtime && TripCount == 0)) {<br> DEBUG(dbgs() << " will not try to unroll partially because "<br> << "-unroll-allow-partial not given\n");<br> return false;<br>@@ -215,7 +227,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {<br> while (Count != 0 && TripCount%Count != 0)<br> Count--;<br> }<br>- else if (UnrollRuntime) {<br>+ else if (Runtime) {<br> // Reduce unroll count to be a lower power-of-two value<br> while (Count != 0 && 
Size > Threshold) {<br> Count >>= 1;<br>@@ -231,7 +243,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {<br> }<br><br> // Unroll the loop.<br>- if (!UnrollLoop(L, Count, TripCount, UnrollRuntime, TripMultiple, LI, &LPM))<br>+ if (!UnrollLoop(L, Count, TripCount, Runtime, TripMultiple, LI, &LPM))<br> return false;<br><br> return true;<br></blockquote><br><blockquote type="cite">_______________________________________________<br>llvm-commits mailing list<br><a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits<br></blockquote><br><br></blockquote><br>--<br>Hal Finkel<br>Assistant Computational Scientist<br>Leadership Computing Facility<br>Argonne National Laboratory<br><br></blockquote><br>--<span class="Apple-converted-space"> </span><br>Hal Finkel<br>Assistant Computational Scientist<br>Leadership Computing Facility<br>Argonne National Laboratory<span><unroll_tti-v3.patch></span></div></blockquote></div><br></body></html>