<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><br><div><div>On Aug 29, 2013, at 5:54 AM, Hal Finkel <<a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">----- Original Message -----<br><blockquote type="cite">----- Original Message -----<br><blockquote type="cite">On Wed, Aug 28, 2013 at 03:07:31PM -0500, Hal Finkel wrote:<br><blockquote type="cite">Nadav, et al.,<br><br>The attached patch adds the following interface to TTI:<br></blockquote></blockquote></blockquote><br>I've attached an updated patch. In this version, the caller always initializes the UP structure to the current target-independent defaults before calling getUnrollingPreferences. This simplifies the implementation, and also stabilizes the interface (target code will not need to be updated just because new fields are added).<br></div></blockquote><div><br></div><div>I misread this initially:</div><div><br></div><div>+  unsigned Count = !UserCount ? UP.Count : CurrentCount;</div><div><br></div><div>It should be:</div><div><br></div><div>>  unsigned Count = UserCount ? 
CurrentCount : UP.Count;</div><div><br></div><div>-Andy</div><div><br></div><blockquote type="cite"><div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><br>/// Parameters that control the generic loop unrolling transformation.<br>struct UnrollingPreferences {<br> unsigned Threshold; ///< The cost threshold for the unrolled loop.<br> unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop<br>                            ///< when optimizing for size.<br> bool     Partial;   ///< Allow partial loop unrolling.<br> bool     Runtime;   ///< Perform runtime unrolling.<br>};<br><br>/// \brief Get target-customized preferences for the generic loop unrolling<br>/// transformation. Returns true if the UnrollingPreferences struct has been<br>/// initialized.<br>virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;<br><br>I'd like to use this in the PowerPC backend when targeting the A2 core. For this target, using more aggressive unrolling helps a lot because it is in-order with a deep pipeline. 
As a result, unrolling is important for hiding instruction latency and branch overhead (especially when combined with the -enable-aa-sched-mi functionality).<br><br>I discussed this briefly with Chandler on IRC, and he expressed the opinion that changing the unrolling factor does not really change the canonical form (and TTI is already used to calculate the loop body costs), and so this seems like an appropriate use of TTI.<br><br>Please review.<br><br>Thanks again,<br>Hal<br><br>--<br>Hal Finkel<br>Assistant Computational Scientist<br>Leadership Computing Facility<br>Argonne National Laboratory<br></blockquote><br><blockquote type="cite">diff --git a/include/llvm/Analysis/TargetTransformInfo.h b/include/llvm/Analysis/TargetTransformInfo.h<br>index 06810a7..908612c 100644<br>--- a/include/llvm/Analysis/TargetTransformInfo.h<br>+++ b/include/llvm/Analysis/TargetTransformInfo.h<br>@@ -191,6 +191,20 @@ public:<br>  /// incurs significant execution cost.<br>  virtual bool isLoweredToCall(const Function *F) const;<br><br>+  /// Parameters that control the generic loop unrolling transformation.<br>+  struct UnrollingPreferences {<br>+    unsigned Threshold; ///< The cost threshold for the unrolled loop.<br>+    unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop<br>+                               ///< when optimizing for size.<br>+    bool     Partial;   ///< Allow partial loop unrolling.<br>+    bool     Runtime;   ///< Perform runtime unrolling.<br>+  };<br>+<br>+  /// \brief Get target-customized preferences for the generic loop unrolling<br>+  /// transformation. Returns true if the UnrollingPreferences struct has been<br>+  /// initialized.<br>+  virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;<br>+<br></blockquote><br>Would it be possible to pass a Loop object to this function?  
It would be useful for the R600 target at least to be able to inspect the loop to decide what threshold to use.<br><br>For example, it would probably be best for R600 not to unroll loops with barrier intrinsics, since there are several restrictions on the control flow constructs in which barriers can be used.  On the other hand, with loops that index an array in the private address space, R600 may want to use a higher than normal threshold if it makes it possible for SROA to move the entire array to registers.<br></blockquote><br>I don't see why not ;) --  I've attached a revised patch with this feature (and also with the ability to override the default unrolling count). How's this?<br><br>-Hal<br><br><blockquote type="cite"><br>-Tom<br><br><br><blockquote type="cite">  /// @}<br><br>  /// \name Scalar Target Information<br>diff --git a/lib/Analysis/TargetTransformInfo.cpp b/lib/Analysis/TargetTransformInfo.cpp<br>index 0a215aa..8a443cc 100644<br>--- a/lib/Analysis/TargetTransformInfo.cpp<br>+++ b/lib/Analysis/TargetTransformInfo.cpp<br>@@ -96,6 +96,11 @@ bool TargetTransformInfo::isLoweredToCall(const Function *F) const {<br>  return PrevTTI->isLoweredToCall(F);<br>}<br><br>+bool TargetTransformInfo::getUnrollingPreferences(<br>+                            UnrollingPreferences &UP) const {<br>+  return PrevTTI->getUnrollingPreferences(UP);<br>+}<br>+<br>bool TargetTransformInfo::isLegalAddImmediate(int64_t Imm) const {<br>  return PrevTTI->isLegalAddImmediate(Imm);<br>}<br>@@ -461,6 +466,10 @@ struct NoTTI : ImmutablePass, TargetTransformInfo {<br>    return true;<br>  }<br><br>+  virtual bool getUnrollingPreferences(UnrollingPreferences &) const {<br>+    return false;<br>+  }<br>+<br>  bool isLegalAddImmediate(int64_t Imm) const {<br>    return false;<br>  }<br>diff --git a/lib/CodeGen/BasicTargetTransformInfo.cpp b/lib/CodeGen/BasicTargetTransformInfo.cpp<br>index d5340e6..e1380b7 
100644<br>--- a/lib/CodeGen/BasicTargetTransformInfo.cpp<br>+++ b/lib/CodeGen/BasicTargetTransformInfo.cpp<br>@@ -84,6 +84,7 @@ public:<br>  virtual unsigned getJumpBufSize() const;<br>  virtual bool shouldBuildLookupTables() const;<br>  virtual bool haveFastSqrt(Type *Ty) const;<br>+  virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;<br><br>  /// @}<br><br>@@ -189,6 +190,10 @@ bool BasicTTI::haveFastSqrt(Type *Ty) const {<br>  return TLI->isTypeLegal(VT) && TLI->isOperationLegalOrCustom(ISD::FSQRT, VT);<br>}<br><br>+bool BasicTTI::getUnrollingPreferences(UnrollingPreferences &) const {<br>+  return false;<br>+}<br>+<br>//===----------------------------------------------------------------------===//<br>//<br>// Calls used by the vectorizers.<br>diff --git a/lib/Transforms/Scalar/LoopUnrollPass.cpp b/lib/Transforms/Scalar/LoopUnrollPass.cpp<br>index 80d060b..f8ff275 100644<br>--- a/lib/Transforms/Scalar/LoopUnrollPass.cpp<br>+++ b/lib/Transforms/Scalar/LoopUnrollPass.cpp<br>@@ -55,6 +55,8 @@ namespace {<br>      CurrentAllowPartial = (P == -1) ? 
UnrollAllowPartial : (bool)P;<br><br>      UserThreshold = (T != -1) || (UnrollThreshold.getNumOccurrences() > 0);<br>+      UserAllowPartial = (P != -1) ||<br>+                         (UnrollAllowPartial.getNumOccurrences() > 0);<br><br>      initializeLoopUnrollPass(*PassRegistry::getPassRegistry());<br>    }<br>@@ -76,6 +78,7 @@ namespace {<br>    unsigned CurrentThreshold;<br>    bool     CurrentAllowPartial;<br>    bool     UserThreshold;        // CurrentThreshold is user-specified.<br>+    bool     UserAllowPartial;     // CurrentAllowPartial is user-specified.<br><br>    bool runOnLoop(Loop *L, LPPassManager &LPM);<br><br>@@ -145,16 +148,20 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {<br>        << "] Loop %" << Header->getName() << "\n");<br>  (void)Header;<br><br>+  TargetTransformInfo::UnrollingPreferences UP;<br>+  bool HasUP = TTI.getUnrollingPreferences(UP);<br>+<br>  // Determine the current unrolling threshold.  While this is normally set<br>  // from UnrollThreshold, it is overridden to a smaller value if the current<br>  // function is marked as optimize-for-size, and the unroll threshold was<br>  // not user specified.<br>-  unsigned Threshold = CurrentThreshold;<br>+  unsigned Threshold = (HasUP && !UserThreshold) ? UP.Threshold :<br>+                                                 CurrentThreshold;<br>  if (!UserThreshold &&<br>      Header->getParent()->getAttributes().<br>        hasAttribute(AttributeSet::FunctionIndex,<br>                     Attribute::OptimizeForSize))<br>-    Threshold = OptSizeUnrollThreshold;<br>+    Threshold = HasUP ? 
UP.OptSizeThreshold : OptSizeUnrollThreshold;<br><br>  // Find trip count and trip multiple if count is not available<br>  unsigned TripCount = 0;<br>@@ -184,6 +191,9 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {<br>    Count = TripCount;<br>  }<br><br>+  bool Runtime = (HasUP && UnrollRuntime.getNumOccurrences() == 0) ?<br>+                 UP.Runtime : UnrollRuntime;<br>+<br>  // Enforce the threshold.<br>  if (Threshold != NoThreshold) {<br>    unsigned NumInlineCandidates;<br>@@ -204,7 +214,9 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {<br>    if (TripCount != 1 && Size > Threshold) {<br>      DEBUG(dbgs() << "  Too large to fully unroll with count: " << Count<br>            << " because size: " << Size << ">" << Threshold << "\n");<br>-      if (!CurrentAllowPartial && !(UnrollRuntime && TripCount == 0)) {<br>+      bool AllowPartial = (HasUP && !UserAllowPartial) ? UP.Partial :<br>+                                                       CurrentAllowPartial;<br>+      if (!AllowPartial && !(Runtime && TripCount == 0)) {<br>        DEBUG(dbgs() << "  will not try to unroll partially because "<br>              << "-unroll-allow-partial not given\n");<br>        return false;<br>@@ -215,7 +227,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {<br>        while (Count != 0 && TripCount%Count != 0)<br>          Count--;<br>      }<br>-      else if (UnrollRuntime) {<br>+      else if (Runtime) {<br>        // Reduce unroll count to be a lower power-of-two value<br>        while (Count != 0 && Size > Threshold) {<br>          Count >>= 1;<br>@@ -231,7 +243,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {<br>  }<br><br>  // Unroll the loop.<br>-  if (!UnrollLoop(L, Count, TripCount, UnrollRuntime, TripMultiple, LI, &LPM))<br>+  if (!UnrollLoop(L, Count, TripCount, Runtime, TripMultiple, LI, &LPM))<br>    return false;<br><br>  return 
true;<br></blockquote><br><blockquote type="cite">_______________________________________________<br>llvm-commits mailing list<br><a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits<br></blockquote><br><br></blockquote><br></blockquote><br>--<span class="Apple-converted-space"> </span><br>Hal Finkel<br><span>&lt;unroll_tti-v3.patch&gt;</span></div></blockquote></div><br></body></html>