[PATCH] Add getUnrollingPreferences to TTI

Tom Stellard tom at stellard.net
Thu Aug 29 14:18:17 PDT 2013


On Thu, Aug 29, 2013 at 07:54:56AM -0500, Hal Finkel wrote:
> ----- Original Message -----
> > ----- Original Message -----
> > > On Wed, Aug 28, 2013 at 03:07:31PM -0500, Hal Finkel wrote:
> > > > Nadav, et al.,
> > > > 
> > > > The attached patch adds the following interface to TTI:
> 
> I've attached an updated patch. In this version, the caller always initializes the UP structure to the current target-independent defaults before calling getUnrollingPreferences. This simplifies the implementation, and also stabilizes the interface (target code will not need to be updated just because new fields are added).
> 

Hi Hal,

Looks good with the Loop parameter.  I will probably try to implement
this new callback in R600 once this commit lands in the tree.
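
For illustration only (this is not code from the patch): a minimal standalone sketch of the contract in the updated version, where the caller fills in the defaults and a target overrides only the fields it cares about. The mock types, the AggressiveTargetTTI name, and the numeric defaults are assumptions made up for this example; they are not the real LLVM classes or values.

```cpp
#include <cassert>

// Standalone mock-ups for illustration; these are NOT the real LLVM classes.
struct Loop {};

struct UnrollingPreferences {
  unsigned Threshold;
  unsigned OptSizeThreshold;
  unsigned Count;
  bool Partial;
  bool Runtime;
};

// Stand-in for the TTI base class: the default hook leaves UP untouched.
struct MockTTI {
  virtual ~MockTTI() {}
  virtual void getUnrollingPreferences(Loop *, UnrollingPreferences &) const {}
};

// Hypothetical target that only adjusts two of the fields.
struct AggressiveTargetTTI : MockTTI {
  void getUnrollingPreferences(Loop *,
                               UnrollingPreferences &UP) const override {
    UP.Partial = true;
    UP.Runtime = true;
  }
};

// Mirrors the caller side: initialize every field to a default first,
// then let the target adjust whatever it wants.
UnrollingPreferences queryPreferences(const MockTTI &TTI, Loop *L) {
  UnrollingPreferences UP;
  UP.Threshold = 150;       // placeholder default, not LLVM's actual value
  UP.OptSizeThreshold = 50; // placeholder default, not LLVM's actual value
  UP.Count = 0;
  UP.Partial = false;
  UP.Runtime = false;
  TTI.getUnrollingPreferences(L, UP);
  return UP;
}
```

Because the target never has to set every field, adding a new field to the struct later would only require a new default on the caller side in a design like this.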

Thanks,
Tom

>  -Hal
> 
> > > > 
> > > > /// Parameters that control the generic loop unrolling transformation.
> > > > struct UnrollingPreferences {
> > > >   unsigned Threshold; ///< The cost threshold for the unrolled loop.
> > > >   unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop
> > > >                              ///< when optimizing for size.
> > > >   bool     Partial;   ///< Allow partial loop unrolling.
> > > >   bool     Runtime;   ///< Perform runtime unrolling.
> > > > };
> > > > 
> > > > /// \brief Get target-customized preferences for the generic loop unrolling
> > > > /// transformation. Returns true if the UnrollingPreferences struct has been
> > > > /// initialized.
> > > > virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;
> > > > 
> > > > I'd like to use this in the PowerPC backend when targeting the A2
> > > > core. For this target, using more aggressive unrolling helps a lot
> > > > because it is in-order with a deep pipeline. As a result,
> > > > unrolling is important for hiding instruction latency and branch
> > > > overhead (especially when combined with the -enable-aa-sched-mi
> > > > functionality).
> > > > 
> > > > I discussed this briefly with Chandler on IRC, and he expressed the
> > > > opinion that changing the unrolling factor does not really change
> > > > the canonical form (and TTI is already used to calculate the loop
> > > > body costs), and so this seems like an appropriate use of TTI.
> > > > 
> > > > Please review.
> > > > 
> > > > Thanks again,
> > > > Hal
> > > > 
> > > > --
> > > > Hal Finkel
> > > > Assistant Computational Scientist
> > > > Leadership Computing Facility
> > > > Argonne National Laboratory
> > > 
> > > > diff --git a/include/llvm/Analysis/TargetTransformInfo.h b/include/llvm/Analysis/TargetTransformInfo.h
> > > > index 06810a7..908612c 100644
> > > > --- a/include/llvm/Analysis/TargetTransformInfo.h
> > > > +++ b/include/llvm/Analysis/TargetTransformInfo.h
> > > > @@ -191,6 +191,20 @@ public:
> > > >    /// incurs significant execution cost.
> > > >    virtual bool isLoweredToCall(const Function *F) const;
> > > >  
> > > > +  /// Parameters that control the generic loop unrolling transformation.
> > > > +  struct UnrollingPreferences {
> > > > +    unsigned Threshold; ///< The cost threshold for the unrolled loop.
> > > > +    unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop
> > > > +                               ///< when optimizing for size.
> > > > +    bool     Partial;   ///< Allow partial loop unrolling.
> > > > +    bool     Runtime;   ///< Perform runtime unrolling.
> > > > +  };
> > > > +
> > > > +  /// \brief Get target-customized preferences for the generic loop unrolling
> > > > +  /// transformation. Returns true if the UnrollingPreferences struct has been
> > > > +  /// initialized.
> > > > +  virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;
> > > > +
> > > 
> > > Would it be possible to pass a Loop object to this function?  It would
> > > be useful for the R600 target at least to be able to inspect the loop to
> > > decide what threshold to use.
> > > 
> > > For example, it would probably be best for R600 not to unroll loops with
> > > barrier intrinsics, since there are several restrictions on the control
> > > flow constructs in which barriers can be used.  On the other hand, with
> > > loops that index an array in the private address space, R600 may want
> > > to use a higher than normal threshold if it makes it possible for SROA
> > > to move the entire array to registers.
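
As a rough sketch of the kind of loop-dependent policy described in the quoted request (hypothetical throughout): LoopSummary, its two flags, the function name, and the 4x factor are all invented for this example. A real target would have to inspect the Loop's blocks and instructions itself, and this sketch does not model how the unroll pass reacts to the resulting values.

```cpp
#include <cassert>
#include <climits>

// Hypothetical, precomputed facts about a loop. A real implementation
// would derive these by walking the loop body.
struct LoopSummary {
  bool HasBarrier;          // loop contains a barrier intrinsic
  bool IndexesPrivateArray; // loop indexes an array in private memory
};

// Same four fields as the struct in the first version of the patch.
struct UnrollingPreferences {
  unsigned Threshold;
  unsigned OptSizeThreshold;
  bool Partial;
  bool Runtime;
};

// Hypothetical policy: discourage unrolling around barriers, and raise
// the threshold when unrolling might let SROA promote a private array.
void adjustForLoop(const LoopSummary &L, UnrollingPreferences &UP) {
  if (L.HasBarrier) {
    // In this sketch a zero threshold stands for "do not unroll".
    UP.Threshold = 0;
    UP.OptSizeThreshold = 0;
    UP.Partial = false;
    UP.Runtime = false;
    return;
  }
  if (L.IndexesPrivateArray && UP.Threshold <= UINT_MAX / 4)
    UP.Threshold *= 4; // arbitrary factor chosen for illustration
}
```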
> > 
> > I don't see why not ;) --  I've attached a revised patch with this
> > feature (and also with the ability to override the default unrolling
> > count). How's this?
> > 
> >  -Hal
> > 
> > > 
> > > -Tom
> > > 
> > > 
> > > >    /// @}
> > > >  
> > > >    /// \name Scalar Target Information
> > > > diff --git a/lib/Analysis/TargetTransformInfo.cpp b/lib/Analysis/TargetTransformInfo.cpp
> > > > index 0a215aa..8a443cc 100644
> > > > --- a/lib/Analysis/TargetTransformInfo.cpp
> > > > +++ b/lib/Analysis/TargetTransformInfo.cpp
> > > > @@ -96,6 +96,11 @@ bool TargetTransformInfo::isLoweredToCall(const Function *F) const {
> > > >    return PrevTTI->isLoweredToCall(F);
> > > >  }
> > > >  
> > > > +bool TargetTransformInfo::getUnrollingPreferences(
> > > > +                            UnrollingPreferences &UP) const {
> > > > +  return PrevTTI->getUnrollingPreferences(UP);
> > > > +}
> > > > +
> > > >  bool TargetTransformInfo::isLegalAddImmediate(int64_t Imm) const {
> > > >    return PrevTTI->isLegalAddImmediate(Imm);
> > > >  }
> > > > @@ -461,6 +466,10 @@ struct NoTTI : ImmutablePass, TargetTransformInfo {
> > > >      return true;
> > > >    }
> > > >  
> > > > +  virtual bool getUnrollingPreferences(UnrollingPreferences &) const {
> > > > +    return false;
> > > > +  }
> > > > +
> > > >    bool isLegalAddImmediate(int64_t Imm) const {
> > > >      return false;
> > > >    }
> > > > diff --git a/lib/CodeGen/BasicTargetTransformInfo.cpp b/lib/CodeGen/BasicTargetTransformInfo.cpp
> > > > index d5340e6..e1380b7 100644
> > > > --- a/lib/CodeGen/BasicTargetTransformInfo.cpp
> > > > +++ b/lib/CodeGen/BasicTargetTransformInfo.cpp
> > > > @@ -84,6 +84,7 @@ public:
> > > >    virtual unsigned getJumpBufSize() const;
> > > >    virtual bool shouldBuildLookupTables() const;
> > > >    virtual bool haveFastSqrt(Type *Ty) const;
> > > > +  virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;
> > > >  
> > > >    /// @}
> > > >  
> > > > @@ -189,6 +190,10 @@ bool BasicTTI::haveFastSqrt(Type *Ty) const {
> > > >    return TLI->isTypeLegal(VT) && TLI->isOperationLegalOrCustom(ISD::FSQRT, VT);
> > > >  }
> > > >  
> > > > +bool BasicTTI::getUnrollingPreferences(UnrollingPreferences &) const {
> > > > +  return false;
> > > > +}
> > > > +
> > > >  //===----------------------------------------------------------------------===//
> > > >  //
> > > >  // Calls used by the vectorizers.
> > > > diff --git a/lib/Transforms/Scalar/LoopUnrollPass.cpp b/lib/Transforms/Scalar/LoopUnrollPass.cpp
> > > > index 80d060b..f8ff275 100644
> > > > --- a/lib/Transforms/Scalar/LoopUnrollPass.cpp
> > > > +++ b/lib/Transforms/Scalar/LoopUnrollPass.cpp
> > > > @@ -55,6 +55,8 @@ namespace {
> > > >        CurrentAllowPartial = (P == -1) ? UnrollAllowPartial : (bool)P;
> > > >  
> > > >        UserThreshold = (T != -1) || (UnrollThreshold.getNumOccurrences() > 0);
> > > > +      UserAllowPartial = (P != -1) ||
> > > > +                         (UnrollAllowPartial.getNumOccurrences() > 0);
> > > >  
> > > >        initializeLoopUnrollPass(*PassRegistry::getPassRegistry());
> > > >      }
> > > > @@ -76,6 +78,7 @@ namespace {
> > > >      unsigned CurrentThreshold;
> > > >      bool     CurrentAllowPartial;
> > > >      bool     UserThreshold;        // CurrentThreshold is user-specified.
> > > > +    bool     UserAllowPartial;     // CurrentAllowPartial is user-specified.
> > > >  
> > > >      bool runOnLoop(Loop *L, LPPassManager &LPM);
> > > >  
> > > > @@ -145,16 +148,20 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
> > > >          << "] Loop %" << Header->getName() << "\n");
> > > >    (void)Header;
> > > >  
> > > > +  TargetTransformInfo::UnrollingPreferences UP;
> > > > +  bool HasUP = TTI.getUnrollingPreferences(UP);
> > > > +
> > > >    // Determine the current unrolling threshold.  While this is normally set
> > > >    // from UnrollThreshold, it is overridden to a smaller value if the current
> > > >    // function is marked as optimize-for-size, and the unroll threshold was
> > > >    // not user specified.
> > > > -  unsigned Threshold = CurrentThreshold;
> > > > +  unsigned Threshold = (HasUP && !UserThreshold) ? UP.Threshold :
> > > > +                                                   CurrentThreshold;
> > > >    if (!UserThreshold &&
> > > >        Header->getParent()->getAttributes().
> > > >          hasAttribute(AttributeSet::FunctionIndex,
> > > >                       Attribute::OptimizeForSize))
> > > > -    Threshold = OptSizeUnrollThreshold;
> > > > +    Threshold = HasUP ? UP.OptSizeThreshold : OptSizeUnrollThreshold;
> > > >  
> > > >    // Find trip count and trip multiple if count is not available
> > > >    unsigned TripCount = 0;
> > > > @@ -184,6 +191,9 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
> > > >      Count = TripCount;
> > > >    }
> > > >  
> > > > +  bool Runtime = (HasUP && UnrollRuntime.getNumOccurrences() == 0) ?
> > > > +                 UP.Runtime : UnrollRuntime;
> > > > +
> > > >    // Enforce the threshold.
> > > >    if (Threshold != NoThreshold) {
> > > >      unsigned NumInlineCandidates;
> > > > @@ -204,7 +214,9 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
> > > >      if (TripCount != 1 && Size > Threshold) {
> > > >        DEBUG(dbgs() << "  Too large to fully unroll with count: " << Count
> > > >              << " because size: " << Size << ">" << Threshold << "\n");
> > > > -      if (!CurrentAllowPartial && !(UnrollRuntime && TripCount == 0)) {
> > > > +      bool AllowPartial = (HasUP && !UserAllowPartial) ? UP.Partial :
> > > > +                                                         CurrentAllowPartial;
> > > > +      if (!AllowPartial && !(Runtime && TripCount == 0)) {
> > > >          DEBUG(dbgs() << "  will not try to unroll partially because "
> > > >                << "-unroll-allow-partial not given\n");
> > > >          return false;
> > > > @@ -215,7 +227,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
> > > >          while (Count != 0 && TripCount%Count != 0)
> > > >            Count--;
> > > >        }
> > > > -      else if (UnrollRuntime) {
> > > > +      else if (Runtime) {
> > > >          // Reduce unroll count to be a lower power-of-two value
> > > >          while (Count != 0 && Size > Threshold) {
> > > >            Count >>= 1;
> > > > @@ -231,7 +243,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
> > > >    }
> > > >  
> > > >    // Unroll the loop.
> > > > -  if (!UnrollLoop(L, Count, TripCount, UnrollRuntime, TripMultiple, LI, &LPM))
> > > > +  if (!UnrollLoop(L, Count, TripCount, Runtime, TripMultiple, LI, &LPM))
> > > >      return false;
> > > >  
> > > >    return true;
> > > 
> > > > _______________________________________________
> > > > llvm-commits mailing list
> > > > llvm-commits at cs.uiuc.edu
> > > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > > 
> > > 

> diff --git a/include/llvm/Analysis/TargetTransformInfo.h b/include/llvm/Analysis/TargetTransformInfo.h
> index 06810a7..8a51266 100644
> --- a/include/llvm/Analysis/TargetTransformInfo.h
> +++ b/include/llvm/Analysis/TargetTransformInfo.h
> @@ -29,6 +29,7 @@
>  namespace llvm {
>  
>  class GlobalValue;
> +class Loop;
>  class Type;
>  class User;
>  class Value;
> @@ -191,6 +192,23 @@ public:
>    /// incurs significant execution cost.
>    virtual bool isLoweredToCall(const Function *F) const;
>  
> +  /// Parameters that control the generic loop unrolling transformation.
> +  struct UnrollingPreferences {
> +    unsigned Threshold; ///< The cost threshold for the unrolled loop
> +                        ///< (set to UINT_MAX to disable).
> +    unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop
> +                               ///< when optimizing for size
> +                               ///< (set to UINT_MAX to disable).
> +    unsigned Count;     ///< Force this unrolling count (set to 0 to disable).
> +    bool     Partial;   ///< Allow partial loop unrolling.
> +    bool     Runtime;   ///< Perform runtime unrolling.
> +  };
> +
> +  /// \brief Get target-customized preferences for the generic loop unrolling
> +  /// transformation. The caller will initialize UP with the current
> +  /// target-independent defaults.
> +  virtual void getUnrollingPreferences(Loop *L, UnrollingPreferences &UP) const;
> +
>    /// @}
>  
>    /// \name Scalar Target Information
> diff --git a/lib/Analysis/TargetTransformInfo.cpp b/lib/Analysis/TargetTransformInfo.cpp
> index 8c6005b..b23223d 100644
> --- a/lib/Analysis/TargetTransformInfo.cpp
> +++ b/lib/Analysis/TargetTransformInfo.cpp
> @@ -96,6 +96,11 @@ bool TargetTransformInfo::isLoweredToCall(const Function *F) const {
>    return PrevTTI->isLoweredToCall(F);
>  }
>  
> +void TargetTransformInfo::getUnrollingPreferences(Loop *L,
> +                            UnrollingPreferences &UP) const {
> +  PrevTTI->getUnrollingPreferences(L, UP);
> +}
> +
>  bool TargetTransformInfo::isLegalAddImmediate(int64_t Imm) const {
>    return PrevTTI->isLegalAddImmediate(Imm);
>  }
> @@ -469,6 +474,8 @@ struct NoTTI : ImmutablePass, TargetTransformInfo {
>      return true;
>    }
>  
> +  void getUnrollingPreferences(Loop *, UnrollingPreferences &) const { }
> +
>    bool isLegalAddImmediate(int64_t Imm) const {
>      return false;
>    }
> diff --git a/lib/CodeGen/BasicTargetTransformInfo.cpp b/lib/CodeGen/BasicTargetTransformInfo.cpp
> index d5340e6..9c4b49a 100644
> --- a/lib/CodeGen/BasicTargetTransformInfo.cpp
> +++ b/lib/CodeGen/BasicTargetTransformInfo.cpp
> @@ -84,6 +84,7 @@ public:
>    virtual unsigned getJumpBufSize() const;
>    virtual bool shouldBuildLookupTables() const;
>    virtual bool haveFastSqrt(Type *Ty) const;
> +  virtual void getUnrollingPreferences(Loop *L, UnrollingPreferences &UP) const;
>  
>    /// @}
>  
> @@ -189,6 +190,8 @@ bool BasicTTI::haveFastSqrt(Type *Ty) const {
>    return TLI->isTypeLegal(VT) && TLI->isOperationLegalOrCustom(ISD::FSQRT, VT);
>  }
>  
> +void BasicTTI::getUnrollingPreferences(Loop *, UnrollingPreferences &) const { }
> +
>  //===----------------------------------------------------------------------===//
>  //
>  // Calls used by the vectorizers.
> diff --git a/lib/Transforms/Scalar/LoopUnrollPass.cpp b/lib/Transforms/Scalar/LoopUnrollPass.cpp
> index 80d060b..27930fa 100644
> --- a/lib/Transforms/Scalar/LoopUnrollPass.cpp
> +++ b/lib/Transforms/Scalar/LoopUnrollPass.cpp
> @@ -55,6 +55,9 @@ namespace {
>        CurrentAllowPartial = (P == -1) ? UnrollAllowPartial : (bool)P;
>  
>        UserThreshold = (T != -1) || (UnrollThreshold.getNumOccurrences() > 0);
> +      UserAllowPartial = (P != -1) ||
> +                         (UnrollAllowPartial.getNumOccurrences() > 0);
> +      UserCount = (C != -1) || (UnrollCount.getNumOccurrences() > 0);
>  
>        initializeLoopUnrollPass(*PassRegistry::getPassRegistry());
>      }
> @@ -75,7 +78,9 @@ namespace {
>      unsigned CurrentCount;
>      unsigned CurrentThreshold;
>      bool     CurrentAllowPartial;
> +    bool     UserCount;            // CurrentCount is user-specified.
>      bool     UserThreshold;        // CurrentThreshold is user-specified.
> +    bool     UserAllowPartial;     // CurrentAllowPartial is user-specified.
>  
>      bool runOnLoop(Loop *L, LPPassManager &LPM);
>  
> @@ -145,16 +150,24 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>          << "] Loop %" << Header->getName() << "\n");
>    (void)Header;
>  
> +  TargetTransformInfo::UnrollingPreferences UP;
> +  UP.Threshold = CurrentThreshold;
> +  UP.OptSizeThreshold = OptSizeUnrollThreshold;
> +  UP.Count = CurrentCount;
> +  UP.Partial = CurrentAllowPartial;
> +  UP.Runtime = UnrollRuntime;
> +  TTI.getUnrollingPreferences(L, UP);
> +
>    // Determine the current unrolling threshold.  While this is normally set
>    // from UnrollThreshold, it is overridden to a smaller value if the current
>    // function is marked as optimize-for-size, and the unroll threshold was
>    // not user specified.
> -  unsigned Threshold = CurrentThreshold;
> +  unsigned Threshold = !UserThreshold ? UP.Threshold : CurrentThreshold;
>    if (!UserThreshold &&
>        Header->getParent()->getAttributes().
>          hasAttribute(AttributeSet::FunctionIndex,
>                       Attribute::OptimizeForSize))
> -    Threshold = OptSizeUnrollThreshold;
> +    Threshold = UP.OptSizeThreshold;
>  
>    // Find trip count and trip multiple if count is not available
>    unsigned TripCount = 0;
> @@ -167,11 +180,15 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>      TripCount = SE->getSmallConstantTripCount(L, LatchBlock);
>      TripMultiple = SE->getSmallConstantTripMultiple(L, LatchBlock);
>    }
> +
> +  bool Runtime = UnrollRuntime.getNumOccurrences() == 0 ?
> +                 UP.Runtime : UnrollRuntime;
> +
>    // Use a default unroll-count if the user doesn't specify a value
>    // and the trip count is a run-time value.  The default is different
>    // for run-time or compile-time trip count loops.
> -  unsigned Count = CurrentCount;
> -  if (UnrollRuntime && CurrentCount == 0 && TripCount == 0)
> +  unsigned Count = !UserCount ? UP.Count : CurrentCount;
> +  if (Runtime && Count == 0 && TripCount == 0)
>      Count = UnrollRuntimeCount;
>  
>    if (Count == 0) {
> @@ -204,7 +221,8 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>      if (TripCount != 1 && Size > Threshold) {
>        DEBUG(dbgs() << "  Too large to fully unroll with count: " << Count
>              << " because size: " << Size << ">" << Threshold << "\n");
> -      if (!CurrentAllowPartial && !(UnrollRuntime && TripCount == 0)) {
> +      bool AllowPartial = !UserAllowPartial ? UP.Partial : CurrentAllowPartial;
> +      if (!AllowPartial && !(Runtime && TripCount == 0)) {
>          DEBUG(dbgs() << "  will not try to unroll partially because "
>                << "-unroll-allow-partial not given\n");
>          return false;
> @@ -215,7 +233,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>          while (Count != 0 && TripCount%Count != 0)
>            Count--;
>        }
> -      else if (UnrollRuntime) {
> +      else if (Runtime) {
>          // Reduce unroll count to be a lower power-of-two value
>          while (Count != 0 && Size > Threshold) {
>            Count >>= 1;
> @@ -231,7 +249,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>    }
>  
>    // Unroll the loop.
> -  if (!UnrollLoop(L, Count, TripCount, UnrollRuntime, TripMultiple, LI, &LPM))
> +  if (!UnrollLoop(L, Count, TripCount, Runtime, TripMultiple, LI, &LPM))
>      return false;
>  
>    return true;
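
The threshold selection in the LoopUnrollPass.cpp hunks above follows one precedence rule: an explicit user setting wins, otherwise the (possibly target-adjusted) UP value is used, with the optimize-for-size threshold taking over when the function is marked optsize. A minimal standalone restatement of that rule, assuming plain parameters in place of the pass's members (the function name and parameter list are invented for this sketch):

```cpp
#include <cassert>

// Standalone restatement of the threshold selection in the patch.
//   UserThreshold       - the user explicitly set a threshold
//   CurrentThreshold    - the value the pass was configured with
//   UPThreshold         - UP.Threshold after the target hook ran
//   UPOptSizeThreshold  - UP.OptSizeThreshold after the target hook ran
//   OptForSize          - the function carries the optimize-for-size attribute
unsigned effectiveThreshold(bool UserThreshold, unsigned CurrentThreshold,
                            unsigned UPThreshold, unsigned UPOptSizeThreshold,
                            bool OptForSize) {
  // A user-specified threshold always wins over the target's preference.
  unsigned Threshold = !UserThreshold ? UPThreshold : CurrentThreshold;
  // Without a user override, optimize-for-size selects the smaller knob.
  if (!UserThreshold && OptForSize)
    Threshold = UPOptSizeThreshold;
  return Threshold;
}
```

The Count, Partial, and Runtime knobs in the patch are resolved with the same "user setting first, then UP" pattern.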



