[PATCH] Add getUnrollingPreferences to TTI

Hal Finkel hfinkel at anl.gov
Wed Aug 28 22:40:47 PDT 2013


----- Original Message -----
> On Wed, Aug 28, 2013 at 03:07:31PM -0500, Hal Finkel wrote:
> > Nadav, et al.,
> > 
> > The attached patch adds the following interface to TTI:
> > 
> > /// Parameters that control the generic loop unrolling transformation.
> > struct UnrollingPreferences {
> >   unsigned Threshold; ///< The cost threshold for the unrolled loop.
> >   unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop
> >                              ///< when optimizing for size.
> >   bool     Partial;   ///< Allow partial loop unrolling.
> >   bool     Runtime;   ///< Perform runtime unrolling.
> > };
> > 
> > /// \brief Get target-customized preferences for the generic loop unrolling
> > /// transformation. Returns true if the UnrollingPreferences struct has been
> > /// initialized.
> > virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;
> > 
> > I'd like to use this in the PowerPC backend when targeting the A2
> > core. For this target, using more aggressive unrolling helps a lot
> > because it is in-order with a deep pipeline. As a result,
> > unrolling is important for hiding instruction latency and branch
> > overhead (especially when combined with the -enable-aa-sched-mi
> > functionality).
> > 
> > I discussed this briefly with Chandler on IRC, and he expressed the
> > opinion that changing the unrolling factor does not really change
> > the canonical form (and TTI is already used to calculate the loop
> > body costs), and so this seems like an appropriate use of TTI.
> > 
> > Please review.
> > 
> > Thanks again,
> > Hal
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> 
> > diff --git a/include/llvm/Analysis/TargetTransformInfo.h b/include/llvm/Analysis/TargetTransformInfo.h
> > index 06810a7..908612c 100644
> > --- a/include/llvm/Analysis/TargetTransformInfo.h
> > +++ b/include/llvm/Analysis/TargetTransformInfo.h
> > @@ -191,6 +191,20 @@ public:
> >    /// incurs significant execution cost.
> >    virtual bool isLoweredToCall(const Function *F) const;
> >  
> > +  /// Parameters that control the generic loop unrolling transformation.
> > +  struct UnrollingPreferences {
> > +    unsigned Threshold; ///< The cost threshold for the unrolled loop.
> > +    unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop
> > +                               ///< when optimizing for size.
> > +    bool     Partial;   ///< Allow partial loop unrolling.
> > +    bool     Runtime;   ///< Perform runtime unrolling.
> > +  };
> > +
> > +  /// \brief Get target-customized preferences for the generic loop unrolling
> > +  /// transformation. Returns true if the UnrollingPreferences struct has been
> > +  /// initialized.
> > +  virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;
> > +
> 
> Would it be possible to pass a Loop object to this function?  It would
> be useful for the R600 target at least to be able to inspect the loop to
> decide what threshold to use.
> 
> For example, it would probably be best for R600 not to unroll loops with
> barrier intrinsics, since there are several restrictions on the control
> flow constructs in which barriers can be used.  On the other hand, with
> loops that index an array in the private address space, R600 may want
> to use a higher than normal threshold if it makes it possible for SROA
> to move the entire array to registers.

I don't see why not ;) I've attached a revised patch that passes the Loop to the hook, and that also lets a target override the default unrolling count. How's this?
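
As a rough illustration of how a target might use a Loop argument for the two R600 cases described above, here is a standalone sketch. It is not the code in the attached patch: MockLoop is a hypothetical stand-in for whatever a target would actually query on the Loop, the function signature is an assumption, and the threshold values are made up for the example.

```cpp
#include <cassert>

// Same fields as the struct proposed in the patch.
struct UnrollingPreferences {
  unsigned Threshold;        // Cost threshold for the unrolled loop.
  unsigned OptSizeThreshold; // Cost threshold when optimizing for size.
  bool Partial;              // Allow partial loop unrolling.
  bool Runtime;              // Perform runtime unrolling.
};

// Hypothetical summary of loop properties (NOT an LLVM class). A real
// target would derive these by inspecting the instructions in the Loop.
struct MockLoop {
  bool HasBarrier;          // Loop body contains a barrier intrinsic.
  bool IndexesPrivateArray; // Loop indexes an array in private memory.
};

// Returns true if UP has been initialized, mirroring the contract of
// getUnrollingPreferences in the patch. All numbers are illustrative.
bool getUnrollingPreferences(const MockLoop &L, UnrollingPreferences &UP) {
  if (L.HasBarrier) {
    // A zero threshold with partial and runtime unrolling off is meant
    // here to stand for "leave this loop alone".
    UP = {0, 0, false, false};
    return true;
  }
  if (L.IndexesPrivateArray) {
    // Be more aggressive so that full unrolling becomes more likely.
    UP = {800, 50, true, false};
    return true;
  }
  return false; // No opinion; the pass keeps its own defaults.
}
```

Returning false for the common case keeps the pass defaults in effect, so a target only has to describe the loops it actually cares about.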

 -Hal

> 
> -Tom
> 
> 
> >    /// @}
> >  
> >    /// \name Scalar Target Information
> > diff --git a/lib/Analysis/TargetTransformInfo.cpp b/lib/Analysis/TargetTransformInfo.cpp
> > index 0a215aa..8a443cc 100644
> > --- a/lib/Analysis/TargetTransformInfo.cpp
> > +++ b/lib/Analysis/TargetTransformInfo.cpp
> > @@ -96,6 +96,11 @@ bool TargetTransformInfo::isLoweredToCall(const Function *F) const {
> >    return PrevTTI->isLoweredToCall(F);
> >  }
> >  
> > +bool TargetTransformInfo::getUnrollingPreferences(
> > +                            UnrollingPreferences &UP) const {
> > +  return PrevTTI->getUnrollingPreferences(UP);
> > +}
> > +
> >  bool TargetTransformInfo::isLegalAddImmediate(int64_t Imm) const {
> >    return PrevTTI->isLegalAddImmediate(Imm);
> >  }
> > @@ -461,6 +466,10 @@ struct NoTTI : ImmutablePass, TargetTransformInfo {
> >      return true;
> >    }
> >  
> > +  virtual bool getUnrollingPreferences(UnrollingPreferences &) const {
> > +    return false;
> > +  }
> > +
> >    bool isLegalAddImmediate(int64_t Imm) const {
> >      return false;
> >    }
> > diff --git a/lib/CodeGen/BasicTargetTransformInfo.cpp b/lib/CodeGen/BasicTargetTransformInfo.cpp
> > index d5340e6..e1380b7 100644
> > --- a/lib/CodeGen/BasicTargetTransformInfo.cpp
> > +++ b/lib/CodeGen/BasicTargetTransformInfo.cpp
> > @@ -84,6 +84,7 @@ public:
> >    virtual unsigned getJumpBufSize() const;
> >    virtual bool shouldBuildLookupTables() const;
> >    virtual bool haveFastSqrt(Type *Ty) const;
> > +  virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;
> >  
> >    /// @}
> >  
> > @@ -189,6 +190,10 @@ bool BasicTTI::haveFastSqrt(Type *Ty) const {
> >    return TLI->isTypeLegal(VT) && TLI->isOperationLegalOrCustom(ISD::FSQRT, VT);
> >  }
> >  
> > +bool BasicTTI::getUnrollingPreferences(UnrollingPreferences &) const {
> > +  return false;
> > +}
> > +
> >  //===----------------------------------------------------------------------===//
> >  //
> >  // Calls used by the vectorizers.
> > diff --git a/lib/Transforms/Scalar/LoopUnrollPass.cpp b/lib/Transforms/Scalar/LoopUnrollPass.cpp
> > index 80d060b..f8ff275 100644
> > --- a/lib/Transforms/Scalar/LoopUnrollPass.cpp
> > +++ b/lib/Transforms/Scalar/LoopUnrollPass.cpp
> > @@ -55,6 +55,8 @@ namespace {
> >        CurrentAllowPartial = (P == -1) ? UnrollAllowPartial : (bool)P;
> >  
> >        UserThreshold = (T != -1) || (UnrollThreshold.getNumOccurrences() > 0);
> > +      UserAllowPartial = (P != -1) ||
> > +                         (UnrollAllowPartial.getNumOccurrences() > 0);
> >  
> >        initializeLoopUnrollPass(*PassRegistry::getPassRegistry());
> >      }
> > @@ -76,6 +78,7 @@ namespace {
> >      unsigned CurrentThreshold;
> >      bool     CurrentAllowPartial;
> >      bool     UserThreshold;        // CurrentThreshold is user-specified.
> > +    bool     UserAllowPartial;     // CurrentAllowPartial is user-specified.
> >  
> >      bool runOnLoop(Loop *L, LPPassManager &LPM);
> >  
> > @@ -145,16 +148,20 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
> >          << "] Loop %" << Header->getName() << "\n");
> >    (void)Header;
> >  
> > +  TargetTransformInfo::UnrollingPreferences UP;
> > +  bool HasUP = TTI.getUnrollingPreferences(UP);
> > +
> >    // Determine the current unrolling threshold.  While this is normally set
> >    // from UnrollThreshold, it is overridden to a smaller value if the current
> >    // function is marked as optimize-for-size, and the unroll threshold was
> >    // not user specified.
> > -  unsigned Threshold = CurrentThreshold;
> > +  unsigned Threshold = (HasUP && !UserThreshold) ? UP.Threshold :
> > +                                                   CurrentThreshold;
> >    if (!UserThreshold &&
> >        Header->getParent()->getAttributes().
> >          hasAttribute(AttributeSet::FunctionIndex,
> >                       Attribute::OptimizeForSize))
> > -    Threshold = OptSizeUnrollThreshold;
> > +    Threshold = HasUP ? UP.OptSizeThreshold : OptSizeUnrollThreshold;
> >  
> >    // Find trip count and trip multiple if count is not available
> >    unsigned TripCount = 0;
> > @@ -184,6 +191,9 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
> >      Count = TripCount;
> >    }
> >  
> > +  bool Runtime = (HasUP && UnrollRuntime.getNumOccurrences() == 0) ?
> > +                 UP.Runtime : UnrollRuntime;
> > +
> >    // Enforce the threshold.
> >    if (Threshold != NoThreshold) {
> >      unsigned NumInlineCandidates;
> > @@ -204,7 +214,9 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
> >      if (TripCount != 1 && Size > Threshold) {
> >        DEBUG(dbgs() << "  Too large to fully unroll with count: " << Count
> >              << " because size: " << Size << ">" << Threshold << "\n");
> > -      if (!CurrentAllowPartial && !(UnrollRuntime && TripCount == 0)) {
> > +      bool AllowPartial = (HasUP && !UserAllowPartial) ? UP.Partial :
> > +                                                         CurrentAllowPartial;
> > +      if (!AllowPartial && !(Runtime && TripCount == 0)) {
> >          DEBUG(dbgs() << "  will not try to unroll partially because "
> >                << "-unroll-allow-partial not given\n");
> >          return false;
> > @@ -215,7 +227,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
> >          while (Count != 0 && TripCount%Count != 0)
> >            Count--;
> >        }
> > -      else if (UnrollRuntime) {
> > +      else if (Runtime) {
> >          // Reduce unroll count to be a lower power-of-two value
> >          while (Count != 0 && Size > Threshold) {
> >            Count >>= 1;
> > @@ -231,7 +243,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
> >    }
> >  
> >    // Unroll the loop.
> > -  if (!UnrollLoop(L, Count, TripCount, UnrollRuntime, TripMultiple, LI, &LPM))
> > +  if (!UnrollLoop(L, Count, TripCount, Runtime, TripMultiple, LI, &LPM))
> >      return false;
> >  
> >    return true;
> 
> 
> 

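For anyone skimming the LoopUnrollPass.cpp hunks quoted above, the intended precedence for the threshold is: an explicit user setting wins, then the target's preferences, then the pass defaults. Here is a standalone sketch of that selection. PassState and pickThreshold are hypothetical names used only for this example, and the numbers in the usage note are illustrative.

```cpp
#include <cassert>

// Same fields as the struct proposed in the patch.
struct UnrollingPreferences {
  unsigned Threshold;        // Cost threshold for the unrolled loop.
  unsigned OptSizeThreshold; // Cost threshold when optimizing for size.
  bool Partial;              // Allow partial loop unrolling.
  bool Runtime;              // Perform runtime unrolling.
};

// Hypothetical bundle of the pass state that the quoted hunk reads.
struct PassState {
  unsigned CurrentThreshold;       // From -unroll-threshold or the pass ctor.
  unsigned OptSizeUnrollThreshold; // Pass default when optimizing for size.
  bool UserThreshold;              // CurrentThreshold is user-specified.
};

// Mirrors the threshold selection in the quoted runOnLoop hunk.
unsigned pickThreshold(const PassState &S, bool HasUP,
                       const UnrollingPreferences &UP, bool OptForSize) {
  // Target preference applies only when the user did not set a threshold.
  unsigned Threshold =
      (HasUP && !S.UserThreshold) ? UP.Threshold : S.CurrentThreshold;
  // The optimize-for-size override is likewise skipped for user settings.
  if (!S.UserThreshold && OptForSize)
    Threshold = HasUP ? UP.OptSizeThreshold : S.OptSizeUnrollThreshold;
  return Threshold;
}
```

With pass defaults of 150 and 50 and target preferences of 300 and 75, this picks 300 normally and 75 when optimizing for size. A user-specified threshold is returned unchanged in both cases.
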
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unroll_tti-v2.patch
Type: text/x-patch
Size: 7503 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130829/1caf37c1/attachment.bin>

