[PATCH] Add getUnrollingPreferences to TTI

Andrew Trick atrick at apple.com
Thu Aug 29 18:43:56 PDT 2013


On Aug 29, 2013, at 5:54 AM, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
>> ----- Original Message -----
>>> On Wed, Aug 28, 2013 at 03:07:31PM -0500, Hal Finkel wrote:
>>>> Nadav, et al.,
>>>> 
>>>> The attached patch adds the following interface to TTI:
> 
> I've attached an updated patch. In this version, the caller always initializes the UP structure to the current target-independent defaults before calling getUnrollingPreferences. This simplifies the implementation, and also stabilizes the interface (target code will not need to be updated just because new fields are added).
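
To make the calling convention concrete, here is a minimal, self-contained sketch of the caller-initializes pattern described above. It is not the code from the attached patch: the struct is a simplified stand-in for TargetTransformInfo::UnrollingPreferences, the two target types and the queryPreferences helper are hypothetical, the void return type is an assumption, and the default values are placeholders rather than LLVM's actual defaults.

```cpp
#include <cassert>

// Simplified stand-in for TargetTransformInfo::UnrollingPreferences.
struct UnrollingPreferences {
  unsigned Threshold;
  unsigned OptSizeThreshold;
  unsigned Count;
  bool Partial;
  bool Runtime;
};

// Hypothetical target that keeps every target-independent default.
struct DefaultTarget {
  void getUnrollingPreferences(UnrollingPreferences &) const {}
};

// Hypothetical in-order target that overrides only the fields it cares about.
struct InOrderTarget {
  void getUnrollingPreferences(UnrollingPreferences &UP) const {
    UP.Partial = true;
    UP.Runtime = true;
  }
};

// The caller fills in the defaults first, then lets the target adjust them.
template <typename TargetT>
UnrollingPreferences queryPreferences(const TargetT &Target) {
  UnrollingPreferences UP;
  UP.Threshold = 150;        // placeholder default, not taken from the patch
  UP.OptSizeThreshold = 50;  // placeholder default
  UP.Count = 0;              // placeholder: 0 meaning "no preferred count"
  UP.Partial = false;
  UP.Runtime = false;
  Target.getUnrollingPreferences(UP);
  return UP;
}
```

Since every field already holds a default when the hook runs, a field added to the struct later only needs a new initializer in the caller; a target such as InOrderTarget would compile and behave as before.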

I misread this initially:

+  unsigned Count = !UserCount ? UP.Count : CurrentCount;

It should be:

>  unsigned Count = UserCount ? CurrentCount : UP.Count;
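
For reference, the two spellings select the same value; the second just tests the user override directly instead of its negation, which is harder to misread. A small sketch, where the two helper functions are hypothetical wrappers around the expressions quoted above:

```cpp
#include <cassert>

// The expression as written in the patch: negated condition first.
unsigned countAsWritten(bool UserCount, unsigned CurrentCount,
                        unsigned UPCount) {
  return !UserCount ? UPCount : CurrentCount;
}

// The suggested spelling: a user-specified count wins, otherwise fall back
// to the target preference.
unsigned countAsSuggested(bool UserCount, unsigned CurrentCount,
                          unsigned UPCount) {
  return UserCount ? CurrentCount : UPCount;
}
```

Either way, a count given by the user takes precedence and UP.Count is used only when no user count was given.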

-Andy

> 
>>>> 
>>>> /// Parameters that control the generic loop unrolling transformation.
>>>> struct UnrollingPreferences {
>>>>  unsigned Threshold; ///< The cost threshold for the unrolled loop.
>>>>  unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop
>>>>                             ///< when optimizing for size.
>>>>  bool     Partial;   ///< Allow partial loop unrolling.
>>>>  bool     Runtime;   ///< Perform runtime unrolling.
>>>> };
>>>> 
>>>> /// \brief Get target-customized preferences for the generic loop unrolling
>>>> /// transformation. Returns true if the UnrollingPreferences struct has been
>>>> /// initialized.
>>>> virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;
>>>> 
>>>> I'd like to use this in the PowerPC backend when targeting the A2
>>>> core. For this target, using more aggressive unrolling helps a lot
>>>> because it is in-order with a deep pipeline. As a result,
>>>> unrolling is important for hiding instruction latency and branch
>>>> overhead (especially when combined with the -enable-aa-sched-mi
>>>> functionality).
>>>> 
>>>> I discussed this briefly with Chandler on IRC, and he expressed the
>>>> opinion that changing the unrolling factor does not really change
>>>> the canonical form (and TTI is already used to calculate the loop
>>>> body costs), and so this seems like an appropriate use of TTI.
>>>> 
>>>> Please review.
>>>> 
>>>> Thanks again,
>>>> Hal
>>>> 
>>>> --
>>>> Hal Finkel
>>>> Assistant Computational Scientist
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>> 
>>>> diff --git a/include/llvm/Analysis/TargetTransformInfo.h b/include/llvm/Analysis/TargetTransformInfo.h
>>>> index 06810a7..908612c 100644
>>>> --- a/include/llvm/Analysis/TargetTransformInfo.h
>>>> +++ b/include/llvm/Analysis/TargetTransformInfo.h
>>>> @@ -191,6 +191,20 @@ public:
>>>>   /// incurs significant execution cost.
>>>>   virtual bool isLoweredToCall(const Function *F) const;
>>>> 
>>>> +  /// Parameters that control the generic loop unrolling transformation.
>>>> +  struct UnrollingPreferences {
>>>> +    unsigned Threshold; ///< The cost threshold for the unrolled loop.
>>>> +    unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop
>>>> +                               ///< when optimizing for size.
>>>> +    bool     Partial;   ///< Allow partial loop unrolling.
>>>> +    bool     Runtime;   ///< Perform runtime unrolling.
>>>> +  };
>>>> +
>>>> +  /// \brief Get target-customized preferences for the generic loop unrolling
>>>> +  /// transformation. Returns true if the UnrollingPreferences struct has been
>>>> +  /// initialized.
>>>> +  virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;
>>>> +
>>> 
>>> Would it be possible to pass a Loop object to this function?  It would
>>> be useful for the R600 target at least to be able to inspect the loop to
>>> decide what threshold to use.
>>> 
>>> For example, it would probably be best for R600 not to unroll loops with
>>> barrier intrinsics, since there are several restrictions on the control
>>> flow constructs in which barriers can be used.  On the other hand, with
>>> loops that index an array in the private address space, R600 may want
>>> to use a higher than normal threshold if it makes it possible for SROA
>>> to move the entire array to registers.
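
As a rough illustration of the kind of per-loop decision described above, the sketch below has a target hook look at a loop before choosing a threshold. Everything in it is hypothetical: LoopSummary stands in for whatever a target would compute by walking a real Loop, and both the "threshold of zero means do not unroll" convention and the 4x scaling factor are arbitrary placeholders, not R600 behavior.

```cpp
#include <cassert>

// Simplified stand-in for the preferences struct under discussion.
struct UnrollingPreferences {
  unsigned Threshold;
  bool Partial;
};

// Hypothetical summary of facts a target could gather by inspecting a loop.
struct LoopSummary {
  bool HasBarrier;           // loop body contains a barrier intrinsic
  bool IndexesPrivateArray;  // loop indexes an array in private address space
};

// Hypothetical hook: UP arrives holding the target-independent defaults and
// is adjusted based on what the loop contains.
void getUnrollingPreferences(const LoopSummary &L, UnrollingPreferences &UP) {
  if (L.HasBarrier) {
    // Assumed convention for this sketch only: zero disables unrolling.
    UP.Threshold = 0;
    UP.Partial = false;
    return;
  }
  if (L.IndexesPrivateArray) {
    // Arbitrary placeholder factor; a larger budget may let the whole array
    // be promoted to registers once the loop is fully unrolled.
    UP.Threshold *= 4;
  }
}
```

A loop with neither property keeps whatever defaults the caller supplied.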
>> 
>> I don't see why not ;) --  I've attached a revised patch with this
>> feature (and also with the ability to override the default unrolling
>> count). How's this?
>> 
>> -Hal
>> 
>>> 
>>> -Tom
>>> 
>>> 
>>>>   /// @}
>>>> 
>>>>   /// \name Scalar Target Information
>>>> diff --git a/lib/Analysis/TargetTransformInfo.cpp b/lib/Analysis/TargetTransformInfo.cpp
>>>> index 0a215aa..8a443cc 100644
>>>> --- a/lib/Analysis/TargetTransformInfo.cpp
>>>> +++ b/lib/Analysis/TargetTransformInfo.cpp
>>>> @@ -96,6 +96,11 @@ bool TargetTransformInfo::isLoweredToCall(const Function *F) const {
>>>>   return PrevTTI->isLoweredToCall(F);
>>>> }
>>>> 
>>>> +bool TargetTransformInfo::getUnrollingPreferences(
>>>> +                            UnrollingPreferences &UP) const {
>>>> +  return PrevTTI->getUnrollingPreferences(UP);
>>>> +}
>>>> +
>>>> bool TargetTransformInfo::isLegalAddImmediate(int64_t Imm) const {
>>>>   return PrevTTI->isLegalAddImmediate(Imm);
>>>> }
>>>> @@ -461,6 +466,10 @@ struct NoTTI : ImmutablePass, TargetTransformInfo {
>>>>     return true;
>>>>   }
>>>> 
>>>> +  virtual bool getUnrollingPreferences(UnrollingPreferences &) const {
>>>> +    return false;
>>>> +  }
>>>> +
>>>>   bool isLegalAddImmediate(int64_t Imm) const {
>>>>     return false;
>>>>   }
>>>> diff --git a/lib/CodeGen/BasicTargetTransformInfo.cpp b/lib/CodeGen/BasicTargetTransformInfo.cpp
>>>> index d5340e6..e1380b7 100644
>>>> --- a/lib/CodeGen/BasicTargetTransformInfo.cpp
>>>> +++ b/lib/CodeGen/BasicTargetTransformInfo.cpp
>>>> @@ -84,6 +84,7 @@ public:
>>>>   virtual unsigned getJumpBufSize() const;
>>>>   virtual bool shouldBuildLookupTables() const;
>>>>   virtual bool haveFastSqrt(Type *Ty) const;
>>>> +  virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;
>>>> 
>>>>   /// @}
>>>> 
>>>> @@ -189,6 +190,10 @@ bool BasicTTI::haveFastSqrt(Type *Ty) const {
>>>>   return TLI->isTypeLegal(VT) && TLI->isOperationLegalOrCustom(ISD::FSQRT, VT);
>>>> }
>>>> 
>>>> +bool BasicTTI::getUnrollingPreferences(UnrollingPreferences &) const {
>>>> +  return false;
>>>> +}
>>>> +
>>>> //===----------------------------------------------------------------------===//
>>>> //
>>>> // Calls used by the vectorizers.
>>>> diff --git a/lib/Transforms/Scalar/LoopUnrollPass.cpp b/lib/Transforms/Scalar/LoopUnrollPass.cpp
>>>> index 80d060b..f8ff275 100644
>>>> --- a/lib/Transforms/Scalar/LoopUnrollPass.cpp
>>>> +++ b/lib/Transforms/Scalar/LoopUnrollPass.cpp
>>>> @@ -55,6 +55,8 @@ namespace {
>>>>       CurrentAllowPartial = (P == -1) ? UnrollAllowPartial : (bool)P;
>>>> 
>>>>       UserThreshold = (T != -1) || (UnrollThreshold.getNumOccurrences() > 0);
>>>> +      UserAllowPartial = (P != -1) ||
>>>> +                         (UnrollAllowPartial.getNumOccurrences() > 0);
>>>> 
>>>>       initializeLoopUnrollPass(*PassRegistry::getPassRegistry());
>>>>     }
>>>> @@ -76,6 +78,7 @@ namespace {
>>>>     unsigned CurrentThreshold;
>>>>     bool     CurrentAllowPartial;
>>>>     bool     UserThreshold;        // CurrentThreshold is user-specified.
>>>> +    bool     UserAllowPartial;     // CurrentAllowPartial is user-specified.
>>>> 
>>>>     bool runOnLoop(Loop *L, LPPassManager &LPM);
>>>> 
>>>> @@ -145,16 +148,20 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>>>>         << "] Loop %" << Header->getName() << "\n");
>>>>   (void)Header;
>>>> 
>>>> +  TargetTransformInfo::UnrollingPreferences UP;
>>>> +  bool HasUP = TTI.getUnrollingPreferences(UP);
>>>> +
>>>>   // Determine the current unrolling threshold.  While this is normally set
>>>>   // from UnrollThreshold, it is overridden to a smaller value if the current
>>>>   // function is marked as optimize-for-size, and the unroll threshold was
>>>>   // not user specified.
>>>> -  unsigned Threshold = CurrentThreshold;
>>>> +  unsigned Threshold = (HasUP && !UserThreshold) ? UP.Threshold :
>>>> +                                                   CurrentThreshold;
>>>>   if (!UserThreshold &&
>>>>       Header->getParent()->getAttributes().
>>>>         hasAttribute(AttributeSet::FunctionIndex,
>>>>                      Attribute::OptimizeForSize))
>>>> -    Threshold = OptSizeUnrollThreshold;
>>>> +    Threshold = HasUP ? UP.OptSizeThreshold : OptSizeUnrollThreshold;
>>>> 
>>>>   // Find trip count and trip multiple if count is not available
>>>>   unsigned TripCount = 0;
>>>> @@ -184,6 +191,9 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>>>>     Count = TripCount;
>>>>   }
>>>> 
>>>> +  bool Runtime = (HasUP && UnrollRuntime.getNumOccurrences() == 0) ?
>>>> +                 UP.Runtime : UnrollRuntime;
>>>> +
>>>>   // Enforce the threshold.
>>>>   if (Threshold != NoThreshold) {
>>>>     unsigned NumInlineCandidates;
>>>> @@ -204,7 +214,9 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>>>>     if (TripCount != 1 && Size > Threshold) {
>>>>       DEBUG(dbgs() << "  Too large to fully unroll with count: " << Count
>>>>             << " because size: " << Size << ">" << Threshold << "\n");
>>>> -      if (!CurrentAllowPartial && !(UnrollRuntime && TripCount == 0)) {
>>>> +      bool AllowPartial = (HasUP && !UserAllowPartial) ? UP.Partial :
>>>> +                                                         CurrentAllowPartial;
>>>> +      if (!AllowPartial && !(Runtime && TripCount == 0)) {
>>>>         DEBUG(dbgs() << "  will not try to unroll partially because "
>>>>               << "-unroll-allow-partial not given\n");
>>>>         return false;
>>>> @@ -215,7 +227,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>>>>         while (Count != 0 && TripCount%Count != 0)
>>>>           Count--;
>>>>       }
>>>> -      else if (UnrollRuntime) {
>>>> +      else if (Runtime) {
>>>>         // Reduce unroll count to be a lower power-of-two value
>>>>         while (Count != 0 && Size > Threshold) {
>>>>           Count >>= 1;
>>>> @@ -231,7 +243,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>>>>   }
>>>> 
>>>>   // Unroll the loop.
>>>> -  if (!UnrollLoop(L, Count, TripCount, UnrollRuntime, TripMultiple, LI, &LPM))
>>>> +  if (!UnrollLoop(L, Count, TripCount, Runtime, TripMultiple, LI, &LPM))
>>>>     return false;
>>>> 
>>>>   return true;
>>> 
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> 
>>> 
>> 
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>> 
>> 
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> <unroll_tti-v3.patch>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
