[PATCH] Add getUnrollingPreferences to TTI
Andrew Trick
atrick at apple.com
Thu Aug 29 18:43:56 PDT 2013
On Aug 29, 2013, at 5:54 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
>> ----- Original Message -----
>>> On Wed, Aug 28, 2013 at 03:07:31PM -0500, Hal Finkel wrote:
>>>> Nadav, et al.,
>>>>
>>>> The attached patch adds the following interface to TTI:
>
> I've attached an updated patch. In this version, the caller always initializes the UP structure to the current target-independent defaults before calling getUnrollingPreferences. This simplifies the implementation, and also stabilizes the interface (target code will not need to be updated just because new fields are added).
I misread this initially:
+ unsigned Count = !UserCount ? UP.Count : CurrentCount;
It would read more clearly as (the logic is the same):
unsigned Count = UserCount ? CurrentCount : UP.Count;
-Andy
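
For reference, a minimal standalone sketch of the selection logic discussed above. It does not use the real LLVM classes: the struct is a mock with only a Count field, and pickCountAsWritten, pickCountSuggested, TargetHook, exampleHook and chooseCount are hypothetical names introduced for illustration. Seeding UP.Count from CurrentCount before the hook runs is an assumption about what "initializes the UP structure to the current target-independent defaults" means for this field.

```cpp
#include <cassert>

// Mock of the preferences struct; only the field discussed here.
struct UnrollingPreferences {
  unsigned Count; // unroll count the target would like to use
};

// Spelling from the patch: negated condition first.
unsigned pickCountAsWritten(bool UserCount, unsigned CurrentCount,
                            const UnrollingPreferences &UP) {
  return !UserCount ? UP.Count : CurrentCount;
}

// Alternative spelling: same result, condition stated positively.
unsigned pickCountSuggested(bool UserCount, unsigned CurrentCount,
                            const UnrollingPreferences &UP) {
  return UserCount ? CurrentCount : UP.Count;
}

// Hypothetical hook type standing in for the target's override.
using TargetHook = void (*)(UnrollingPreferences &);

// Hypothetical example hook: the target asks for an unroll count of 4.
void exampleHook(UnrollingPreferences &UP) { UP.Count = 4; }

// Caller-side protocol: seed the defaults, let the target adjust them,
// then let an explicit user setting win over the target's preference.
unsigned chooseCount(bool UserCount, unsigned CurrentCount, TargetHook Hook) {
  UnrollingPreferences UP;
  UP.Count = CurrentCount; // assumed default for this field
  if (Hook)
    Hook(UP);
  return pickCountSuggested(UserCount, CurrentCount, UP);
}

// Self-check: both spellings agree for either value of UserCount.
void checkSpellingsAgree() {
  UnrollingPreferences UP{4};
  assert(pickCountAsWritten(true, 8, UP) == pickCountSuggested(true, 8, UP));
  assert(pickCountAsWritten(false, 8, UP) == pickCountSuggested(false, 8, UP));
}
```

Because the caller seeds the struct first, a hook that leaves Count alone yields the default, so the hook needs no "initialized" return value.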
>
>>>>
>>>> /// Parameters that control the generic loop unrolling transformation.
>>>> struct UnrollingPreferences {
>>>>   unsigned Threshold;        ///< The cost threshold for the unrolled loop.
>>>>   unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop
>>>>                              ///< when optimizing for size.
>>>>   bool Partial;              ///< Allow partial loop unrolling.
>>>>   bool Runtime;              ///< Perform runtime unrolling.
>>>> };
>>>>
>>>> /// \brief Get target-customized preferences for the generic loop unrolling
>>>> /// transformation. Returns true if the UnrollingPreferences struct has been
>>>> /// initialized.
>>>> virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;
>>>>
>>>> I'd like to use this in the PowerPC backend when targeting the A2
>>>> core. For this target, more aggressive unrolling helps a lot because
>>>> the core is in-order with a deep pipeline, so unrolling is important
>>>> for hiding instruction latency and branch overhead (especially when
>>>> combined with the -enable-aa-sched-mi functionality).
>>>>
>>>> I discussed this briefly with Chandler on IRC, and he expressed the
>>>> opinion that changing the unrolling factor does not really change the
>>>> canonical form (and TTI is already used to calculate the loop body
>>>> costs), and so this seems like an appropriate use of TTI.
>>>>
>>>> Please review.
>>>>
>>>> Thanks again,
>>>> Hal
>>>>
>>>> --
>>>> Hal Finkel
>>>> Assistant Computational Scientist
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>>
>>>> diff --git a/include/llvm/Analysis/TargetTransformInfo.h b/include/llvm/Analysis/TargetTransformInfo.h
>>>> index 06810a7..908612c 100644
>>>> --- a/include/llvm/Analysis/TargetTransformInfo.h
>>>> +++ b/include/llvm/Analysis/TargetTransformInfo.h
>>>> @@ -191,6 +191,20 @@ public:
>>>> /// incurs significant execution cost.
>>>> virtual bool isLoweredToCall(const Function *F) const;
>>>>
>>>> + /// Parameters that control the generic loop unrolling transformation.
>>>> + struct UnrollingPreferences {
>>>> + unsigned Threshold; ///< The cost threshold for the unrolled loop.
>>>> + unsigned OptSizeThreshold; ///< The cost threshold for the unrolled loop
>>>> + ///< when optimizing for size.
>>>> + bool Partial; ///< Allow partial loop unrolling.
>>>> + bool Runtime; ///< Perform runtime unrolling.
>>>> + };
>>>> +
>>>> + /// \brief Get target-customized preferences for the generic loop unrolling
>>>> + /// transformation. Returns true if the UnrollingPreferences struct has been
>>>> + /// initialized.
>>>> + virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;
>>>> +
>>>
>>> Would it be possible to pass a Loop object to this function? It would
>>> be useful for the R600 target at least to be able to inspect the loop
>>> to decide what threshold to use.
>>>
>>> For example, it would probably be best for R600 not to unroll loops
>>> with barrier intrinsics, since there are several restrictions on the
>>> control flow constructs in which barriers can be used. On the other
>>> hand, with loops that index an array in the private address space,
>>> R600 may want to use a higher than normal threshold if it makes it
>>> possible for SROA to move the entire array to registers.
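
To make the request above concrete, here is a rough standalone sketch of a loop-aware hook. Nothing here is real LLVM or R600 code: MockLoop stands in for llvm::Loop and carries only the two properties mentioned, exampleGetUnrollingPreferences is a hypothetical name, the factor of 4 is an arbitrary illustrative value, and using a threshold of 0 to mean "do not unroll" is an assumption of this sketch rather than something the patch defines.

```cpp
#include <cassert>

// Stand-in for llvm::Loop, reduced to the two properties mentioned above.
struct MockLoop {
  bool HasBarrier;          // loop body contains a barrier intrinsic
  bool IndexesPrivateArray; // loop indexes an array in the private address space
};

// Mock of the preferences struct (subset of the fields in the patch).
struct UnrollingPreferences {
  unsigned Threshold; // cost threshold for the unrolled loop
  bool Partial;       // allow partial unrolling
  bool Runtime;       // allow runtime unrolling
};

// Hypothetical target hook that inspects the loop before choosing preferences.
// UP is assumed to arrive pre-filled with the target-independent defaults.
void exampleGetUnrollingPreferences(const MockLoop &L,
                                    UnrollingPreferences &UP) {
  if (L.HasBarrier) {
    // Barriers restrict the allowed control flow, so ask for no unrolling.
    // Threshold 0 as "never fits" is an assumption of this sketch.
    UP.Threshold = 0;
    UP.Partial = false;
    UP.Runtime = false;
    return;
  }
  if (L.IndexesPrivateArray) {
    // A larger budget may let SROA promote the whole array to registers.
    // The factor of 4 is arbitrary and only for illustration.
    UP.Threshold *= 4;
  }
}
```

A loop with neither property leaves the incoming defaults untouched, which is the behaviour a target that does not care about the loop would get for free.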
>>
>> I don't see why not ;) -- I've attached a revised patch with this
>> feature (and also with the ability to override the default unrolling
>> count). How's this?
>>
>> -Hal
>>
>>>
>>> -Tom
>>>
>>>
>>>> /// @}
>>>>
>>>> /// \name Scalar Target Information
>>>> diff --git a/lib/Analysis/TargetTransformInfo.cpp b/lib/Analysis/TargetTransformInfo.cpp
>>>> index 0a215aa..8a443cc 100644
>>>> --- a/lib/Analysis/TargetTransformInfo.cpp
>>>> +++ b/lib/Analysis/TargetTransformInfo.cpp
>>>> @@ -96,6 +96,11 @@ bool TargetTransformInfo::isLoweredToCall(const Function *F) const {
>>>> return PrevTTI->isLoweredToCall(F);
>>>> }
>>>>
>>>> +bool TargetTransformInfo::getUnrollingPreferences(
>>>> + UnrollingPreferences &UP) const {
>>>> + return PrevTTI->getUnrollingPreferences(UP);
>>>> +}
>>>> +
>>>> bool TargetTransformInfo::isLegalAddImmediate(int64_t Imm) const {
>>>> return PrevTTI->isLegalAddImmediate(Imm);
>>>> }
>>>> @@ -461,6 +466,10 @@ struct NoTTI : ImmutablePass, TargetTransformInfo {
>>>> return true;
>>>> }
>>>>
>>>> + virtual bool getUnrollingPreferences(UnrollingPreferences &) const {
>>>> + return false;
>>>> + }
>>>> +
>>>> bool isLegalAddImmediate(int64_t Imm) const {
>>>> return false;
>>>> }
>>>> diff --git a/lib/CodeGen/BasicTargetTransformInfo.cpp b/lib/CodeGen/BasicTargetTransformInfo.cpp
>>>> index d5340e6..e1380b7 100644
>>>> --- a/lib/CodeGen/BasicTargetTransformInfo.cpp
>>>> +++ b/lib/CodeGen/BasicTargetTransformInfo.cpp
>>>> @@ -84,6 +84,7 @@ public:
>>>> virtual unsigned getJumpBufSize() const;
>>>> virtual bool shouldBuildLookupTables() const;
>>>> virtual bool haveFastSqrt(Type *Ty) const;
>>>> + virtual bool getUnrollingPreferences(UnrollingPreferences &UP) const;
>>>>
>>>> /// @}
>>>>
>>>> @@ -189,6 +190,10 @@ bool BasicTTI::haveFastSqrt(Type *Ty) const {
>>>> return TLI->isTypeLegal(VT) &&
>>>> TLI->isOperationLegalOrCustom(ISD::FSQRT, VT);
>>>> }
>>>>
>>>> +bool BasicTTI::getUnrollingPreferences(UnrollingPreferences &) const {
>>>> + return false;
>>>> +}
>>>> +
>>>> //===----------------------------------------------------------------------===//
>>>> //
>>>> // Calls used by the vectorizers.
>>>> diff --git a/lib/Transforms/Scalar/LoopUnrollPass.cpp b/lib/Transforms/Scalar/LoopUnrollPass.cpp
>>>> index 80d060b..f8ff275 100644
>>>> --- a/lib/Transforms/Scalar/LoopUnrollPass.cpp
>>>> +++ b/lib/Transforms/Scalar/LoopUnrollPass.cpp
>>>> @@ -55,6 +55,8 @@ namespace {
>>>> CurrentAllowPartial = (P == -1) ? UnrollAllowPartial : (bool)P;
>>>>
>>>> UserThreshold = (T != -1) || (UnrollThreshold.getNumOccurrences() > 0);
>>>> + UserAllowPartial = (P != -1) ||
>>>> + (UnrollAllowPartial.getNumOccurrences() > 0);
>>>>
>>>> initializeLoopUnrollPass(*PassRegistry::getPassRegistry());
>>>> }
>>>> @@ -76,6 +78,7 @@ namespace {
>>>> unsigned CurrentThreshold;
>>>> bool CurrentAllowPartial;
>>>> bool UserThreshold; // CurrentThreshold is user-specified.
>>>> + bool UserAllowPartial; // CurrentAllowPartial is user-specified.
>>>>
>>>> bool runOnLoop(Loop *L, LPPassManager &LPM);
>>>>
>>>> @@ -145,16 +148,20 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>>>> << "] Loop %" << Header->getName() << "\n");
>>>> (void)Header;
>>>>
>>>> + TargetTransformInfo::UnrollingPreferences UP;
>>>> + bool HasUP = TTI.getUnrollingPreferences(UP);
>>>> +
>>>> // Determine the current unrolling threshold. While this is normally set
>>>> // from UnrollThreshold, it is overridden to a smaller value if the current
>>>> // function is marked as optimize-for-size, and the unroll threshold was
>>>> // not user specified.
>>>> - unsigned Threshold = CurrentThreshold;
>>>> + unsigned Threshold = (HasUP && !UserThreshold) ? UP.Threshold :
>>>> + CurrentThreshold;
>>>> if (!UserThreshold &&
>>>> Header->getParent()->getAttributes().
>>>> hasAttribute(AttributeSet::FunctionIndex,
>>>> Attribute::OptimizeForSize))
>>>> - Threshold = OptSizeUnrollThreshold;
>>>> + Threshold = HasUP ? UP.OptSizeThreshold : OptSizeUnrollThreshold;
>>>>
>>>> // Find trip count and trip multiple if count is not available
>>>> unsigned TripCount = 0;
>>>> @@ -184,6 +191,9 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>>>> Count = TripCount;
>>>> }
>>>>
>>>> + bool Runtime = (HasUP && UnrollRuntime.getNumOccurrences() == 0) ?
>>>> + UP.Runtime : UnrollRuntime;
>>>> +
>>>> // Enforce the threshold.
>>>> if (Threshold != NoThreshold) {
>>>> unsigned NumInlineCandidates;
>>>> @@ -204,7 +214,9 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>>>> if (TripCount != 1 && Size > Threshold) {
>>>> DEBUG(dbgs() << " Too large to fully unroll with count: " << Count
>>>> << " because size: " << Size << ">" << Threshold << "\n");
>>>> - if (!CurrentAllowPartial && !(UnrollRuntime && TripCount == 0)) {
>>>> + bool AllowPartial = (HasUP && !UserAllowPartial) ? UP.Partial :
>>>> + CurrentAllowPartial;
>>>> + if (!AllowPartial && !(Runtime && TripCount == 0)) {
>>>> DEBUG(dbgs() << " will not try to unroll partially because "
>>>> << "-unroll-allow-partial not given\n");
>>>> return false;
>>>> @@ -215,7 +227,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>>>> while (Count != 0 && TripCount%Count != 0)
>>>> Count--;
>>>> }
>>>> - else if (UnrollRuntime) {
>>>> + else if (Runtime) {
>>>> // Reduce unroll count to be a lower power-of-two value
>>>> while (Count != 0 && Size > Threshold) {
>>>> Count >>= 1;
>>>> @@ -231,7 +243,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPassManager &LPM) {
>>>> }
>>>>
>>>> // Unroll the loop.
>>>> - if (!UnrollLoop(L, Count, TripCount, UnrollRuntime, TripMultiple, LI, &LPM))
>>>> + if (!UnrollLoop(L, Count, TripCount, Runtime, TripMultiple, LI, &LPM))
>>>> return false;
>>>>
>>>> return true;
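
The threshold selection in the first version of the patch (the diff quoted above) can be isolated as a small pure function. This is a sketch rather than the pass itself: selectThreshold is a hypothetical name, the struct is a mock with only the two threshold fields, and the optimize-for-size attribute lookup is reduced to a plain bool parameter.

```cpp
#include <cassert>

// Mock of the preferences struct; only the two threshold fields.
struct UnrollingPreferences {
  unsigned Threshold;        // cost threshold for the unrolled loop
  unsigned OptSizeThreshold; // cost threshold when optimizing for size
};

// Mirrors the selection order in the quoted diff:
//   1. a user-specified threshold always wins;
//   2. otherwise the target's value is used when HasUP is true;
//   3. otherwise the target-independent default is used.
// The same order applies to the optimize-for-size threshold.
unsigned selectThreshold(bool HasUP, const UnrollingPreferences &UP,
                         bool UserThreshold, unsigned CurrentThreshold,
                         bool OptForSize, unsigned OptSizeUnrollThreshold) {
  unsigned Threshold =
      (HasUP && !UserThreshold) ? UP.Threshold : CurrentThreshold;
  if (!UserThreshold && OptForSize)
    Threshold = HasUP ? UP.OptSizeThreshold : OptSizeUnrollThreshold;
  return Threshold;
}
```

For example, with a target that reports Threshold 300 and OptSizeThreshold 80 (illustrative values only), a function marked optimize-for-size and no user flag selects 80, while a user-specified threshold of 200 is returned unchanged whether or not the function is optimized for size.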
>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>>
>>
>
> <unroll_tti-v3.patch>