[PATCH] Add getUnrollingPreferences to TTI
Hal Finkel
hfinkel at anl.gov
Wed Sep 11 12:30:28 PDT 2013
----- Original Message -----
>
> On Sep 6, 2013, at 1:43 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> > Chandler, Andy,
> >
> > I've attached an updated patch.
> >
> > - I've not changed the interface to return the structure by value
> > (because, as noted previously, I think it is better this way).
> >
> > - I've not changed the formatting (given the formatting of the
> > surrounding code, I think that the indentation chosen makes the
> > most sense). 'clang-formatting'ing the whole file looks like it
> > will provide an overall improvement, but it will be a large
> > change, and we should do that in a separate commit.
> >
> > - I've added more explanatory text to the interface definition.
> >
> > - I've switched around the conditionals as requested by Andy.
> >
> > Please review.
>
> LGTM. Thanks.
r190542, thanks!
> -Andy
>
> > Thanks again,
> > Hal
> >
> > ----- Original Message -----
> >> ----- Original Message -----
> >>>
> >>> On Thu, Aug 29, 2013 at 5:54 AM, Hal Finkel < hfinkel at anl.gov >
> >>> wrote:
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> ----- Original Message -----
> >>>> ----- Original Message -----
> >>>>> On Wed, Aug 28, 2013 at 03:07:31PM -0500, Hal Finkel wrote:
> >>>>>> Nadav, et al.,
> >>>>>>
> >>>>>> The attached patch adds the following interface to TTI:
> >>>
> >>> I've attached an updated patch. In this version, the caller
> >>> always
> >>> initializes the UP structure to the current target-independent
> >>> defaults before calling getUnrollingPreferences. This simplifies
> >>> the
> >>> implementation, and also stabilizes the interface (target code
> >>> will
> >>> not need to be updated just because new fields are added).
> >>>
> >>>
> >>>
> >>> So, I'd like Andy to chime in on whether he's comfortable with
> >>> this
> >>> use of TTI. I think I'm OK with it, but it's a borderline case.
> >>>
> >>>
> >>> Second, I think essentially only the LoopVectorizer should use
> >>> this
> >>> to drive its partial unrolling. This is both because it has added
> >>> information about when the cost is low (ILP) and when the cost is
> >>> unusually high. Finally, it should happen at the same phase of
> >>> the
> >>> optimizer IMO (IE, it shouldn't drastically influence the
> >>> behavior
> >>> of the inliner). While the LoopVectorizer is currently inside the
> >>> CGSCC pass pipeline, Nadav is working on moving it to live after
> >>> that completes.
> >>
> >> The loop unroller's partial unrolling is different from the loop
> >> vectorizer's partial unrolling. The loop vectorizer specifically
> >> partially unrolls vectorizable loops for ILP (with loop iteration
> >> bodies maximally intermixed), while the loop unroller uses
> >> loop-body
> >> concatenation. Both are important:
> >>
> >> - On cores with high loop-branch overhead, partial unrolling is
> >> important for all loops with small bodies, regardless of ILP
> >> considerations in order to hide the backedge cost.
> >>
> >> - Because the loop vectorizer does not partially unroll
> >> non-vectorizable loops, having only that would miss a lot of
> >> important cases where loop-body concatenation can nevertheless
> >> help
> >> hide instruction latency and expose ILP near the end and beginning
> >> of the loop body. Especially for cores with deep pipelines (high
> >> instruction latency) this is actually quite important.
> >>
> >> In short, the ILP unrolling that the loop vectorizer does is
> >> almost
> >> always preferable (so long as its register pressure estimates are
> >> good), but it is not always possible, and the concatenation-based
> >> unrolling is also important.
> >>
> >>>
> >>>
> >>> Some comments on the patch:
> >>>
> >>> TargetTransformInfo.h
> >>> index 06810a7..8a51266 100644
> >>> --- a/include/llvm/Analysis/TargetTransformInfo.h
> >>> +++ b/include/llvm/Analysis/TargetTransformInfo.h
> >>> @@ -29,6 +29,7 @@
> >>> namespace llvm {
> >>>
> >>> class GlobalValue;
> >>> +class Loop;
> >>> class Type;
> >>> class User;
> >>> class Value;
> >>> @@ -191,6 +192,23 @@ public:
> >>> /// incurs significant execution cost.
> >>> virtual bool isLoweredToCall(const Function *F) const;
> >>>
> >>> + /// Parameters that control the generic loop unrolling
> >>> transformation.
> >>> + struct UnrollingPreferences {
> >>> + unsigned Threshold; ///< The cost threshold for the unrolled
> >>> loop
> >>> + ///< (set to UINT_MAX to disable).
> >>> + unsigned OptSizeThreshold; ///< The cost threshold for the
> >>> unrolled
> >>> loop
> >>> + ///< when optimizing for size
> >>> + ///< (set to UINT_MAX to disable).
> >>> + unsigned Count; ///< Force this unrolling count (set to 0 to
> >>> disable).
> >>>
> >>>
> >>> I don't understand the Count at all. Can you explain this more? I
> >>> suspect using normal doxygen comments rather than trailing
> >>> comments
> >>> will let you use the prose necessary to document this.
> >>
> >> This provides a way to override the unrolling factor for the loop.
> >> As
> >> Tom suggested, the Loop object is provided to the callback, and
> >> based on some target specific analysis, a specific unrolling
> >> factor
> >> could be determined. I'll add more commentary.
> >>
> >>>
> >>>
> >>> + bool Partial; ///< Allow partial loop unrolling.
> >>> + bool Runtime; ///< Perform runtime unrolling.
> >>> + };
> >>> +
> >>> + /// \brief Get target-customized preferences for the generic
> >>> loop
> >>> unrolling
> >>> + /// transformation. The caller will initialize UP with the
> >>> current
> >>> + /// target-independent defaults.
> >>> + virtual void getUnrollingPreferences(Loop *L,
> >>> UnrollingPreferences
> >>> &UP) const;
> >>>
> >>>
> >>> Just return the struct by value. It's a much simpler API that
> >>> way.
> >>
> >> I disagree. In the current patch, the struct members are
> >> initialized
> >> to the current defaults by the unrolling pass. This way the target
> >> only needs to change those elements it cares about, and we don't
> >> need to update all targets that use the interface if we happen to
> >> add another structure member.
> >>
> >>>
> >>>
> >>> +
> >>> /// @}
> >>>
> >>> /// \name Scalar Target Information
> >>> diff --git a/lib/Analysis/TargetTransformInfo.cpp
> >>> b/lib/Analysis/TargetTransformInfo.cpp
> >>> index 8c6005b..b23223d 100644
> >>> --- a/lib/Analysis/TargetTransformInfo.cpp
> >>> +++ b/lib/Analysis/TargetTransformInfo.cpp
> >>> @@ -96,6 +96,11 @@ bool
> >>> TargetTransformInfo::isLoweredToCall(const
> >>> Function *F) const {
> >>> return PrevTTI->isLoweredToCall(F);
> >>> }
> >>>
> >>> +void TargetTransformInfo::getUnrollingPreferences(Loop *L,
> >>> + UnrollingPreferences &UP) const {
> >>>
> >>>
> >>> Can you run the patch through clang-format to fix up weird
> >>> indentation like this?
> >>
> >> Hrmm... I did not think it was weird, actually it was quite
> >> intentional. The second line is indented 2 spaces from the
> >> function
> >> name on the previous line. Given all of the alternatives, that
> >> seemed to make the most sense to me. I'll see what clang-format
> >> does, maybe I'll change my mind and like it better ;)
> >>
> >>>
> >>>
> >>> + PrevTTI->getUnrollingPreferences(L, UP);
> >>> +}
> >>> +
> >>> bool TargetTransformInfo::isLegalAddImmediate(int64_t Imm) const
> >>> {
> >>> return PrevTTI->isLegalAddImmediate(Imm);
> >>> }
> >>> @@ -469,6 +474,8 @@ struct NoTTI : ImmutablePass,
> >>> TargetTransformInfo
> >>> {
> >>> return true;
> >>> }
> >>>
> >>> + void getUnrollingPreferences(Loop *, UnrollingPreferences &)
> >>> const
> >>> { }
> >>>
> >>>
> >>> This doesn't look like it initializes anything?
> >>
> >> Because the default implementation does not change anything
> >> (because,
> >> as documented in the header file, the structure is pre-initialized
> >> with the target-independent defaults).
> >>
> >> Thanks for the review!
> >>
> >> -Hal
> >>
> >>>
> >>>
> >>
> >> --
> >> Hal Finkel
> >> Assistant Computational Scientist
> >> Leadership Computing Facility
> >> Argonne National Laboratory
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory<unroll_tti-v3.patch>
>
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
More information about the llvm-commits
mailing list