[PATCH] Add getUnrollingPreferences to TTI

Hal Finkel hfinkel at anl.gov
Fri Sep 6 01:43:41 PDT 2013


Chandler, Andy,

I've attached an updated patch.

 - I've not changed the interface to return the structure by value (because, as noted previously, I think it is better this way).

 - I've not changed the formatting (given the formatting of the surrounding code, I think that the indentation chosen makes the most sense). 'clang-formatting'ing the whole file looks like it will provide an overall improvement, but it will be a large change, and we should do that in a separate commit. 

 - I've added more explanatory text to the interface definition.

 - I've switched around the conditionals as requested by Andy.

Please review.

Thanks again,
Hal

----- Original Message -----
> ----- Original Message -----
> > 
> > On Thu, Aug 29, 2013 at 5:54 AM, Hal Finkel < hfinkel at anl.gov >
> > wrote:
> > 
> > 
> > 
> > 
> > 
> > ----- Original Message -----
> > > ----- Original Message -----
> > > > On Wed, Aug 28, 2013 at 03:07:31PM -0500, Hal Finkel wrote:
> > > > > Nadav, et al.,
> > > > > 
> > > > > The attached patch adds the following interface to TTI:
> > 
> > I've attached an updated patch. In this version, the caller always
> > initializes the UP structure to the current target-independent
> > defaults before calling getUnrollingPreferences. This simplifies
> > the
> > implementation, and also stabilizes the interface (target code will
> > not need to be updated just because new fields are added).
> > 
> > 
> > 
> > So, I'd like Andy to chime in on whether he's comfortable with this
> > use of TTI. I think I'm OK with it, but it's a borderline case.
> > 
> > 
> > Second, I think essentially only the LoopVectorizer should use this
> > to drive its partial unrolling. This is both because it has added
> > information about when the cost is low (ILP) and when the cost is
> > unusually high. Finally, it should happen at the same phase of the
> > optimizer IMO (IE, it shouldn't drastically influence the behavior
> > of the inliner). While the LoopVectorizer is currently inside the
> > CGSCC pass pipeline, Nadav is working on moving it to live after
> > that completes.
> 
> The loop unroller's partial unrolling is different from the loop
> vectorizer's partial unrolling. The loop vectorizer specifically
> partially unrolls vectorizable loops for ILP (with loop iteration
> bodies maximally intermixed), while the loop unroller uses loop-body
> concatenation. Both are important:
> 
>  - On cores with high loop-branch overhead, partial unrolling is
>  important for all loops with small bodies, regardless of ILP
>  considerations in order to hide the backedge cost.
> 
>  - Because the loop vectorizer does not partially unroll
>  non-vectorizable loops, having only that would miss a lot of
>  important cases where loop-body concatenation can nevertheless help
>  hide instruction latency and expose ILP near the end and beginning
>  of the loop body. Especially for cores with deep pipelines (high
>  instruction latency) this is actually quite important.
> 
> In short, the ILP unrolling that the loop vectorizer does is almost
> always preferable (so long as its register pressure estimates are
> good), but it is not always possible, and the concatenation-based
> unrolling is also important.
> 
> > 
> > 
> > Some comments on the patch:
> > 
> > TargetTransformInfo.h
> > index 06810a7..8a51266 100644
> > --- a/include/llvm/Analysis/TargetTransformInfo.h
> > +++ b/include/llvm/Analysis/TargetTransformInfo.h
> > @@ -29,6 +29,7 @@
> > namespace llvm {
> > 
> > class GlobalValue;
> > +class Loop;
> > class Type;
> > class User;
> > class Value;
> > @@ -191,6 +192,23 @@ public:
> > /// incurs significant execution cost.
> > virtual bool isLoweredToCall(const Function *F) const;
> > 
> > + /// Parameters that control the generic loop unrolling
> > transformation.
> > + struct UnrollingPreferences {
> > + unsigned Threshold; ///< The cost threshold for the unrolled loop
> > + ///< (set to UINT_MAX to disable).
> > + unsigned OptSizeThreshold; ///< The cost threshold for the
> > unrolled
> > loop
> > + ///< when optimizing for size
> > + ///< (set to UINT_MAX to disable).
> > + unsigned Count; ///< Force this unrolling count (set to 0 to
> > disable).
> > 
> > 
> > I don't understand the Count at all. Can you explain this more? I
> > suspect using normal doxygen comments rather than trailing comments
> > will let you use the prose necessary to document this.
> 
> This provides a way to override the unrolling factor for the loop. As
> Tom suggested, the Loop object is provided to the callback, and
> based on some target specific analysis, a specific unrolling factor
> could be determined. I'll add more commentary.
> 
> > 
> > 
> > + bool Partial; ///< Allow partial loop unrolling.
> > + bool Runtime; ///< Perform runtime unrolling.
> > + };
> > +
> > + /// \brief Get target-customized preferences for the generic loop
> > unrolling
> > + /// transformation. The caller will initialize UP with the
> > current
> > + /// target-independent defaults.
> > + virtual void getUnrollingPreferences(Loop *L,
> > UnrollingPreferences
> > &UP) const;
> > 
> > 
> > Just return the struct by value. It's a much simpler API that way.
> 
> I disagree. In the current patch, the struct members are initialized
> to the current defaults by the unrolling pass. This way the target
> only needs to change those elements it cares about, and we don't
> need to update all targets that use the interface if we happen to
> add another structure member.
> 
> > 
> > 
> > +
> > /// @}
> > 
> > /// \name Scalar Target Information
> > diff --git a/lib/Analysis/TargetTransformInfo.cpp
> > b/lib/Analysis/TargetTransformInfo.cpp
> > index 8c6005b..b23223d 100644
> > --- a/lib/Analysis/TargetTransformInfo.cpp
> > +++ b/lib/Analysis/TargetTransformInfo.cpp
> > @@ -96,6 +96,11 @@ bool TargetTransformInfo::isLoweredToCall(const
> > Function *F) const {
> > return PrevTTI->isLoweredToCall(F);
> > }
> > 
> > +void TargetTransformInfo::getUnrollingPreferences(Loop *L,
> > + UnrollingPreferences &UP) const {
> > 
> > 
> > Can you run the patch through clang-format to fix up weird
> > indentation like this?
> 
> Hrmm... I did not think it was weird, actually it was quite
> intentional. The second line is indented 2 spaces from the function
> name on the previous line. Given all of the alternatives, that
> seemed to make the most sense to me. I'll see what clang-format
> does, maybe I'll change my mind and like it better ;)
> 
> > 
> > 
> > + PrevTTI->getUnrollingPreferences(L, UP);
> > +}
> > +
> > bool TargetTransformInfo::isLegalAddImmediate(int64_t Imm) const {
> > return PrevTTI->isLegalAddImmediate(Imm);
> > }
> > @@ -469,6 +474,8 @@ struct NoTTI : ImmutablePass,
> > TargetTransformInfo
> > {
> > return true;
> > }
> > 
> > + void getUnrollingPreferences(Loop *, UnrollingPreferences &)
> > const
> > { }
> > 
> > 
> > This doesn't look like it initializes anything?
> 
> Because the default implementation does not change anything (because,
> as documented in the header file, the structure is pre-initialized
> with the target-independent defaults).
> 
> Thanks for the review!
> 
>  -Hal
> 
> > 
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unroll_tti-v3.patch
Type: text/x-patch
Size: 8066 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130906/981bc57f/attachment.bin>


More information about the llvm-commits mailing list