[PATCH] Allow BB duplication threshold to be adjusted through JumpThreading's ctor

Tue Sep 30 00:38:42 PDT 2014

----- Original Message -----
> From: "Owen Anderson" <resistor at mac.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: reviews+D5444+public+de6f72cb2e4729d3 at reviews.llvm.org, llvm-commits at cs.uiuc.edu
> Sent: Monday, September 29, 2014 11:34:46 PM
> Subject: Re: [PATCH] Allow BB duplication threshold to be adjusted through JumpThreading's ctor
> 
> 
> 
> 
> On Sep 29, 2014, at 5:10 PM, Hal Finkel < hfinkel at anl.gov > wrote:
> 
> 
> For updating that threshold from TTI, yeah, if we are interested in
> that case. I could come up with another patch considering both TTI
> and a user-specified threshold.
> 
> I suppose that I don't understand what you mean by "if we are
> interested." Generally speaking, ctor parameters are useful only for
> clients who are not using the standard optimization pipeline, and
> we'd like the standard optimization pipeline to generally work well
> for a wide range of targets. Thus, a TTI interface is preferred.
> 
> I strongly disagree with this reasoning. We’re discussing GPU
> compilers, which are typically on-line compilers with sensitive
> compile-time constraints. The default pass pipelines are completely
> unsuitable for that use case. Allowing the client to make the
> decision at pass pipeline construction time is very important, since
> only the client knows if it is compiling in a time sensitive
> environment or not.
> 

I understand what you're saying, and in many cases I think you'd be correct, but not in this one. Many of our passes bundle several separable transformations. In some of these cases, the loop unroller for example, where the transformation occurs late and/or does not affect canonicalization, it makes perfect sense to allow finer control over which of the bundled transformations a particular pass performs (and, for compile time, how aggressive it is). We currently do this with the loop unroller, the vectorizer, etc., and we should do more in this direction.

However, this threshold is really a target cost-modeling parameter: it models a trade-off that is describable at the IR level, and so it belongs in the target cost model. Embedding it in a custom pass manager builder does not encourage the correct separation of concerns. It also introduces another heuristic threshold that each development team will need to re-tune separately whenever JumpThreading's cost function and/or any relevant later lowering changes, and that threshold has no principled relationship to compile time (or, currently, to much of anything else).

It is true that the loop unroller provides ctor parameters that also affect cost thresholds, but that's a legacy of its pre-TTI design.
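
For concreteness, here is a rough sketch of how a ctor override and a target-provided default could coexist, along the lines of the "both TTI and user-specified threshold" patch mentioned above. Everything in it is hypothetical: TargetCostModelSketch, getBBDuplicationThreshold and JumpThreadingSketch are illustrative names, not the existing TargetTransformInfo or JumpThreading interfaces, and the "-1 means unset" convention is an assumption made for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for a target cost model; this is NOT LLVM's
// TargetTransformInfo, just enough of a shape to show the idea.
struct TargetCostModelSketch {
  unsigned BBDupThreshold; // target's preferred duplication limit

  unsigned getBBDuplicationThreshold() const { return BBDupThreshold; }
};

// Hypothetical pass skeleton showing only how the threshold is resolved.
class JumpThreadingSketch {
  int UserThreshold; // -1 means "not specified by the client" (assumption)

public:
  explicit JumpThreadingSketch(int T = -1) : UserThreshold(T) {
    assert(T >= -1 && "threshold must be -1 (unset) or non-negative");
  }

  // An explicit ctor value wins; otherwise defer to the target cost model.
  unsigned resolveThreshold(const TargetCostModelSketch &TCM) const {
    if (UserThreshold >= 0)
      return static_cast<unsigned>(UserThreshold);
    return TCM.getBBDuplicationThreshold();
  }
};
```

With this shape, a client that builds its own pipeline can still pass an explicit value, while the standard pipeline constructs the pass with no argument and picks up whatever the target reports.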

 -Hal

> 
> —Owen
> 
> 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory