[llvm-dev] [RFC] Polly Status and Integration

Mon Sep 4 22:37:33 PDT 2017

On Mon, Sep 4, 2017, at 22:14, Adve, Vikram Sadanand wrote:
> Responses inline, marked with (***VSA***) because Outlook on a Mac sucks
> for inline replies!
> 
> 
> On 9/4/17, 2:14 PM, "Tobias Grosser" <tobias.grosser at inf.ethz.ch> wrote:
> 
>     On Mon, Sep 4, 2017, at 20:49, Hal Finkel via llvm-dev wrote:
>     > [tying to original thread]
>     > 
>     > On 09/04/2017 01:37 PM, Adve, Vikram Sadanand via llvm-dev wrote:
>     > > Hal, Tobias, et al. –
>     > >
>     > > I am strongly in favor of seeing a broader range of loop transformations, supported by strong dependence analysis, added to LLVM, and the Polly infrastructure seems to be by far our best bet to make that happen.  I have a couple of questions:
>     > >
>     > > 1) Integer constraint libraries like ISL (and Omega, which I used extensively in a previous project) are fundamentally solving exponential problems or worse, and can in rare cases cause compile times to explode.  How does Polly deal with this issue?
>     
>     Our integer set library has a compute out functionality. It counts
>     expensive operations such as memory allocations but also so-called
>     pivots which occur during ilp solving. In case a certain limit is
>     reached the library skips all subsequent operations and returns a
>     "null"
>     result to the user. We guard potentially expensive parts of Polly
>     (scheduling, dependence analysis) with these compute outs, and bail
>     out
>     gracefully. The important point here is that these are compute-outs
>     not
>     time-outs. Hence, the resulting binaries will be identical across
>     system
>     with differing performance (and obviously also across runs).
> 
> 
> (***VSA***) Thanks, Tobias.  This is exactly what I was looking for. 
> Using compute-outs is definitely much better than timeouts (but are
> probably more intrusive and cannot have been easy to implement, so kudos
> on that).

Thanks!

>     > > 2) Are the loop transformations composable through a reasonable API, i.e., can a tool create an arbitrary pipeline of the transformations?  I know about the Polly scheduling work, which explores flexible combinations of these, but I don’t know whether there are interfaces other tools can use to create arbitrary, fixed pipelines of such transforms.
>     
>     Loop transformations are transformations on the polyhedral schedule.
>     Tools can set an arbitrary multi-dimensional scheduling function,
>     which
>     allow individual loop iterations to be remapped into a new loop
>     structure. (With your background, this might already be clear). 
> 
> (***VSA***) Yes, that makes sense.  When a client creates such a
> scheduling function, does Polly automatically enforce correctness
> internally, or is the tool responsible for checking that separately?

When you import a schedule, we verify that the dependences are not
violated. However, this was until know mostly used as debugging tool.
More complex transformations (data-layout transformations) are possibly
not yet well covered.

> For
>     everybody who does not yet speak "polyhedral", we generally work on a
>     tree representation of the execution order, which  -- even though not
>     an
>     AST -- can be modified in a similar way, such that loop
>     transformations
>     can be applied on a per-node level. 
> 
> (***VSA***) What granularity can clients transform in this
> representation?  For example, are schedules applicable to individual IR
> instructions?  Or only to entire basic blocks?

Until recently, basic blocks. We are transitioning towards finer grained
statements.
The main road blocker here is the cost of modeling such. We enhanced our
internal scheduler to do incremental scheduling to address this issue.
Traditionally polyhedral schedulers formed a single ILP for scheduling,
where the dimensionality of this ILP
increased with the number of statements scheduled. As most of our
algorithms are
exponential in the number of dimensions, this limited scalability. The
new isl scheduler instead incrementally adds statements and can be
stopped (but is not yet),
when sufficient statements have been fused. With our new approach the
dimensionality of the ILP is bound by the number of statements to fuse.

Tobias