[llvm-dev] [RFC] Polly Status and Integration

Fri Sep 1 12:18:36 PDT 2017

On Fri, Sep 1, 2017 at 11:47 AM, Hal Finkel via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Hi everyone,
>
> As you may know, stock LLVM does not provide the kind of
> advanced loop transformations necessary to provide good performance on many
> applications. LLVM's Polly project provides many of the required
> capabilities, including loop transformations such as fission, fusion,
> skewing, blocking/tiling, and interchange, all powered by state-of-the-art
> dependence analysis. Polly also provides automated parallelization and
> targeting of GPUs and other accelerators.
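
For readers less familiar with the terms above, here is a minimal hand-written sketch of two of the listed transformations, interchange and tiling, applied to a dense matrix multiply. This is not Polly output: the function names, the row-major layout, and the default tile size of 32 are assumptions made purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<double>; // n x n, row-major (assumed layout)

// Baseline i-j-k order: the inner loop walks b down a column (stride n).
Matrix matmulNaive(const Matrix &a, const Matrix &b, std::size_t n) {
  Matrix c(n * n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      for (std::size_t k = 0; k < n; ++k)
        c[i * n + j] += a[i * n + k] * b[k * n + j];
  return c;
}

// Interchange of the j and k loops: the inner loop is now stride-1 in b and c.
Matrix matmulInterchanged(const Matrix &a, const Matrix &b, std::size_t n) {
  Matrix c(n * n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t k = 0; k < n; ++k)
      for (std::size_t j = 0; j < n; ++j)
        c[i * n + j] += a[i * n + k] * b[k * n + j];
  return c;
}

// Tiling (blocking) of all three loops on top of the interchanged order, so
// that each tile x tile block is reused before the nest moves on.
Matrix matmulTiled(const Matrix &a, const Matrix &b, std::size_t n,
                   std::size_t tile = 32) {
  assert(tile > 0 && "tile size must be positive");
  Matrix c(n * n, 0.0);
  for (std::size_t ii = 0; ii < n; ii += tile)
    for (std::size_t kk = 0; kk < n; kk += tile)
      for (std::size_t jj = 0; jj < n; jj += tile)
        for (std::size_t i = ii; i < std::min(ii + tile, n); ++i)
          for (std::size_t k = kk; k < std::min(kk + tile, n); ++k)
            for (std::size_t j = jj; j < std::min(jj + tile, n); ++j)
              c[i * n + j] += a[i * n + k] * b[k * n + j];
  return c;
}
```

All three variants compute the same product; they differ only in the order in which the iterations run, which is what a dependence analysis has to prove legal before a compiler can make such a change on its own.
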
>
>
> Over the past year, Polly’s development has focused on robustness,
> correctness, and closer integration with LLVM. To highlight a few
> accomplishments:
>
>
> Polly now runs, by default, in the conceptually-proper place in LLVM’s pass
> pipeline (just before the loop vectorizer). Importantly, this means that its
> loop transformations are performed after inlining and other
> canonicalization, greatly increasing its robustness, and enabling its use on
> C++ code (where [] is often a function call before inlining).
>
> Polly’s cost-modeling parameters, such as those describing the target’s
> memory hierarchy, are being integrated with TargetTransformInfo. This allows
> targets to properly override the modeling parameters and allows reuse of
> these parameters by other clients.
>
> Polly’s method of handling signed division/remainder operations, which
> worked around lack of support in ScalarEvolution, is being replaced thanks
> to improvements being contributed to ScalarEvolution itself (see D34598).
> Polly’s core delinearization routines have long been a part of LLVM itself.
>
> PolyhedralInfo, which exposes a subset of Polly’s loop analysis for use by
> other clients, is now available.
>
> Polly is now part of the LLVM release process and is being included with
> LLVM by various packagers (e.g., Debian).
>
>
> I believe that the LLVM community would benefit from beginning the process
> of integrating Polly with LLVM itself and continuing its development as part
> of our main code base. This will:
>
> Allow for wider adoption of LLVM within communities relying on advanced loop
> transformations.
>
> Provide for better community feedback on, and testing of, the code developed
> (although the story in this regard is already fairly solid).
>
> Better motivate targets to provide accurate, comprehensive, modeling
> parameters for use by advanced loop transformations.
>
> Perhaps most importantly, this will allow us to develop and tune the rest of
> the optimizer assuming that Polly’s capabilities are present (the underlying
> analysis, and eventually, the transformations themselves).
>
>
> The largest issue on which community consensus is required, in order to move
> forward at all, is what to do with isl. isl, the Integer Set Library,
> provides core functionality on which Polly depends. It is a C library, and
> while some Polly/LLVM developers are also isl developers, it has a large
> user community outside of LLVM/Polly. A C++ interface was recently added,
> and Polly is transitioning to use the C++ interface. Nevertheless, options
> here include rewriting the needed functionality, forking isl and
> transitioning our fork toward LLVM coding conventions (and data structures)
> over time, and incorporating isl more-or-less as-is to avoid partitioning
> its development.
>
>
> That having been said, isl is internally modular, and regardless of the
> overall integration strategy, the Polly developers anticipate specializing,
> or even replacing, some of these components with LLVM-specific solutions.
> This is especially true for anything that touches performance-related
> heuristics and modeling. LLVM-specific, or even target-specific, loop
> schedulers may be developed as well.
>
>
> Even though some developers in the LLVM community already have a background
> in polyhedral-modeling techniques, the Polly developers have developed, and
> are still developing, extensive tutorials on this topic
> http://pollylabs.org/education.html and especially
> http://playground.pollylabs.org.
>
>
> Finally, let me highlight a few ongoing development efforts in Polly that
> are potentially relevant to this discussion. Polly’s loop analysis is sound
> and technically superior to what’s in LLVM currently (i.e. in
> LoopAccessAnalysis and DependenceAnalysis). There are, however, two known
> reasons why Polly’s transformations could not yet be enabled by default:
>
> A correctness issue: Currently, Polly assumes that 64 bits is large enough
> for all new loop-induction variables and index expressions. In rare cases,
> transformations could be performed where more bits are required.
> Preconditions need to be generated preventing this (e.g., D35471).
>
> A performance issue: Polly currently models temporal locality (i.e., it
> tries to get better reuse in time), but does not model spatial locality
> (i.e., it does not model cache-line reuse). As a result, it can sometimes
> introduce performance regressions. Polly Labs is currently working on
> integrating spatial locality modeling into the loop optimization model.
>
> Polly can already split apart basic blocks in order to implement loop
> fusion. Heuristics to choose at which granularity are still being
> implemented (e.g., PR12402).
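
To make the correctness issue above concrete, here is a rough sketch of a transformation guarded by a runtime precondition. It uses loop coalescing as a stand-in (an assumption, chosen because the new induction variable visibly needs more bits than either original one), and it is not meant to describe what Polly or D35471 actually generate. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical precondition: true when rows * cols is representable in a
// signed 64-bit integer, so the coalesced induction variable cannot overflow.
bool coalescedBoundFits(std::int64_t rows, std::int64_t cols) {
  if (rows <= 0 || cols <= 0)
    return true; // empty iteration space, nothing to overflow
  return rows <= std::numeric_limits<std::int64_t>::max() / cols;
}

// Original two-deep nest: counts the points (i, j) with (i + j) divisible by 3.
// The residues are added instead of i and j themselves so the test cannot
// overflow either.
std::int64_t countOriginal(std::int64_t rows, std::int64_t cols) {
  std::int64_t count = 0;
  for (std::int64_t i = 0; i < rows; ++i)
    for (std::int64_t j = 0; j < cols; ++j)
      if ((i % 3 + j % 3) % 3 == 0)
        ++count;
  return count;
}

// Transformed version: one loop over k in [0, rows * cols). The product is
// only safe to compute when the precondition holds.
std::int64_t countCoalesced(std::int64_t rows, std::int64_t cols) {
  if (rows <= 0 || cols <= 0)
    return 0;
  assert(coalescedBoundFits(rows, cols) && "caller must check the precondition");
  const std::int64_t total = rows * cols;
  std::int64_t count = 0;
  for (std::int64_t k = 0; k < total; ++k) {
    const std::int64_t i = k / cols;
    const std::int64_t j = k % cols;
    if ((i % 3 + j % 3) % 3 == 0)
      ++count;
  }
  return count;
}

// Loop versioning: take the transformed loop only when the check passes,
// otherwise fall back to the original nest.
std::int64_t countVersioned(std::int64_t rows, std::int64_t cols) {
  return coalescedBoundFits(rows, cols) ? countCoalesced(rows, cols)
                                        : countOriginal(rows, cols);
}
```

The runtime check selects the transformed loop only when the wider index is known to fit in 64 bits; otherwise the original nest runs unchanged.
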
>
> I believe that we can now develop a concrete plan for moving
> state-of-the-art loop optimizations, based on the technology in the Polly
> project, into LLVM. Doing so will enable LLVM to be competitive with
> proprietary compilers in high-performance computing, machine learning, and
> other important application domains. I'd like community feedback on what
> should be part of that plan.
>
>

This, at least on paper, sounds great. I think LLVM could greatly
benefit from this information for some applications.
I have a couple of questions:
1) I'm aware there have been attempts in the past to make polyhedral
value information available to LLVM, but they've been unsuccessful.
Do you plan to develop new LLVM passes to overcome this issue?
2) As far as I can tell (last I tried, but that was a while ago),
Polly had a significant compile-time impact. Do you plan to introduce
a new -Opolly pipeline?

On the isl story: I think it would be better for Polly to be
self-contained (with LLVM implementing the needed functionality), but
I understand that's a major undertaking and it comes at a cost (i.e.,
LLVM will have to maintain this forked/reimplemented library forever
instead of leveraging upstream work). LLVM tries to minimize the
number of required dependencies, but it seems it's OK to add new
external optional deps (recently, Z3), so that could be an OK path
forward.
I don't have a strong opinion on what's the best solution here, FWIW.

Thanks,

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare