[llvm-dev] [RFC] Polly Status and Integration
Tobias Grosser via llvm-dev
llvm-dev at lists.llvm.org
Mon Sep 25 23:18:48 PDT 2017
On Wed, Sep 20, 2017, at 02:33, Johannes Doerfert wrote:
> Hi Hal, Tobias, Michael, and others,
thanks for joining the discussion.
> I'd like to add my view (and a proposal) to this discussion and I
> apologize directly for doing this so late*. I also want to apologize
> because this email is long, contains various technical details and also
> argumentations that might need more justification. However, I am happy
> to provide further information (and/or examples) to explain my views if
> We should introduce polyhedral analysis and optimization capabilities
> into LLVM step by step. Along the way we should revisit design decisions
> made by Polly and adjust them according to the use cases in the rest of
> LLVM. Finally, we should keep Polly as a stand-alone tool to allow
> research into, and prototyping of, complex loop transformations.
I very much agree with you that many design decisions will be changed in
the future. Many of the targets you mentioned are indeed goals that
have been discussed as part of the Polly development agenda and are
areas I very much would like to push forward. You also make a very valid
point that we should involve the wider LLVM community in these
discussions and at best in the overall development of polyhedral
optimizations as a whole. Hence, it seems clear that we want to keep as
much of the polyhedral development as possible in core-LLVM. I have
always been conservative in pushing prototype stuff into core-LLVM,
which is why we spent multiple years to develop very solid foundations
in both Polly and isl. While I agree that many design decisions need to
be re-evaluated, as we constantly do in LLVM, and especially in the
light of advances we pushed into isl, I think doing this jointly with
the full LLVM community is the way to go.
I also don't see your ideas to conflict with the integration of Polly.
We can always move Polly now and replace parts of it with new and better
stuff as it becomes available. In fact, Polly can serve as test-bed to
provide test coverage for such new code and me and the other PollyLabs
people are certainly happy to help.
> * There was a paper deadline end of last week and I simply had to
> Polly is a deep pipeline of passes that were developed for the sole
> purpose of applying scheduling transformations at the very end. This
> "main purpose" of Polly makes it hard to effectively reuse intermediate
> parts, e.g. the dependence analysis or the code generation framework, at
> least to the degree we might want to. In a different thread  Hal
> asked for the possibility to reuse Polly for tuning via pragmas e.g.,
> unrolling/interleaving/tiling/interchange. While this is interesting and
> generally possible it won't take long before the restrictions Polly
> places on the input code (to enable scheduling optimizations in the end)
> become a serious problem. The point here is that Polly has too many
> features and dependences for such an "easy" use case. (Note that I do
> not question the legality of a user requested transformation here.) If
> you are not looking for "end-to-end polyhedral loop optimizations" you
> want a system that answers a very specific question (e.g.,
> isVectorizable) or performs a very specific transformation (e.g, tiling)
> on a maximal set of programs. In contrast, Polly is (and I strongly
> believe it should be) developed as a system that performs complex
> optimizations, based on specialized analysis results, on a constraint
> type of programs.
While it is right that Polly has been developed as end-to-end polyhedral
optimizer, it certainly has not been developed only with end-to-end loop
optimizations in mind. In fact, many of the foundations that are today
useful to build a polyhedral based range analysis on top of isl have
been developed together with Polly and with Polly as use case (e.g., to
build the optimistic assumption model we use) to enable a wide range of
optimizations and analyses. Seing more of them popping up now is indeed
> One might argue that Polly will become more modular when it is integrated
> into LLVM and when the intermediate results are used in more and more
> places. However, I'd say this is the "wrong direction" with regards to:
> 1) incremental design and development,
> 2) maintainability of individual parts, and
> 3) the development of a suitable cost model (which Polly does not
Polly has been developed from the very first patch in the open, was
discussed on the LLVM mailing lists from day one (and started from such
a discussion), and quickly became publicly visible LLVM project. I would
almost claim it is the perfect example of incremental development. When
we started due to license reasons and also due to a lack of mature
foundations, Polly was not part of core LLVM. Over the last years we
replaced and wrote new versions of all essential components for a move
to core LLVM to become possible.
Still, I get your point. It makes certainly sense to e.g. extract out
components of Polly to build a polyhedral range analysis and provide it
to other developers. Thanks for already providing a patch that
illustrates your ideas! Siddharth already commented and I would also
like to have a look. I also feel that Mehdi would be an excellent person
to give you feedback on this, as he has significant expertise with PIPS,
which uses abstract interpretation very similar to how one would use it
for a more general polyhedral range analysis.
However, I would decouple these two goals. We could move Polly and isl
in place, and then move parts of Polly to the range analysis you propose
-- or start copying / and redeveloping parts of Polly onto the new
foundations you propose.
Later we can replace more parts. In general, there has never been an
issue with temporarily having some redundancy in LLVM and I don't think
as part of this discussion we need to lay out the full future of
polyhedral analysis and transformations in LLVM. From my perspective, we
should answer mainly the question:
1) "Do we want polyhedral analysis as part of core LLVM?"
2) "Do we want to keep significant parts of the polyhedral development
outside of core LLVM?"
My feeling is that there are several groups interested in 1), which
gives me confidence that we will -- eventually -- have polyhedral stuff
in LLVM. Hal gave pretty good arguments for 2) and I also believe
strongly that we should break the boundaries between polyhedral and core
> Let me give two examples to make my point:
> -- Example 1 --
> In addition to the applicability issues there are other problems that
> arise from the full pipeline design. Let's assume we want to apply a
> pragma driven optimization to a loop that can be represented (and
> optimized) by Polly. Depending on the pragma we only need to perform a
> very specific task, e.g., adjust the iteration variable (loop inversion)
> or introduce new loops and adjust existing iteration variables (tiling).
> Polly will however represent (and actually analyze) the whole loop nest
> including the exact control flow and all accesses. In addition it
> will compute alias checks, constraints under which the representation
> is not truthful (e.g., an access function might overflow) and a lot of
> other things that are not needed for the task at hand.
Sure. Incrementally building dependences makes in certain cases sense!
> -- Example 2 --
> We can also imagine we want to determine if a loop can be vectorized,
> thus if there are (short) loop carried dependences. No matter which
> technique is used, the complexity of a dependence analysis will
> certainly increase with the number (and complexity) of the analyzed
> accesses. Since Polly is designed with fine-grained scheduling
> optimizations in mind, the dependences it will compute are
> disproportionate with regards to the question asked. Consequently, it
> will either take more time to determine if there are loop carried
> dependences or a time-out in the computation will cause a conservatively
> negative result.
> While it is generally possible to disable some "features" of Polly or to
> "work around them", it requires new independent code paths in an already
> highly interleaved project. The explicit (and implicit) interactions
> between the different parts of Polly are plentiful and subtle. I don't
> think there are 5 people that know about more than half of them. (I also
> doubt there is no one that knows them all.)
> I want to propose an alternative plan for which I would appreciate
> feedback. The feedback is especially useful since I will present this,
> or more accurately the concrete polyhedral analyses described below, in
> the student research competition at LLVM Dev meeting in October.
Nice. Looking forward to your talk!
> The idea is simple. We rewrite parts of Polly with two goals in mind:
> 1) to provide analysis results and transformation options to LLVM, and
> 2) allow polyhedral loop transformation.
Sure. Improved granularity is certainly a good thing to have. As stated
in the initial proposal, I am glad to see Polly being ripped apart. You
> A rewrite allows to revisit various design decisions that are buried
> deep in the Polly code and that hinder its application (in any context).
> The interested reader can find a short list at the end.
> I think, we should incrementally introduce polyhedral analysis
> capabilities in order to provide immediate benefits and to allow
> developers to get involved with the design and usage from an early point
> in time. Decoupled analyses also allow easier integration, interfaces
> that come "more natural" to non-polyhedral people, and better use of the
> capabilities if polyhedral scheduling is not the goal.
Many of these ideas sound very nice. I am certain you will provide some
good interface ideas.
> To demonstrate this incremental process I started to write a polyhedral
> value analysis**. Think of it as Scalar Evolution but with piece-wise
> defined polyhedral values as the abstract domain. This analysis is
> demand driven, works on the granularity of a LLVM value, is flow
> sensitive, and especially designed to be applicable on partially affine
> programs. It already supports most of what the polyhedral model does
> allow and includes a lot of things we prototyped in Polly. It shall be
> used when Scalar Evolution is not able to provide a sufficient result.
Nice! Clearly a way in the right direction.
> On top I implemented a polyhedral memory analysis*** that is (currently)
> capable of summarizing the memory effects of a code region, e.g., a
> function. The next step is a demand-driven dependence analysis**** that
> utilizes the (often) very structured nature of a CFG. Afterwards, I
> would like to see a transformation framework that allows classical but
> also scheduling based transformations.
> While these analyses still depend on isl as a back-end, they never
> directly use it. Instead there is a C++ interface in-between that
> encapsulates the features needed. Once our use cases are clear we can
> re-implement this interface.
Interesting. This seems to complement our efforts in providing a C++
interface to isl, but with slightly different goals. I feel as if we
use isl++ instead of plain isl within your interface.
> These are a few design decisions that I think should be revisited when
> we think about polyhedral-based analysis and optimizations in LLVM:
> - We should not rely on Scalar Evolution. Instead we want to build
> polyhedral values straight from the underlying IR. While
> Scalar Evolution did certainly simplify Polly's development it
> induced various problems:
> - The lack of piece-wise affine values (other than min/max)
> limits applicability and precision. (Though conditional SCEVs
> are a small step in this direction.)
> - The flow insensitive value analysis (Scalar Evolution) causes a
> lot of "assumptions", thus versioning, for the otherwise flow
> sensitive Polly.
> - There are caching issues when parts of the program are
> transformed or when Polly is used for inter-procedural
I agree, this is certainly a good direction.
> - We should not depend on single-entry-single-exit (SESE) regions.
> Instead we analyze the code optimistically to the extend possible
> (or required by the user). Afterwards we can easily specialize the
> results to the code regions we are interested in. While SESE regions
> always caused compile time issues (due to the use of the post
> dominance tree) they also limit the analysis/transformations that
> can be performed. As an example think of accesses that are
> non-affine with regards to the smallest surrounding SESE region but
> not in a smaller code region (e.g., a conditional). If the goal is
> not loop scheduling but, for example, instruction reordering, it
> is still valuable to determine the dependences between these
> accesses in the smaller code region.
SESE regions have (to my observation) never caused compile time issues.
However, several people including Chandler have raised concerns several
times and they indeed do not fit well when e.g. mixed with the
vectorizer. Hence, I agree that our ultimate polyhedral optimizer in
LLVM should likely not use SESE regions. In practice, Polly works
reasonably pretty well with regions for now.
> - We should iteratively and selective construct polyhedral
> representations. The same holds for analysis results, e.g.
> dependences. Let's reconsider example 2 from above. We want to visit
> one memory access at a time in order to build the polyhedral
> representation and compute the new dependences. We also want to
> immediately stop if we identify a loop-carried dependence. Finally,
> we do not want to compute dependences in outer loop iterations or
> dependences that are completely contained in one loop iteration.
Interesting. For certain use cases this is clearly the right approach.
For full-fledged scheduling this may not work.
> - We should encapsulate the code generation part completely from the
> transformation part. Additionally we might want to start with
> classical loop transformations instead of full polyhedral
> scheduling. For a lot of classical loop transformations code
> generation only needs to understand/change loop induction variables
> and exit conditions. Duplication + rewriting of the loop body is
> often not necessary. Pragma driven optimizations could be easily
> done and we could also use heuristics similar to the ones we have
> today (e.g., for loop distribution). The transformations we allow
> should than grow with the scheduling decisions we allow and these
> should be limited by a yet to be determined cost model.
Sure, better encapsulation is always good. How to do this exactly will
be an interesting discussion!
> I am aware that not all Polly developers will agree with my views and
> that some are simply trade-offs that do not offer a "correct way".
> Nevertheless, I hope that the simple examples I used to point out
> various problems are enough to consider the route I proposed.
I briefly looked at your patch for the polyhedral range analysis. While
I would have preferred an initial design discussion much earlier, and
your patch needs more comments and especially a design description, it
certainly is a move in the right direction! We may also split it in
smaller pieces to make review easier. In fact, for some reason I think
here at the RISC-V upstreaming. Not that we should go at it's speed, but
it has been a perfect example of how to upstream a new backend -- with a
lot of feedback from the community. This seems to be very much aligned
with your goals. Overall, I clearly agree with your proposal of a
polyhedral range analysis and will be glad to provide more review
Your development plans for the future are overall very much aligned with
what I would like to see for polyhedral transformations in LLVM. We
already discussed some of them over the last years, but you clearly also
bring some nice "revolutionary" ideas into the game. I am really excited
to see how these will evolve! I am also glad to see you again more
active in the mailing list!
More information about the llvm-dev