[llvm-dev] RFC: Import of Integer Set Library into LLVM source tree

Tue Jan 23 22:25:42 PST 2018

Hi Hal,

I appreciate you sharing your thoughts, but you’re not really answering the questions I asked in my prior email here:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120529.html

To restate more plainly, given that we are (still!) using SVN, I consider the LLVM repo to be too monolithic and poorly layered.  It is a feature of LLVM that important related projects (like Clang and Polly) are separate SVN repos.  Furthermore, the bar to be included in the default clang optimization pipeline is high and hasn’t been met by Polly yet.  If (and when?) it is, it would be no problem for Clang to depend on both Polly and LLVM, so I don’t see a rationale for merging the repos.

I share your goal of integrating the Polly community into the more “standard” LLVM flows, but I don’t see how that necessitates integration of Polly into the LLVM repo itself.  By way of comparison, LLDB has similar problems to Polly (until recently the code formatting was totally different, some of its core development principles are different, and it is generally managed differently than a standard LLVM project).  Another example is Clang, which is run very similarly to LLVM but is also a separate repo.  No one is talking about moving those projects into the LLVM repo.

In case it helps to find common ground, I’m a huge fan of polyhedral optimizations and techniques when applied to domains on which they have high leverage.  I happen to be doing some machine learning stuff and polyhedral optimizations are extremely relevant to some of the things we are doing.  That said, I have yet to see evidence that they have broad application outside of specific niche domains (yes, like ML and HPC).  

Furthermore, while I can understand the ambition of some folks to replace the existing loop optimizer with Polly, such a proposal needs overwhelming empirical support to make happen, and it is clear that we aren’t there (yet).

RE: ISL, as I’ve said on the other thread, there are several possible solutions to how to handle ISL with various tradeoffs, but that seems completely orthogonal to whether Polly is in the LLVM repo or not: we need to answer those questions in either case.

-Chris

> On Jan 21, 2018, at 8:37 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> Hi, Nadav, Chris, et al.,
> 
> If you've not already seen it, we had a long discussion about incorporating Polly into LLVM on llvm-dev, http://lists.llvm.org/pipermail/llvm-dev/2017-September/117063.html (with a continuation in October: http://lists.llvm.org/pipermail/llvm-dev/2017-October/118125.html), with a lot of detailed information.
> I think it is important, first, that we agree on the goals of this effort. While Polly is currently used as an isolated analysis-and-transformation engine, it won't stay that way. The goal is to split, refactor, and rewrite (as needed) the current components so that we have four things:
> 
>  1. A dependence analysis driven by state-of-the-art polyhedral analysis.
>  2. A framework for easily constructing loop transformations on LLVM IR.
> 
>  3. Loop transformations built on these capabilities. These might be canonicalization transformations in addition to cost-model-driven optimization transformations.
> 
>  4. A generic (e.g., ILP-driven) scheduling capability for a wide class of transformations.
> 
> Especially as we leave the realm of inner loops, we need a better dependence analysis than what we have currently in LoopAccessAnalysis and DependenceAnalysis. Moreover, this dependence analysis needs the ability to interact with a capability for generating dynamic legality predicates (for multiversioning; something that Polly can do). Dependence analysis using polyhedral modeling is appealing for several reasons. It provides a robust solution that seems likely to be easier to maintain than an ever-expanding set of special tests. Moreover, it isolates the mathematical reasoning behind a well-defined interface, allowing for proper separation of concerns.
> Working with loops in LLVM currently is unnecessarily difficult. Just last week I was writing a transformation which split a loop into several consecutive loops (where each of the p loops had ~n/p iterations). This seems like a simple task, but even using (what I believe to be) our best utilities from SCEV, LoopUtils, etc., by the time I dealt with multiple induction variables, reductions, and more, plus keeping needed analyses up to date, this transformation took hundreds of lines of code. We need a better way of doing this. One of the capabilities within Polly is, essentially, an infrastructure for rewriting loop nests. Our plan includes refactoring this capability into a generic loop-rewriting utility that can be used to implement a variety of different loop transformations. This should make our loop transformations, current and future, easier to write, more robust, and easier to maintain.
> 
> One thing we can do immediately with this infrastructure, even in its current form, is provide a user-driven (i.e., metadata-directed) set of transformations encompassing nearly all common loop-nest transforms (interchange, fission/fusion, skewing, tiling). Importantly, we can do this not only where the user asserts safety but, using polyhedral analysis, also where we generate legality predicates. This would be a significant step toward placing LLVM-based compilers in the top tier of HPC compilers, but having such capabilities only conditionally supported in the backend would be an unfortunate complication for our users.
> 
> The automated loop-nest rewriting (using an ILP-solver to pick some optimal schedule), which is what many people might think of as the primary usage modality of polyhedral loop optimizations, and thus Polly, is only a part of this overall program for improving LLVM's infrastructure for handling loops. This is an important capability, but among many other factors, depends on good cost modeling (an area in which Polly currently needs work). If at some point the cost modeling, etc. improves to the point where we could have something like this in the default optimization pipeline, that would be great (from my perspective), but that's only a small potential benefit from this integration.
> 
> In addition to these technical aspects, there are community considerations, both with respect to Polly and with respect to the isl library on which it depends. Obviously there are different ways we can develop these capabilities in LLVM, but I prefer a way that takes as much of the existing Polly community as possible and merges it into the LLVM community. This requires preserving Polly's current capabilities during this development process, and ideally, not partitioning between the "old" separate-repository code base and the "new" in-tree code base. The tricky part here is what to do with isl. To be clear, isl has already been relicensed once with the specific goal of better LLVM integration, the dependency on GMP has been removed, and further changes along these lines are not unthinkable. isl has its own developer community, but many of the core Polly developers are also significant isl contributors, and so there's significant overlap. The remainder of the isl community is a significant resource, and whether or not we wish to separate it from our development in this area is something we should specifically consider. I personally suspect that we'll end up rewriting a lot of what's in isl eventually, but we'd need to figure out what to do in the meantime. isl is a moving target, like LLVM itself, and given the developer overlap between Polly and isl, I don't think it makes sense to think of isl as some kind of truly external dependency. We'd likely need to version lock anyway, so having isl in a separate repository on separate servers seems like a system providing all of the disadvantages of an external dependency with few/none of the benefits. This is why Polly currently has the needed version of isl in its repository, and in order to move Polly into LLVM itself, moving isl seems like the first step.
> Finally, LLVM is a major dependency of many external projects, and binary size is certainly a concern. In general, we need a better way to avoid compiling transformations that a particular project does not need. Polyhedral, or any other, loop transformations are no different from CFL alias analysis, sanitizer instrumentation passes, or any number of other things in this regard. We should have a better infrastructure for partial builds, but we shouldn't hold up any particular contribution because of deficiencies in this area. Also, I believe that the infrastructure added will be useful for transformations important to many different frontends (e.g., bounds-check removal and canonicalizing transformations).
> Thanks again,
> Hal
> 
> On 01/20/2018 12:47 PM, Nadav Rotem via llvm-dev wrote:
>> 
>> Hi Tobi, 
>> 
>> I have some concerns about adding Polly into LLVM proper. I think that it's great that Polly is a part of the LLVM umbrella of projects, like Clang and LLDB. However, I am not convinced that Polly belongs in the LLVM compiler library. LLVM is a major dependency for so many external projects. Rust, Swift, GPU drivers by different vendors, and JIT compilers all rely on LLVM. Projects that depend on LLVM are going to pay the cost of adding Polly into the LLVM library. These projects operate under a different set of constraints and optimize different metrics. One of my main concerns is binary size. The size of the LLVM compiler library matters, especially on mobile, especially for JIT compilers. Growing the size of the LLVM binary increases the app load time (because the shared object needs to be read from disk). Moreover, the current size of the LLVM library prevents people from bundling a copy of LLVM with mobile apps because app stores limit the size of apps. Yes, I know that it's possible to disable Polly in production scenarios, but this looks like an unnecessary hurdle. 
>>   
>> Would it be possible to use the LLVM plugin mechanism to load Polly dynamically into the clang pass manager instead of integrating it into the LLVM tree? 
>> 
>> Thanks,
>> Nadav 
>> 
>> 
>> On Jan 15, 2018, at 01:33 PM, Tobias Grosser via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> 
>>> Dear LLVM community,
>>> 
>>> I hope all of you had a good start to 2018 and a quiet branching of LLVM 6.0.
>>> 
>>> With the latest LLVM release out of the way and a longer development phase starting, we would like to restart the process of including Polly and isl into core LLVM to bring changes in early on before the next LLVM release.
>>> 
>>> Short summary:
>>> 
>>> * Today Polly is already part of each LLVM release (and will be shipping with LLVM 6.0) for everybody to try, with conservative defaults.
>>> * We proposed to include Polly and isl in LLVM to provide modern high-level loop optimizations in LLVM.
>>> * We suggested developing Polly and isl as part of core LLVM to make interactions with the core LLVM community easier and to allow us to better integrate Polly with the new pass manager.
>>> 
>>> Let me briefly summarize the current status:
>>> 
>>> * Michael sent out an official email to discuss how to best include isl into LLVM
>>> (http://lists.llvm.org/pipermail/llvm-dev/2018-January/120408.html)
>>> * We sent out the LLVM developers meeting notes (http://lists.llvm.org/pipermail/llvm-dev/2018-January/120419.html)
>>> * Philip Pfaffe prepared a preliminary patch set for integrating Polly closer into LLVM:
>>> https://github.com/pfaffe/llvm-project-20170507/commits/merge-polly-into-upstream
>>> (further cleanup needed)
>>> * We are working further with ARM (Florian Hahn and Francesco) to upstream the inliner changes needed for the end-to-end optimization of SPEC 2006 libquantum: https://reviews.llvm.org/D38585
>>> * Oleksandr, Sven and Manasij Mukherjee started to look into spatial locality
>>> * We worked on expanding the isl C++ bindings (http://repo.or.cz/isl.git/shortlog). While a first set of patches is already open, further patches will follow over the next couple of weeks.
>>> 
>>> Let me briefly summarize the LLVM developer meeting comments on our proposal (subjective summary):
>>> 
>>> * Most people were interested in having polyhedral loop optimizations being part of LLVM.
>>> * Ideas for uses of isl beyond polyhedral loop scheduling were raised (e.g., for polyhedral value analysis, dependence analysis, or broader assumption tracking). Others were interested in the use of polyhedral loop optimization with “learned” scheduling strategies.
>>> * Specific concerns were raised that an integration of Polly into LLVM may be an implicit choice of LLVM's loop optimization future. This is not the case. While Polly is today the only end-to-end high-level loop optimization, other approaches can and should be explored (e.g., can there be synergies with the region vectorizer?).
>>> * How stable/fast/… is Polly today?
>>>   * We build all of AOSP with rather restrictive compile-time limits
>>>   * Bootstrapping time of clang regresses by at most 6%
>>>   * Removal of scalar dependences is today very generic and must be sped up in the future
>>>   * Polly still shows up at the top of the middle-end, but larger compile-time regressions are often due to increased code size (and the LLVM backend)
>>>   * We see non-trivial speedups for hmmer, libquantum, and various linear-algebra kernels (we use gemm-specific optimizations). The first two require additional flags to be enabled.
>>> 
>>> The precise inclusion agenda has been presented here:
>>> 
>>> http://lists.llvm.org/pipermail/llvm-dev/2017-September/117698.html
>>> 
>>> After the communities have merged, I suggest forming a loop optimization working group which jointly discusses how LLVM’s loop optimizations should evolve.
>>> 
>>> I would like to invite comments regarding this proposal.
>>> Are there any specific concerns we should address before requesting the initial svn move?
>>> 
>>> Best,
>>> Tobias
>>> 
>>> 
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> 
> 
> -- 
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
