[llvm-dev] RFC: Import of Integer Set Library into LLVM source tree

Thu Jan 18 06:38:06 PST 2018

On Thu, Jan 18, 2018, at 15:02, Michael Kruse via llvm-dev wrote:
> 2018-01-18 6:40 GMT+01:00 Chris Lattner <clattner at nondot.org>:
> > Great, I think that that would be a fine approach: you can have the cmake logic detect which version of isl is installed and fail if it is the wrong version.  This would address my concern.
> 
> > The motivation explained in the original email seemed to hinge around the fact that Polly’s unit tests depend on accidental details of ISL that change across releases.  If you can fix that, then it seems more plausible to unbundle it.
> >
> > If you can’t fix that, and can’t replace ISL, then the best approach seems to have CMake detect and reject incompatible versions.
> 
> This would require buildbot admins to regularly install new versions of isl.
> 
> Isl implements the main loop optimization algorithms, so any change in
> isl may potentially change Polly's output. This is not accidental but
> intended, to profit from improvements in isl's loop optimization
> algorithms.

Specifically, we want to give isl the ability to exploit freedom in representing integer sets (set of linear affine constraints) as long as this preserves semantics. Such kind of simplifications are very useful when collecting assumptions (as SCEV meanwhile does). While SCEV does not yet perform many simplifications over the constraint sets it collects, isl does (and uses ILP solvers to derive these):  One of the simplifications we are doing is e.g. set coalescing: http://impact.gforge.inria.fr/impact2015/papers/impact2015-verdoolaege.pdf . Simpler representations in the end will correspond to simpler generated code e.g. for run-time checks, but means that improvements in isl will change test cases.

> With the side-effect of input/output pairs (unit tests) to
> be version-locked to a specific version of isl.
> 
> Isl currently has 177052 lines in source files. We thought about
> writing a C++ implementation of it, but then again it would take
> years, introduce new bugs and split the isl community.

Michael, Me, Sebastian, Zino and others spent some time in performance tuning isl. Compared to the original gmp version we got about 2x speedup across all our experiments and compared to the original imath version (which was really slow) I feel it was more around 7-8x. We also played with fixed-size-integer versions of isl which could give another 10-30% speedup and I would guess that further reducing malloc traffic certainly will give us another constant speedup. Some other optimizations that certainly make sense are:

- Ensure hot loops are vectorized
- Potentially version parts of isl on the data-types they work with

I feel there is certainly some non-trivial speedup to be gained. However, the largest speedups we got came over the last years from algorithmic changes. Some interesting ones that still require more work are:

- Look into sparse constraint sets
- Avoid redundant computations by caching semantic properties
- Avoid duplicate computations

In fact, we are in parts on very early experiments to replace parts of the isl stack with C++ alternatives and this seems something we should regularly reconsider.  I would like to encourage discussions if / what parts of isl would benefit from being rewritten, but I feel this is something we should do as a separate more forward looking discussion. Having isl in-tree will give us a solid baseline as foundation for such discussions.

Best,
Tobias