[PATCH] D80860: Exact integer emptiness checks for FlatAffineConstraints

Fri Jun 5 00:30:41 PDT 2020

bondhugula requested changes to this revision.
bondhugula added a comment.
This revision now requires changes to proceed.

I am happy to see this submitted for review. An exact integer emptiness check had been missing and having one would be a really useful contribution. Although expensive in theory, for many of the use cases here (which are typically low / fixed dimensionality ones), things may be practical and fast enough. It just has be carefully used so as to not be detrimental to compile time for "standard" enough use cases - since there may be alternative faster approaches. It's nevertheless useful to have this technique at our disposal and have this maintained and optimized for use cases of interest, including for experimental/research purposes.

I strongly agree with all of @ftynse's comments; they already cover pretty much all of what I may have wanted to say. Just one additional comment below on the storage for Matrix.

================
Comment at: mlir/include/mlir/Analysis/Presburger/Matrix.h:66
+  unsigned nColumns;
+  std::vector<std::vector<INT>> data;
+};
----------------
ftynse wrote:
> arjunp wrote:
> > arjunp wrote:
> > > Kayjukh wrote:
> > > > arjunp wrote:
> > > > > Kayjukh wrote:
> > > > > > Why not store the matrix data in a one-dimensional array?
> > > > > This makes the code simpler. In that case, resizing would be more expensive, and resizing is not uncommon, so it's unclear to me if performance considerations weigh in favour of having it be one-dimensional.
> > > > If resizing is a common operation then I agree that it can make sense to keep nested `SmallVector`s. If there is a noticeable performance impact in can always be fixed in a follow-up.
> > > In fact, resizing wouldn't be a problem because we only resize vertically, i.e. we only ever change the number of columns. But another operation that's used frequently is `swapRows` which should also be faster under this representation, so I think we can still keep it this way for now.
> > Sorry, I meant "we only ever change the number of rows".
> This rationale should be documented in the code. The interplay between resizes and cache behavior in case of consecutive reads (which I suppose are much more common) is non-trivial, especially if we allow `reserve`-ing space. I'm also slightly concerned about using 8k bytes on stack by default, we rarely use more than 8 stack elements in small vectors in MLIR.
Please avoid using vector of vector or SmallVector of SmallVector here. (A SmallVector isn't really optimized to hold other vectors or SmallVectors as its elements.) All of these would be bad enough for performance to warrant using a 1-d storage to start with. Using a single (contiguous) buffer for all the coefficients (a single allocation holding it) will give you cache locality (spatial), fewer/no conflict misses, better prefetching, and fewer TLB misses (just one in nearly all cases). You'll also get completely inline / stack allocations for small matrices. Your API won't be really impacted. FlatAffineConstraints uses such a single buffer of coefficients via a SmallVector (you  can actually copy over its relevant methods - instead of equalities and inequalities, there's just coefficients here.) Resizing isn't really a downside with this. Adding rows is straightforward since they are at the tail, and adding columns will require movement within the contiguous buffer, which is fine. FAC reserves columns to reduce movement most of the time for column addition. With that, I don't see any additional resizing or movement penalty with a 1-d buffer over vector of vectors - you have too many allocations with the latter.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80860/new/

https://reviews.llvm.org/D80860