[PATCH] D80860: Exact integer emptiness checks for FlatAffineConstraints

Fri Jun 5 01:03:44 PDT 2020

bondhugula added inline comments.

================
Comment at: mlir/include/mlir/Analysis/Presburger/Matrix.h:66
+  unsigned nColumns;
+  std::vector<std::vector<INT>> data;
+};
----------------
bondhugula wrote:
> ftynse wrote:
> > arjunp wrote:
> > > arjunp wrote:
> > > > Kayjukh wrote:
> > > > > arjunp wrote:
> > > > > > Kayjukh wrote:
> > > > > > > Why not store the matrix data in a one-dimensional array?
> > > > > > This makes the code simpler. In that case, resizing would be more expensive, and resizing is not uncommon, so it's unclear to me if performance considerations weigh in favour of having it be one-dimensional.
> > > > > If resizing is a common operation then I agree that it can make sense to keep nested `SmallVector`s. If there is a noticeable performance impact in can always be fixed in a follow-up.
> > > > In fact, resizing wouldn't be a problem because we only resize vertically, i.e. we only ever change the number of columns. But another operation that's used frequently is `swapRows` which should also be faster under this representation, so I think we can still keep it this way for now.
> > > Sorry, I meant "we only ever change the number of rows".
> > This rationale should be documented in the code. The interplay between resizes and cache behavior in case of consecutive reads (which I suppose are much more common) is non-trivial, especially if we allow `reserve`-ing space. I'm also slightly concerned about using 8k bytes on stack by default, we rarely use more than 8 stack elements in small vectors in MLIR.
> Please avoid using vector of vector or SmallVector of SmallVector here. (A SmallVector isn't really optimized to hold other vectors or SmallVectors as its elements.) All of these would be bad enough for performance to warrant using a 1-d storage to start with. Using a single (contiguous) buffer for all the coefficients (a single allocation holding it) will give you cache locality (spatial), fewer/no conflict misses, better prefetching, and fewer TLB misses (just one in nearly all cases). You'll also get completely inline / stack allocations for small matrices. Your API won't be really impacted. FlatAffineConstraints uses such a single buffer of coefficients via a SmallVector (you  can actually copy over its relevant methods - instead of equalities and inequalities, there's just coefficients here.) Resizing isn't really a downside with this. Adding rows is straightforward since they are at the tail, and adding columns will require movement within the contiguous buffer, which is fine. FAC reserves columns to reduce movement most of the time for column addition. With that, I don't see any additional resizing or movement penalty with a 1-d buffer over vector of vectors - you have too many allocations with the latter.
> 
Esp. for mid-size matrices like 10x10 or 20x20, all of the allocs and deallocs from a smallvector of smallvector will be a heap killer!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80860/new/

https://reviews.llvm.org/D80860