[PATCH] D133625: [mlir][MemRef] Simplify extract_strided_metadata(expand_shape)

Wed Sep 14 23:08:34 PDT 2022

nicolasvasilache added inline comments.

================
Comment at: mlir/lib/Dialect/MemRef/Transforms/SimplifyExtractStridedMetadata.cpp:360
+
+          newSizeOFR = makeComposedFoldedAffineApply(
+              rewriter, origLoc,
----------------
qcolombet wrote:
> nicolasvasilache wrote:
> > I'm sorry this is really hard to follow and I doubt this code will be maintainable..
> What do you think would make the code more understandable / maintainable?
> 
> Essentially for every reassociation group we have at most one dynamic size.
> What this code does is the computation of this dynamic size as: `origSize / product(all other sizes in that group)`.
> 
> E.g., for `?x4` -> [0,1,2][3] `2x?x3x4`
> This will compute the `?` of the resulting shape as an affine.apply of `()[s0] -> (s0 / (2 * 3))`, where `s0` is the original dynamic dimension for that group.
> 
> The `productOfKnownStaticSize` holds the product of the static sizes for the related group (2 * 3 in the example), then `dynSizeInput` hold the original size for that group.
> 
> Maybe I could split the loop between the dimensions that come before the dynamic dimension and the ones that come after. That way I would not have to "propagate" a fake dynamic size expr to keep the final strideExpr stable between the dynamic and non-dynamic case, which should help readability.
```Essentially for every reassociation group we have at most one dynamic size.```

No need to worry about this immediately but this will change (e.g. https://github.com/llvm/llvm-project/issues/51564)

Yes, less deeply nested fusion would def. be a good start, then we can iterate.

================
Comment at: mlir/lib/Dialect/MemRef/Transforms/SimplifyExtractStridedMetadata.cpp:268
+
+      // Calculate product(expandShapeSize#j, for j in
+      // reassIdx+1..reassIdx+group.size-1) for each dimension of that group.
----------------
qcolombet wrote:
> nicolasvasilache wrote:
> > qcolombet wrote:
> > > nicolasvasilache wrote:
> > > > Can anything be refactored and reused from `MemRefOps.cpp::namespace saturated_arith {` ?
> > > Thanks for the pointer, I didn't know about these utilities.
> > > 
> > > At first glance, it doesn't look like it would fit here.
> > > 
> > > In particular, here we don't saturate when we see a dynamic stride/size, we replace it by 1 and patch up the expression with the proper dynamic value later.
> > > 
> > > The suffix product that we do here is similar to what `computeExpandedLayoutMap` does, except in our case we don't want to saturate.
> > > 
> > > I'm guessing the suffix product is used in a lot of places given that's how strides are computed, maybe we can refactor on that front. Though how we deal with dynamic dimensions is where things diverge and I don't know how to reconcile that at the moment.
> > > 
> > > Put differently, I don't know yet what the refactoring should look like.
> > > 
> > > Do you want me to spend more time looking into this or should we move on for now and wait for more use cases to show up?
> > I was hoping dome refactoring into simple utils would be possible because the code below feels like it's been rewritten a bunch of times in different ways.
> > 
> > For instance, I find the logic of the `hasDynamicSize` + `productOfKnownStaticSize` to be quite tricky to follow with further uses of `hasDynamicSize`.
> > I would hope that we can express this much more like functional-style applications without all these loops and levels of nesting.
> > 
> > For instance, l269 - 300 should look like:
> > ```
> > SmallVector<int64_t> suffixProductOfSizes = 
> >   suffixProduct(expandShapeType.getShape(), begin, end, [](int64_t val, int64_t acc){ return saturated_arith(...); })
> > ```
> > 
> > I made similar usability comments in https://reviews.llvm.org/D128986 but not exactly the same.
> > 
> > TL;DR I'd really prefer we avoid complex loopy logic for things that can be functional-style and should be reusable.
> Okay, let me take a closer look.
> 
> Again, unless I missed something, the saturated_arith constructs are not good in that case because IIUC it saturates on the first dynamic value, whereas I need to get the full product without the dynamic dimension.
> 
> E.g., for `2x?x3`, I need `6`, not `saturated`, because this number will let us compute the actual value of `?` from the original dimension for that group.
> 
> I feel that saturated_arith is well suited to construct the type of a shape (e.g., a stride is dynamic as soon as one of the dimension is dynamic), but not so much to compute actual values.
Yes, I see that the `saturated_arith` itself is no use when you want to construct the AffineExpr, I just wanted to give you a feel of an API that avoids deep nesting and that will be easier to maintain longer term.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133625/new/

https://reviews.llvm.org/D133625