[PATCH] D133166: [mlir][MemRef] Canonicalize extract_strided_metadata(subview)

Wed Sep 7 12:12:38 PDT 2022

qcolombet marked 2 inline comments as done.
qcolombet added inline comments.

================
Comment at: mlir/lib/Dialect/MemRef/Transforms/SimplifyExtractStridedMetadata.cpp:88
+    bool hasKnownStridesAndOffset =
+        succeeded(getStridesAndOffset(sourceType, sourceStrides, sourceOffset));
+    // Helper function to get either the value of the newly created
----------------
nicolasvasilache wrote:
> qcolombet wrote:
> > Is it possible for the `getStridesAndOffset` method to fail on the source of a `subview`?
> > 
> > Put differently, how can I write a test for that part?
> If `sourceType` come from a valid subview op then you shouldn't be able to fail here.
> You may want to assert success instead ans simplify code below.
I was suspecting something like that.

Thanks for the confirmation.

================
Comment at: mlir/lib/Dialect/MemRef/Transforms/SimplifyExtractStridedMetadata.cpp:107
+      return origStrides[idx];
+    };
+
----------------
nicolasvasilache wrote:
> qcolombet wrote:
> > FWIW, I'm not super happy with this helper function, because we end up materializing (and then deleting later) constants.
> > 
> > I've tried to use the `getStridesAndOffset` function directly with the `AffineExpr` constructs, but I was not satisfied with the code.
> > 
> > Essentially, I would end up in the main loop below with something like:
> > ```
> > AffineConstantExpr cst;
> > 
> > if (known  && (cst = strideExpr.dyn_cast<AffineConstantExpr>())) {
> >   makeComposedFolded(s0 * cst, {subStride})
> > } else {
> >   makeComposedFolded(s0 * s1, {subStride, origStride})
> > }
> > ```
> > 
> > The main problem here is that the number of operands is different (`subStride` vs. `subStride, origStride`), so getting rid of the if is not easy (at least I didn't see a nice way to do it.)
> > 
> > What I tried for that was something like:
> > ```
> > Expr = s0
> > Operands = {subStride}
> > If ()
> >   Expr *= cst
> > Else {
> >   Expr *= s1
> >  Operands.push_back(origStride)
> > }
> >  makeComposedFolded(s0 * Expr, Operands)
> > ```
> > 
> > And I didn't find it particularly readable.
> I'm having trouble seeing it in this form but there must be significantly simpler ways of writing this.
> Let's apply the first round of cleanups and revisit?
With the suggested cleanups it doesn't look bad.
We still create the constants and delete them on the fly though.
If you have ideas on how to fix that, that would be great!

================
Comment at: mlir/lib/Dialect/MemRef/Transforms/SimplifyExtractStridedMetadata.cpp:120
+
+      offset = makeComposedFoldedAffineApply(
+          rewriter, origLoc, s0 + s1 * s2, {offset, subOffsets[i], newStride});
----------------
nicolasvasilache wrote:
> Iteratively composing makes it hard to control your operands and indices here.
> 
> You could do something like:
> ```
> SmallVector<Value> newStrides;
> for (unsigned i = 0; i < sourceRank; ++i) {
>   newStrides.push_back(
>     makeComposedFoldedAffineApply(rewriter, origLoc, s0 * s1, {getOrigStrideAtIdx(i), subStrides[i]}));
> }
> 
> SmallVector<Value> values(2 * sourceRank + 1);
> SmallVector<AffineExpr> symbols(2 * sourceRank + 1);
> 
> // Note: you may need to impl. bindSymbols yourself as the existing version works for variadic templates only.
> bindSymbols(symbols, ctx);
> AffineExpr expr = symbols.front();
> for (unsigned i = 0; i < sourceRank; ++i) {
>   expr = expr + symbols[1 + 2 * i] * symbols[1 + 2 * i + 1]; 
>   values.push_back(getOrigStrideAtIdx(i));
>   values.push_back(subStrides[i]);
> }
>     
> Value finalOffset = makeComposedFoldedAffineApply(rewriter, origLoc, expr, values));
> ```
> 
Great!
The order makes much more sense now and more importantly is predictable.

Thanks!

BTW, I have one loop for the new strides and one loop for the offset. Let me know if you prefer that we merge both loops.

================
Comment at: mlir/test/Dialect/MemRef/simplify-extract-strided-metadata.mlir:219
+//
+//  CHECK-DAG: %[[FINAL_OFFSET:.*]] = affine.apply #[[$OFFSET_MAP]]()[%[[DYN_OFFSET2]], %[[DYN_OFFSET1]], %[[OFFSET]], %[[DYN_OFFSET0]], %[[STRIDES]]#0, %[[DYN_STRIDE0]], %[[STRIDES]]#1, %[[DYN_STRIDE1]], %[[STRIDES]]#2, %[[DYN_STRIDE2]]]
+//
----------------
nicolasvasilache wrote:
> qcolombet wrote:
> > Nit question: Is there a way to produce an affine map, with a nice ordering of the arguments?
> > E.g., origOffset, stride0, subStride0, subOffset0, ...
> > 
> > Although the current map is correct, the arguments are all over the place :).
> See my comment above, I believe it's because you compose AffineMaps iteratively.
> You could create a flat list of AffineExpr with the proper expressions, match that to the proper value and thus better control the alignment of values.
> This would also resist canonicalizations (e.g. if 2 values are the same the corresponding symbol gets dropped and the other ones shifted preserving the original alignment on the other values).
> 
Works great!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133166/new/

https://reviews.llvm.org/D133166