[Mlir-commits] [mlir] [MLIR][SCF] Add an API to fuse consumer to a producer within scf loop (PR #88712)

Quinn Dawkins llvmlistbot at llvm.org
Fri May 24 06:44:14 PDT 2024


================
@@ -160,6 +215,21 @@ struct LinalgOpTilingInterface
     return success();
   }
 
+  FailureOr<TilingResult> getTiledImplementationFromOperandTile(
+      Operation *op, OpBuilder &b, unsigned operandNumber,
+      ArrayRef<OpFoldResult> offsets, ArrayRef<OpFoldResult> sizes) const {
+    SmallVector<OpFoldResult> mappedOffsets, mappedSizes;
+    auto tilingInterfaceOp = cast<TilingInterface>(op);
+    if (failed(tilingInterfaceOp.getIterationDomainTileFromOperandTile(
+            b, operandNumber, offsets, sizes, mappedOffsets, mappedSizes))) {
+      return emitError(
+          op->getLoc(),
+          "unable to obtain the iter domain position of the operation.");
+    }
+    return tilingInterfaceOp.getTiledImplementation(b, mappedOffsets,
+                                                    mappedSizes);
----------------
qedawkins wrote:

> Now, IIUC the problem is that getTiledImplementation currently assumes operating on the complete, untiled op which accepts full operands, and uses that to produce tiles of the operands. This assumption doesn't hold in the case when the consumer is forcibly moved in the loop. We seem to all agree that this assumptions should be lifted.

Yes, this is the core problem as I see it. Any attempt to do otherwise has the risk of either feeding the tiling interface invalid IR (non-starter), or relies on an implicit contract that `getOperandTileForIterationDomainTile` actually extracts the slice it asks for (which yes, generally seems like a reasonable requirement, but it's a subtle implicit link between interface methods that certainly feels easy to get wrong as you noted).

> My thinking here is that getTiledImplementation does too much. It should not be emitting IR that computes a slice of each operand. Instead, it should be given such slices.
> ...
> Instead consumer fusion will directly supply getTiledImplementation with the slice of the operand that is just the result of the tiled producer (before insert_slice)

+1, this was what I was grasping at with my initial comment but didn't give a conclusive suggestion. With that said, tiling implementations are often critical pieces of code and throwing in an API breakage like that in a 150+ comment PR seems like the wrong way to do it and likely needs an RFC. I don't want to block this PR on a long upstream discussion, so landing a form of `getTiledImplementationFromProblemSlice` and deprecating `getTiledImplementation` could be a way to stage it. In the interim we can then do the reverse on in-tree tiling interface implementations by making `getTiledImplementation` a trivial composition of `getOperandTileForIterationDomainTile` and `getTiledImplementationFromProblemSlice` with the expectation of removing `getTiledImplementation`.

> My preferred way to move this forward, if we all agree on the proposed decomposition of methods, is to make a separate PR that splits out getOperandTileForIterationDomainTile and uses it in the existing case, and rebase this PR on top of that. Though I could also accept these changes directly in this PR if rebasing is too involved.

I should have read further... This staging seems good to me, but would like @Abhishek-Varma to weigh in on the expected work here.

https://github.com/llvm/llvm-project/pull/88712


More information about the Mlir-commits mailing list