[llvm-branch-commits] [mlir] [mlir][Transforms] Add 1:N `matchAndRewrite` overload (PR #116470)

Mon Nov 18 04:04:41 PST 2024

================
@@ -1376,14 +1423,36 @@ void ConversionPatternRewriterImpl::insertNTo1Materialization(
     legalOutputType = replacements[0].getType();
   }
   if (legalOutputType && legalOutputType != originalType) {
+    UnrealizedConversionCastOp targetCastOp;
     Value targetMat = buildUnresolvedMaterialization(
         MaterializationKind::Target, computeInsertPoint(argMat), loc,
         /*inputs=*/argMat, /*outputType=*/legalOutputType,
-        /*originalType=*/originalType, converter);
+        /*originalType=*/originalType, converter, &targetCastOp);
+    if (targetCastOp)
+      nTo1TempMaterializations.insert(targetCastOp);
     mapping.map(argMat, targetMat);
   }
 }
 
+SmallVector<Value>
+ConversionPatternRewriterImpl::unpackNTo1Materialization(Value value) {
+  // Unpack unrealized_conversion_cast ops that were inserted as a N:1
+  // workaround.
+  auto castOp = value.getDefiningOp<UnrealizedConversionCastOp>();
+  if (!castOp)
+    return {value};
+  if (!nTo1TempMaterializations.contains(castOp))
+    return {value};
+  assert(castOp->getNumResults() == 1 && "expected single result");
+
+  SmallVector<Value> result;
----------------
matthias-springer wrote:

`decompose-call-graph` decomposes nested tuples in one step. There is no recursive pattern application and there are no materializations.

The recursion here is needed because the driver sometimes inserts an argument materialization and a target materialization back-to-back. We want to skip both. You can see both being created in `insertNTo1Materialization`. I added a comment that clarifies this. I also added a test case to `TestPatterns.cpp`. As part of this, I also found an issue in `remapValues` that required updating a few more patterns in `TestPatterns.cpp` to 1:N. (I pulled these fixes forward from #116524.)
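
To illustrate why the unpacking has to recurse, here is a rough, standalone sketch of the idea. It is a toy model and not the MLIR API: `ToyValue`, `ToyCast`, `TempCastSet` and `unpack` are hypothetical names invented for this example, and the set only plays a role similar to `nTo1TempMaterializations`. The sketch assumes that both back-to-back casts are registered as temporary.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy stand-ins (hypothetical, NOT the MLIR classes).
struct ToyCast;

struct ToyValue {
  int id;
  const ToyCast *definingCast; // nullptr if not produced by a cast
  explicit ToyValue(int id, const ToyCast *cast = nullptr)
      : id(id), definingCast(cast) {}
};

// Models a cast op with N inputs and a single result.
struct ToyCast {
  std::vector<const ToyValue *> inputs;
};

// Plays a role similar to `nTo1TempMaterializations` (assumption).
using TempCastSet = std::set<const ToyCast *>;

// Look through temporary casts, recursing so that a chain such as
// "target materialization of an argument materialization" is skipped
// entirely and only the underlying values remain.
std::vector<const ToyValue *> unpack(const ToyValue *value,
                                     const TempCastSet &tempCasts) {
  const ToyCast *cast = value->definingCast;
  // Not a cast result, or a cast that was not inserted as a workaround:
  // keep the value as is.
  if (!cast || tempCasts.count(cast) == 0)
    return {value};

  std::vector<const ToyValue *> result;
  for (const ToyValue *input : cast->inputs) {
    // An input may itself be the result of another temporary cast.
    std::vector<const ToyValue *> unpacked = unpack(input, tempCasts);
    result.insert(result.end(), unpacked.begin(), unpacked.end());
  }
  return result;
}
```

In this model, unpacking the result of a target cast whose only input is the result of an argument cast over two values yields those two values, provided both casts are in the set. A cast that is not in the set is left untouched.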

The use case that you mentioned is not supported at the moment. The reason is that we cannot simply traverse the use-def chain of `unrealized_conversion_cast` ops, because op replacements are materialized in a delayed fashion. I.e., by traversing the `unrealized_conversion_cast` chain, we would see one op replacement, but not the replacement of the source value. This would be difficult to implement because we would have to examine the `mapping` in this function. This issue will be fixed with #116524, when 1:N replacements are stored in the `mapping` and we no longer have to "unpack" materializations.


https://github.com/llvm/llvm-project/pull/116470