[Mlir-commits] [mlir] [mlir][Linalg]: Optimize any structured linalg operation in transform::PromoteOp to avoid unnecessary copies (PR #69876)
Aviad Cohen
llvmlistbot at llvm.org
Mon Oct 30 22:49:51 PDT 2023
================
@@ -54,7 +54,8 @@ func.func @matmul_f32(%A: memref<?xi8>, %M: index, %N: index, %K: index) {
// CHECK: linalg.copy ins(%[[vA]] : memref<?x?xf32, strided<[?, 1], offset: ?>>) outs(%[[partialA]] : memref<?x?xf32, strided<[?, 1], offset: ?>>)
// CHECK: linalg.copy ins(%[[vB]] : memref<?x?xf32, strided<[?, 1], offset: ?>>) outs(%[[partialB]] : memref<?x?xf32, strided<[?, 1], offset: ?>>)
-// CHECK: linalg.copy ins(%[[vC]] : memref<?x?xf32, strided<[?, 1], offset: ?>>) outs(%[[partialC]] : memref<?x?xf32, strided<[?, 1], offset: ?>>)
+// CHECK-NOT: linalg.copy ins(%[[vC]] : memref<?x?xf32, strided<[?, 1], offset: ?>>) outs(%[[partialC]] : memref<?x?xf32, strided<[?, 1], offset: ?>>)
----------------
AviadCo wrote:
Why do we need to promote the output? Does matmul use the outputs in its calculation? If so, I believe we want it to be `InitTensor`.
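For context, a minimal sketch of the semantics in question (the shapes are illustrative, not taken from this test): `linalg.matmul` accumulates into its output, computing `C[i, j] += A[i, k] * B[k, j]`, so the output operand is read as well as written.

```mlir
func.func @matmul(%A: memref<4x8xf32>, %B: memref<8x16xf32>,
                  %C: memref<4x16xf32>) {
  // The implicit region of linalg.matmul performs
  //   C[i, j] += A[i, k] * B[k, j]
  // so %C is both an input to the computation and its destination.
  linalg.matmul ins(%A, %B : memref<4x8xf32>, memref<8x16xf32>)
                outs(%C : memref<4x16xf32>)
  return
}
```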
https://github.com/llvm/llvm-project/pull/69876