[Mlir-commits] [mlir] [mlir][bufferization] Fix bug in bufferization of elementwise ops (PR #97209)
Matthias Springer
llvmlistbot at llvm.org
Sun Jun 30 04:57:06 PDT 2024
https://github.com/matthias-springer created https://github.com/llvm/llvm-project/pull/97209
One-Shot Bufferize has an optimization for ops that bufferize to elementwise access: in some cases, a buffer copy can be avoided. E.g.:
```
%0 = tensor.empty()
%1 = linalg.fill ...
%2 = linalg.map ins(%1, ...) outs(%1)
```
In the above example, a buffer copy is not needed for %1, even though the same buffer is read/written by two different operands (of the same op). That's because the op bufferizes to elementwise access.
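To illustrate why a single elementwise execution is safe in-place, here is a minimal C++ sketch. It is a simplified model, not the code that One-Shot Bufferize generates: `elementwiseSub`, `runOutOfPlace`, and `runInPlace` are hypothetical names, and plain `float` buffers stand in for memrefs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a bufferized elementwise op such as
// `linalg.map { arith.subf }`: out[i] = lhs[i] - rhs[i].
// `lhs` and `out` are allowed to point to the same buffer.
void elementwiseSub(const float *lhs, const float *rhs, float *out,
                    std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    // Position i is loaded before position i is stored, and no other
    // position is accessed in this step.
    out[i] = lhs[i] - rhs[i];
  }
}

// Out-of-place: the result goes into a separate buffer (as if copied).
std::vector<float> runOutOfPlace(const std::vector<float> &a,
                                 const std::vector<float> &b) {
  assert(a.size() == b.size());
  std::vector<float> result(a.size());
  elementwiseSub(a.data(), b.data(), result.data(), a.size());
  return result;
}

// In-place: the result overwrites the buffer of the first input.
std::vector<float> runInPlace(std::vector<float> a,
                              const std::vector<float> &b) {
  assert(a.size() == b.size());
  elementwiseSub(a.data(), b.data(), a.data(), a.size());
  return a;
}
```

When the op executes once, both variants return the same values, which is why the copy can be dropped in this case.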
```c++
// Two equivalent operands of the same op are not conflicting if the op
// bufferizes to element-wise access. I.e., all loads at a position
// happen before all stores to the same position.
```
This optimization cannot be applied when op dominance cannot be used to rule out conflicts, e.g., when the `linalg.map` is inside of a loop. In that case, the reads and writes happen multiple times, and it is no longer guaranteed that "all loads at a position happen before all stores to the same position."
Fixes #90019.
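The loop case can be modeled with a small C++ sketch. This is an illustration under simplifying assumptions, not the actual bufferized IR: `lastIterationOutOfPlace` and `lastIterationAliased` are hypothetical names, the vectors stand in for buffers, and at least one iteration is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Tensor semantics: the filled tensor is an immutable value, so every
// iteration computes `filled - arg2`. The result lives in its own buffer.
std::vector<float> lastIterationOutOfPlace(float fillValue,
                                           const std::vector<float> &arg2,
                                           int iterations) {
  std::vector<float> filled(arg2.size(), fillValue);
  std::vector<float> result(arg2.size(), 0.0f);
  for (int it = 0; it < iterations; ++it)
    for (std::size_t i = 0; i < arg2.size(); ++i)
      result[i] = filled[i] - arg2[i];
  return result; // Meaningful only if iterations >= 1.
}

// Faulty in-place variant: the filled input and the output share a buffer.
// From the second iteration on, the load at position i observes the store
// made by the previous iteration instead of the fill value.
std::vector<float> lastIterationAliased(float fillValue,
                                        const std::vector<float> &arg2,
                                        int iterations) {
  std::vector<float> shared(arg2.size(), fillValue);
  for (int it = 0; it < iterations; ++it)
    for (std::size_t i = 0; i < arg2.size(); ++i)
      shared[i] = shared[i] - arg2[i];
  return shared;
}
```

With one iteration the two functions agree. With more iterations the aliased variant keeps subtracting from its own earlier results, which is the kind of read-after-write conflict that dominance-based reasoning cannot rule out inside a loop.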
From b91cc91f918e61adc9e55276fa43261c1ccceb23 Mon Sep 17 00:00:00 2001
From: Matthias Springer <mspringer at nvidia.com>
Date: Sun, 30 Jun 2024 13:50:08 +0200
Subject: [PATCH] [mlir][bufferization] Fix bug in bufferization of elementwise
ops
One-Shot Bufferize has an optimization for ops that bufferize to elementwise access: in some cases, a buffer copy can be avoided. E.g.:
```
%0 = tensor.empty()
%1 = linalg.fill ...
%2 = linalg.map ins(%1, ...) outs(%1)
```
In the above example, a buffer copy is not needed for %1, even though the same buffer is read/written by two different operands (of the same op). That's because the op bufferizes to elementwise access.
```c++
// Two equivalent operands of the same op are not conflicting if the op
// bufferizes to element-wise access. I.e., all loads at a position
// happen before all stores to the same position.
```
This optimization cannot be applied when op dominance cannot be used to rule out conflicts, e.g., when the `linalg.map` is inside of a loop. In that case, the reads and writes happen multiple times, and it is no longer guaranteed that "all loads at a position happen before all stores to the same position."
Fixes #90019.
---
.../Transforms/OneShotAnalysis.cpp | 32 +++++++++----------
.../Linalg/one-shot-bufferize-analysis.mlir | 28 ++++++++++++++++
2 files changed, 44 insertions(+), 16 deletions(-)
diff --git a/mlir/lib/Dialect/Bufferization/Transforms/OneShotAnalysis.cpp b/mlir/lib/Dialect/Bufferization/Transforms/OneShotAnalysis.cpp
index d0b4e0dd4383e..975bfb4d41e0b 100644
--- a/mlir/lib/Dialect/Bufferization/Transforms/OneShotAnalysis.cpp
+++ b/mlir/lib/Dialect/Bufferization/Transforms/OneShotAnalysis.cpp
@@ -725,23 +725,23 @@ hasReadAfterWriteInterference(const DenseSet<OpOperand *> &usesRead,
                                      "mutually exclusive regions\n");
           continue;
         }
-      }
 
-      // Two equivalent operands of the same op are not conflicting if the op
-      // bufferizes to element-wise access. I.e., all loads at a position happen
-      // before all stores to the same position.
-      if (conflictingWritingOp == readingOp) {
-        if (auto bufferizableOp = options.dynCastBufferizableOp(readingOp)) {
-          if (bufferizableOp.bufferizesToElementwiseAccess(
-                  state, {uRead, uConflictingWrite})) {
-            if (hasEquivalentValueInReverseUseDefChain(
-                    state, uRead->get(), uConflictingWrite->get()) ||
-                hasEquivalentValueInReverseUseDefChain(
-                    state, uConflictingWrite->get(), uRead->get())) {
-              LLVM_DEBUG(
-                  llvm::dbgs()
-                  << " no conflict: op bufferizes to element-wise access\n");
-              continue;
+        // Two equivalent operands of the same op are not conflicting if the op
+        // bufferizes to element-wise access. I.e., all loads at a position
+        // happen before all stores to the same position.
+        if (conflictingWritingOp == readingOp) {
+          if (auto bufferizableOp = options.dynCastBufferizableOp(readingOp)) {
+            if (bufferizableOp.bufferizesToElementwiseAccess(
+                    state, {uRead, uConflictingWrite})) {
+              if (hasEquivalentValueInReverseUseDefChain(
+                      state, uRead->get(), uConflictingWrite->get()) ||
+                  hasEquivalentValueInReverseUseDefChain(
+                      state, uConflictingWrite->get(), uRead->get())) {
+                LLVM_DEBUG(
+                    llvm::dbgs()
+                    << " no conflict: op bufferizes to element-wise access\n");
+                continue;
+              }
             }
           }
         }
diff --git a/mlir/test/Dialect/Linalg/one-shot-bufferize-analysis.mlir b/mlir/test/Dialect/Linalg/one-shot-bufferize-analysis.mlir
index 2d79a80cddc2b..5b7c2baf9d84f 100644
--- a/mlir/test/Dialect/Linalg/one-shot-bufferize-analysis.mlir
+++ b/mlir/test/Dialect/Linalg/one-shot-bufferize-analysis.mlir
@@ -107,3 +107,31 @@ func.func @elementwise_no_conflict_4(%arg0: tensor<8x32x32x32xf32>, %arg1: tenso
   }
   return %r : tensor<8x32x32x32xf32>
}
+
+// -----
+
+// CHECK-LABEL: func @elementwise_access_regression(
+// CHECK: linalg.fill {__inplace_operands_attr__ = ["none", "false"]}
+// CHECK: linalg.map
+// CHECK-SAME: {__inplace_operands_attr__ = ["true", "true", "true"]}
+// CHECK: linalg.map
+// CHECK-SAME: {__inplace_operands_attr__ = ["true", "true", "true"]}
+func.func private @f(%arg: tensor<32x1xf32>) -> ()
+func.func @elementwise_access_regression(%arg0: i32, %arg2: tensor<32x1xf32>, %arg3: tensor<32x1xf32>) {
+  %cst_0 = arith.constant 0.000000e+00 : f32
+  %c0_i32 = arith.constant 0 : i32
+  %c1_i32 = arith.constant 1 : i32
+  %0 = tensor.empty() : tensor<32x1xf32>
+
+  // This op must bufferize out-of-place so that the filled tensor is not
+  // overwritten by the ops inside of the loop.
+  %1 = linalg.fill ins(%cst_0 : f32) outs(%0 : tensor<32x1xf32>) -> tensor<32x1xf32>
+
+  scf.for %arg1 = %c0_i32 to %arg0 step %c1_i32 : i32 {
+    %2 = linalg.map { arith.subf } ins(%1, %arg2 : tensor<32x1xf32>, tensor<32x1xf32>) outs(%0 : tensor<32x1xf32>)
+    %3 = tensor.empty() : tensor<32x1xf32>
+    %4 = linalg.map { arith.subf } ins(%2, %arg3 : tensor<32x1xf32>, tensor<32x1xf32>) outs(%3 : tensor<32x1xf32>)
+    func.call @f(%4) : (tensor<32x1xf32>) -> ()
+  }
+  return
+}