[Mlir-commits] [mlir] 4c807f2 - [mlir][vector] insert `alloca`s outside of loops

Alex Zinenko llvmlistbot at llvm.org
Mon Apr 25 01:49:16 PDT 2022


Author: Alex Zinenko
Date: 2022-04-25T10:49:09+02:00
New Revision: 4c807f2f579f4e5412c49c341230e309f2f79c9b

URL: https://github.com/llvm/llvm-project/commit/4c807f2f579f4e5412c49c341230e309f2f79c9b
DIFF: https://github.com/llvm/llvm-project/commit/4c807f2f579f4e5412c49c341230e309f2f79c9b.diff

LOG: [mlir][vector] insert `alloca`s outside of loops

After https://reviews.llvm.org/D119743 added the `AutomaticAllocationScope`
trait to loop-like constructs, the vector transfer full/partial splitting pass
started inserting allocations for temporaries within the closest loop rather
than the closest function (or other allocation scope such as `async.execute`).
While this is correct as long as the lowered code takes care of automatic
deallocation at the end of each iteration of the loop, this interferes with
downstream optimizations that expect `alloca`s to be at the function level.
Step over loops when looking for the closest allocation scope in the vector
transfer full/partial splitting pass, thus restoring the original behavior.
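
The scope search described above can be illustrated without MLIR. The sketch
below is not the patch itself: `ToyOp` and `closestNonLoopAllocationScope` are
hypothetical stand-ins for `mlir::Operation` and the patched helper, and the
two boolean fields are assumed substitutes for the `AutomaticAllocationScope`
trait check and the `isa<scf::ForOp, AffineForOp>` test.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for mlir::Operation; not part of the MLIR API.
struct ToyOp {
  std::string name;
  bool isAllocationScope; // models the AutomaticAllocationScope trait
  bool isLoop;            // models isa<scf::ForOp, AffineForOp>
  ToyOp *parent;          // enclosing operation, or nullptr at the top
};

// Walks up the parent chain, remembering each allocation scope it sees.
// The walk continues past loop ancestors and stops at the first ancestor
// that is not a loop, so a function enclosing a loop nest wins over the loops.
const ToyOp *closestNonLoopAllocationScope(const ToyOp *op) {
  assert(op && "expected a non-null op");
  const ToyOp *scope = nullptr;
  for (const ToyOp *p = op->parent; p != nullptr; p = p->parent) {
    if (p->isAllocationScope)
      scope = p;
    if (!p->isLoop)
      break;
  }
  return scope;
}
```

For a toy chain `func.func` > `scf.for` > `vector.transfer_read`, the helper
returns the function rather than the loop. Note that the walk ends at the
first non-loop ancestor whether or not that ancestor is an allocation scope.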

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D124366

Added: 
    

Modified: 
    mlir/lib/Dialect/Vector/Transforms/VectorTransferSplitRewritePatterns.cpp
    mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir

Removed: 
    


################################################################################
diff  --git a/mlir/lib/Dialect/Vector/Transforms/VectorTransferSplitRewritePatterns.cpp b/mlir/lib/Dialect/Vector/Transforms/VectorTransferSplitRewritePatterns.cpp
index 5e090a6ccc718..d6469910b6a76 100644
--- a/mlir/lib/Dialect/Vector/Transforms/VectorTransferSplitRewritePatterns.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/VectorTransferSplitRewritePatterns.cpp
@@ -441,8 +441,17 @@ static void createFullPartialVectorTransferWrite(RewriterBase &b,
 
 // TODO: Parallelism and threadlocal considerations with a ParallelScope trait.
 static Operation *getAutomaticAllocationScope(Operation *op) {
-  Operation *scope =
-      op->getParentWithTrait<OpTrait::AutomaticAllocationScope>();
+  // Find the closest surrounding allocation scope that is not a known looping
+  // construct (putting alloca's in loops doesn't always lower to deallocation
+  // until the end of the loop).
+  Operation *scope = nullptr;
+  for (Operation *parent = op->getParentOp(); parent != nullptr;
+       parent = parent->getParentOp()) {
+    if (parent->hasTrait<OpTrait::AutomaticAllocationScope>())
+      scope = parent;
+    if (!isa<scf::ForOp, AffineForOp>(parent))
+      break;
+  }
   assert(scope && "Expected op to be inside automatic allocation scope");
   return scope;
 }

diff  --git a/mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir b/mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir
index eb04f87d2eebf..b4abb17717f3a 100644
--- a/mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir
+++ b/mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir
@@ -412,3 +412,23 @@ func.func @transfer_read_within_async_execute(%A : memref<?x?xf32>) -> !async.to
   }
   return %token : !async.token
 }
+
+// -----
+
+func.func private @fake_side_effecting_fun(%0: vector<2x2xf32>) -> ()
+
+// Ensure that `alloca`s are inserted outside of loops even though loops are
+// considered allocation scopes.
+// CHECK-LABEL: transfer_read_within_scf_for
+func.func @transfer_read_within_scf_for(%A : memref<?x?xf32>, %lb : index, %ub : index, %step : index) {
+  %c0 = arith.constant 0 : index
+  %f0 = arith.constant 0.0 : f32
+  // CHECK: alloca
+  // CHECK: scf.for
+  // CHECK-NOT: alloca
+  scf.for %i = %lb to %ub step %step {
+    %0 = vector.transfer_read %A[%c0, %c0], %f0 : memref<?x?xf32>, vector<2x2xf32>
+    func.call @fake_side_effecting_fun(%0) : (vector<2x2xf32>) -> ()
+  }
+  return
+}


        

