[Mlir-commits] [mlir] [mlir][bufferization] Remove `buffer-deallocation` pass (PR #126366)
llvmlistbot at llvm.org
Sat Feb 8 02:55:47 PST 2025
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-mlir-bufferization
Author: Matthias Springer (matthias-springer)
<details>
<summary>Changes</summary>
The `-buffer-deallocation` pass is not compatible with One-Shot Bufferize and was replaced by the Ownership-based Buffer Deallocation pass about 1.5 years ago. To clean up the code base, this commit removes the deprecated `buffer-deallocation` pass. All uses of this deprecated pass within MLIR have already been migrated.
Note for LLVM integration: If you depend on this pass, migrate to the Ownership-based Buffer Deallocation pass or copy the pass to your codebase. For details, see https://discourse.llvm.org/t/psa-bufferization-new-buffer-deallocation-pipeline/73375.
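
For downstream projects, a first migration step is finding where the removed pass is still requested. The following is a minimal sketch, not part of this PR: the helper name `find_removed_pass_uses` and the regular expression are assumptions made for illustration. It flags `buffer-deallocation` only when it appears as a standalone pass name, so `-ownership-based-buffer-deallocation` and `-buffer-deallocation-pipeline` are not reported. Check the flag names against `mlir-opt --help` for your LLVM revision before relying on it.

```python
import re

# Assumption: the removed pass appears either as a CLI flag (-buffer-deallocation,
# --buffer-deallocation) or as a bare name inside a textual pass pipeline.
# The lookarounds reject longer names that merely contain it, and file names
# such as buffer-deallocation.mlir.
REMOVED_PASS = re.compile(r"(?<![\w-])(?:--?)?buffer-deallocation(?![\w.-])")


def find_removed_pass_uses(text):
    """Return (line_number, stripped_line) pairs that still request the removed pass."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), 1)
        if REMOVED_PASS.search(line)
    ]


# Hypothetical RUN lines, made up for this example.
sample = """\
// RUN: mlir-opt %s -buffer-deallocation | FileCheck %s
// RUN: mlir-opt %s -ownership-based-buffer-deallocation | FileCheck %s
// RUN: mlir-opt %s -buffer-deallocation-pipeline | FileCheck %s
// RUN: mlir-opt %s --pass-pipeline='builtin.module(func.func(buffer-deallocation))'
"""

hits = find_removed_pass_uses(sample)
print([number for number, _ in hits])  # lines 1 and 4 still use the removed pass
```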
---
Patch is 221.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/126366.diff
13 Files Affected:
- (removed) mlir/docs/BufferDeallocationInternals.md (-705)
- (modified) mlir/docs/OwnershipBasedBufferDeallocation.md (+1-3)
- (removed) mlir/docs/includes/img/branch_example_post_move.svg (-419)
- (removed) mlir/docs/includes/img/branch_example_pre_move.svg (-409)
- (removed) mlir/docs/includes/img/nested_branch_example_post_move.svg (-759)
- (removed) mlir/docs/includes/img/nested_branch_example_pre_move.svg (-717)
- (removed) mlir/docs/includes/img/region_branch_example_pre_move.svg (-435)
- (modified) mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.h (-7)
- (modified) mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td (+3-75)
- (removed) mlir/lib/Dialect/Bufferization/Transforms/BufferDeallocation.cpp (-693)
- (modified) mlir/lib/Dialect/Bufferization/Transforms/CMakeLists.txt (-1)
- (removed) mlir/test/Dialect/Bufferization/Transforms/buffer-deallocation.mlir (-1462)
- (modified) mlir/test/Pass/pipeline-invalid.mlir (+2-2)
``````````diff
diff --git a/mlir/docs/BufferDeallocationInternals.md b/mlir/docs/BufferDeallocationInternals.md
deleted file mode 100644
index 00830ba9d2dc2e3..000000000000000
--- a/mlir/docs/BufferDeallocationInternals.md
+++ /dev/null
@@ -1,705 +0,0 @@
-# Buffer Deallocation - Internals
-
-**Note:** This pass is deprecated. Please use the ownership-based buffer
-deallocation pass instead.
-
-This section covers the internal functionality of the BufferDeallocation
-transformation. The transformation consists of several passes. The main pass
-called BufferDeallocation can be applied via “-buffer-deallocation” on MLIR
-programs.
-
-[TOC]
-
-## Requirements
-
-In order to use BufferDeallocation on an arbitrary dialect, several control-flow
-interfaces have to be implemented when using custom operations. This is
-particularly important to understand the implicit control-flow dependencies
-between different parts of the input program. Without implementing the following
-interfaces, control-flow relations cannot be discovered properly and the
-resulting program can become invalid:
-
-* Branch-like terminators should implement the `BranchOpInterface` to query
- and manipulate associated operands.
-* Operations involving structured control flow have to implement the
- `RegionBranchOpInterface` to model inter-region control flow.
-* Terminators yielding values to their parent operation (in particular in the
- scope of nested regions within `RegionBranchOpInterface`-based operations),
- should implement the `ReturnLike` trait to represent logical “value
- returns”.
-
-Example dialects that are fully compatible are the “std” and “scf” dialects with
-respect to all implemented interfaces.
-
-During Bufferization, we convert immutable value types (tensors) to mutable
-types (memref). This conversion is done in several steps and in all of these
-steps the IR has to fulfill SSA like properties. The usage of memref has to be
-in the following consecutive order: allocation, write-buffer, read- buffer. In
-this case, there are only buffer reads allowed after the initial full buffer
-write is done. In particular, there must be no partial write to a buffer after
-the initial write has been finished. However, partial writes in the initializing
-is allowed (fill buffer step by step in a loop e.g.). This means, all buffer
-writes needs to dominate all buffer reads.
-
-Example for breaking the invariant:
-
-```mlir
-func.func @condBranch(%arg0: i1, %arg1: memref<2xf32>) {
- %0 = memref.alloc() : memref<2xf32>
- cf.cond_br %arg0, ^bb1, ^bb2
-^bb1:
- cf.br ^bb3()
-^bb2:
- partial_write(%0, %0)
- cf.br ^bb3()
-^bb3():
- test.copy(%0, %arg1) : (memref<2xf32>, memref<2xf32>) -> ()
- return
-}
-```
-
-The maintenance of the SSA like properties is only needed in the bufferization
-process. Afterwards, for example in optimization processes, the property is no
-longer needed.
-
-## Detection of Buffer Allocations
-
-The first step of the BufferDeallocation transformation is to identify
-manageable allocation operations that implement the `SideEffects` interface.
-Furthermore, these ops need to apply the effect `MemoryEffects::Allocate` to a
-particular result value while not using the resource
-`SideEffects::AutomaticAllocationScopeResource` (since it is currently reserved
-for allocations, like `Alloca` that will be automatically deallocated by a
-parent scope). Allocations that have not been detected in this phase will not be
-tracked internally, and thus, not deallocated automatically. However,
-BufferDeallocation is fully compatible with “hybrid” setups in which tracked and
-untracked allocations are mixed:
-
-```mlir
-func.func @mixedAllocation(%arg0: i1) {
- %0 = memref.alloca() : memref<2xf32> // aliases: %2
- %1 = memref.alloc() : memref<2xf32> // aliases: %2
- cf.cond_br %arg0, ^bb1, ^bb2
-^bb1:
- use(%0)
- cf.br ^bb3(%0 : memref<2xf32>)
-^bb2:
- use(%1)
- cf.br ^bb3(%1 : memref<2xf32>)
-^bb3(%2: memref<2xf32>):
- ...
-}
-```
-
-Example of using a conditional branch with alloc and alloca. BufferDeallocation
-can detect and handle the different allocation types that might be intermixed.
-
-Note: the current version does not support allocation operations returning
-multiple result buffers.
-
-## Conversion from AllocOp to AllocaOp
-
-The PromoteBuffersToStack-pass converts AllocOps to AllocaOps, if possible. In
-some cases, it can be useful to use such stack-based buffers instead of
-heap-based buffers. The conversion is restricted to several constraints like:
-
-* Control flow
-* Buffer Size
-* Dynamic Size
-
-If a buffer is leaving a block, we are not allowed to convert it into an alloca.
-If the size of the buffer is large, we could convert it, but regarding stack
-overflow, it makes sense to limit the size of these buffers and only convert
-small ones. The size can be set via a pass option. The current default value is
-1KB. Furthermore, we can not convert buffers with dynamic size, since the
-dimension is not known a priori.
-
-## Movement and Placement of Allocations
-
-Using the buffer hoisting pass, all buffer allocations are moved as far upwards
-as possible in order to group them and make upcoming optimizations easier by
-limiting the search space. Such a movement is shown in the following graphs. In
-addition, we are able to statically free an alloc, if we move it into a
-dominator of all of its uses. This simplifies further optimizations (e.g. buffer
-fusion) in the future. However, movement of allocations is limited by external
-data dependencies (in particular in the case of allocations of dynamically
-shaped types). Furthermore, allocations can be moved out of nested regions, if
-necessary. In order to move allocations to valid locations with respect to their
-uses only, we leverage Liveness information.
-
-The following code snippets shows a conditional branch before running the
-BufferHoisting pass:
-
-
-
-```mlir
-func.func @condBranch(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
- cf.cond_br %arg0, ^bb1, ^bb2
-^bb1:
- cf.br ^bb3(%arg1 : memref<2xf32>)
-^bb2:
- %0 = memref.alloc() : memref<2xf32> // aliases: %1
- use(%0)
- cf.br ^bb3(%0 : memref<2xf32>)
-^bb3(%1: memref<2xf32>): // %1 could be %0 or %arg1
- test.copy(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
- return
-}
-```
-
-Applying the BufferHoisting pass on this program results in the following piece
-of code:
-
-
-
-```mlir
-func.func @condBranch(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
- %0 = memref.alloc() : memref<2xf32> // moved to bb0
- cf.cond_br %arg0, ^bb1, ^bb2
-^bb1:
- cf.br ^bb3(%arg1 : memref<2xf32>)
-^bb2:
- use(%0)
- cf.br ^bb3(%0 : memref<2xf32>)
-^bb3(%1: memref<2xf32>):
- test.copy(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
- return
-}
-```
-
-The alloc is moved from bb2 to the beginning and it is passed as an argument to
-bb3.
-
-The following example demonstrates an allocation using dynamically shaped types.
-Due to the data dependency of the allocation to %0, we cannot move the
-allocation out of bb2 in this case:
-
-```mlir
-func.func @condBranchDynamicType(
- %arg0: i1,
- %arg1: memref<?xf32>,
- %arg2: memref<?xf32>,
- %arg3: index) {
- cf.cond_br %arg0, ^bb1, ^bb2(%arg3: index)
-^bb1:
- cf.br ^bb3(%arg1 : memref<?xf32>)
-^bb2(%0: index):
- %1 = memref.alloc(%0) : memref<?xf32> // cannot be moved upwards to the data
- // dependency to %0
- use(%1)
- cf.br ^bb3(%1 : memref<?xf32>)
-^bb3(%2: memref<?xf32>):
- test.copy(%2, %arg2) : (memref<?xf32>, memref<?xf32>) -> ()
- return
-}
-```
-
-## Introduction of Clones
-
-In order to guarantee that all allocated buffers are freed properly, we have to
-pay attention to the control flow and all potential aliases a buffer allocation
-can have. Since not all allocations can be safely freed with respect to their
-aliases (see the following code snippet), it is often required to introduce
-copies to eliminate them. Consider the following example in which the
-allocations have already been placed:
-
-```mlir
-func.func @branch(%arg0: i1) {
- %0 = memref.alloc() : memref<2xf32> // aliases: %2
- cf.cond_br %arg0, ^bb1, ^bb2
-^bb1:
- %1 = memref.alloc() : memref<2xf32> // resides here for demonstration purposes
- // aliases: %2
- cf.br ^bb3(%1 : memref<2xf32>)
-^bb2:
- use(%0)
- cf.br ^bb3(%0 : memref<2xf32>)
-^bb3(%2: memref<2xf32>):
- …
- return
-}
-```
-
-The first alloc can be safely freed after the live range of its post-dominator
-block (bb3). The alloc in bb1 has an alias %2 in bb3 that also keeps this buffer
-alive until the end of bb3. Since we cannot determine the actual branches that
-will be taken at runtime, we have to ensure that all buffers are freed correctly
-in bb3 regardless of the branches we will take to reach the exit block. This
-makes it necessary to introduce a copy for %2, which allows us to free %alloc0
-in bb0 and %alloc1 in bb1. Afterwards, we can continue processing all aliases of
-%2 (none in this case) and we can safely free %2 at the end of the sample
-program. This sample demonstrates that not all allocations can be safely freed
-in their associated post-dominator blocks. Instead, we have to pay attention to
-all of their aliases.
-
-Applying the BufferDeallocation pass to the program above yields the following
-result:
-
-```mlir
-func.func @branch(%arg0: i1) {
- %0 = memref.alloc() : memref<2xf32>
- cf.cond_br %arg0, ^bb1, ^bb2
-^bb1:
- %1 = memref.alloc() : memref<2xf32>
- %3 = bufferization.clone %1 : (memref<2xf32>) -> (memref<2xf32>)
- memref.dealloc %1 : memref<2xf32> // %1 can be safely freed here
- cf.br ^bb3(%3 : memref<2xf32>)
-^bb2:
- use(%0)
- %4 = bufferization.clone %0 : (memref<2xf32>) -> (memref<2xf32>)
- cf.br ^bb3(%4 : memref<2xf32>)
-^bb3(%2: memref<2xf32>):
- …
- memref.dealloc %2 : memref<2xf32> // free temp buffer %2
- memref.dealloc %0 : memref<2xf32> // %0 can be safely freed here
- return
-}
-```
-
-Note that a temporary buffer for %2 was introduced to free all allocations
-properly. Note further that the unnecessary allocation of %3 can be easily
-removed using one of the post-pass transformations or the canonicalization pass.
-
-The presented example also works with dynamically shaped types.
-
-BufferDeallocation performs a fix-point iteration taking all aliases of all
-tracked allocations into account. We initialize the general iteration process
-using all tracked allocations and their associated aliases. As soon as we
-encounter an alias that is not properly dominated by our allocation, we mark
-this alias as *critical* (needs to be freed and tracked by the internal
-fix-point iteration). The following sample demonstrates the presence of critical
-and non-critical aliases:
-
-
-
-```mlir
-func.func @condBranchDynamicTypeNested(
- %arg0: i1,
- %arg1: memref<?xf32>, // aliases: %3, %4
- %arg2: memref<?xf32>,
- %arg3: index) {
- cf.cond_br %arg0, ^bb1, ^bb2(%arg3: index)
-^bb1:
- cf.br ^bb6(%arg1 : memref<?xf32>)
-^bb2(%0: index):
- %1 = memref.alloc(%0) : memref<?xf32> // cannot be moved upwards due to the data
- // dependency to %0
- // aliases: %2, %3, %4
- use(%1)
- cf.cond_br %arg0, ^bb3, ^bb4
-^bb3:
- cf.br ^bb5(%1 : memref<?xf32>)
-^bb4:
- cf.br ^bb5(%1 : memref<?xf32>)
-^bb5(%2: memref<?xf32>): // non-crit. alias of %1, since %1 dominates %2
- cf.br ^bb6(%2 : memref<?xf32>)
-^bb6(%3: memref<?xf32>): // crit. alias of %arg1 and %2 (in other words %1)
- cf.br ^bb7(%3 : memref<?xf32>)
-^bb7(%4: memref<?xf32>): // non-crit. alias of %3, since %3 dominates %4
- test.copy(%4, %arg2) : (memref<?xf32>, memref<?xf32>) -> ()
- return
-}
-```
-
-Applying BufferDeallocation yields the following output:
-
-
-
-```mlir
-func.func @condBranchDynamicTypeNested(
- %arg0: i1,
- %arg1: memref<?xf32>,
- %arg2: memref<?xf32>,
- %arg3: index) {
- cf.cond_br %arg0, ^bb1, ^bb2(%arg3 : index)
-^bb1:
- // temp buffer required due to alias %3
- %5 = bufferization.clone %arg1 : (memref<?xf32>) -> (memref<?xf32>)
- cf.br ^bb6(%5 : memref<?xf32>)
-^bb2(%0: index):
- %1 = memref.alloc(%0) : memref<?xf32>
- use(%1)
- cf.cond_br %arg0, ^bb3, ^bb4
-^bb3:
- cf.br ^bb5(%1 : memref<?xf32>)
-^bb4:
- cf.br ^bb5(%1 : memref<?xf32>)
-^bb5(%2: memref<?xf32>):
- %6 = bufferization.clone %1 : (memref<?xf32>) -> (memref<?xf32>)
- memref.dealloc %1 : memref<?xf32>
- cf.br ^bb6(%6 : memref<?xf32>)
-^bb6(%3: memref<?xf32>):
- cf.br ^bb7(%3 : memref<?xf32>)
-^bb7(%4: memref<?xf32>):
- test.copy(%4, %arg2) : (memref<?xf32>, memref<?xf32>) -> ()
- memref.dealloc %3 : memref<?xf32> // free %3, since %4 is a non-crit. alias of %3
- return
-}
-```
-
-Since %3 is a critical alias, BufferDeallocation introduces an additional
-temporary copy in all predecessor blocks. %3 has an additional (non-critical)
-alias %4 that extends the live range until the end of bb7. Therefore, we can
-free %3 after its last use, while taking all aliases into account. Note that %4
-does not need to be freed, since we did not introduce a copy for it.
-
-The actual introduction of buffer copies is done after the fix-point iteration
-has been terminated and all critical aliases have been detected. A critical
-alias can be either a block argument or another value that is returned by an
-operation. Copies for block arguments are handled by analyzing all predecessor
-blocks. This is primarily done by querying the `BranchOpInterface` of the
-associated branch terminators that can jump to the current block. Consider the
-following example which involves a simple branch and the critical block argument
-%2:
-
-```mlir
- custom.br ^bb1(..., %0, : ...)
- ...
- custom.br ^bb1(..., %1, : ...)
- ...
-^bb1(%2: memref<2xf32>):
- ...
-```
-
-The `BranchOpInterface` allows us to determine the actual values that will be
-passed to block bb1 and its argument %2 by analyzing its predecessor blocks.
-Once we have resolved the values %0 and %1 (that are associated with %2 in this
-sample), we can introduce a temporary buffer and clone its contents into the new
-buffer. Afterwards, we rewire the branch operands to use the newly allocated
-buffer instead. However, blocks can have implicitly defined predecessors by
-parent ops that implement the `RegionBranchOpInterface`. This can be the case if
-this block argument belongs to the entry block of a region. In this setting, we
-have to identify all predecessor regions defined by the parent operation. For
-every region, we need to get all terminator operations implementing the
-`ReturnLike` trait, indicating that they can branch to our current block.
-Finally, we can use a similar functionality as described above to add the
-temporary copy. This time, we can modify the terminator operands directly
-without touching a high-level interface.
-
-Consider the following inner-region control-flow sample that uses an imaginary
-“custom.region_if” operation. It either executes the “then” or “else” region and
-always continues to the “join” region. The “custom.region_if_yield” operation
-returns a result to the parent operation. This sample demonstrates the use of
-the `RegionBranchOpInterface` to determine predecessors in order to infer the
-high-level control flow:
-
-```mlir
-func.func @inner_region_control_flow(
- %arg0 : index,
- %arg1 : index) -> memref<?x?xf32> {
- %0 = memref.alloc(%arg0, %arg0) : memref<?x?xf32>
- %1 = custom.region_if %0 : memref<?x?xf32> -> (memref<?x?xf32>)
- then(%arg2 : memref<?x?xf32>) { // aliases: %arg4, %1
- custom.region_if_yield %arg2 : memref<?x?xf32>
- } else(%arg3 : memref<?x?xf32>) { // aliases: %arg4, %1
- custom.region_if_yield %arg3 : memref<?x?xf32>
- } join(%arg4 : memref<?x?xf32>) { // aliases: %1
- custom.region_if_yield %arg4 : memref<?x?xf32>
- }
- return %1 : memref<?x?xf32>
-}
-```
-
-
-
-Non-block arguments (other values) can become aliases when they are returned by
-dialect-specific operations. BufferDeallocation supports this behavior via the
-`RegionBranchOpInterface`. Consider the following example that uses an “scf.if”
-operation to determine the value of %2 at runtime which creates an alias:
-
-```mlir
-func.func @nested_region_control_flow(%arg0 : index, %arg1 : index) -> memref<?x?xf32> {
- %0 = arith.cmpi "eq", %arg0, %arg1 : index
- %1 = memref.alloc(%arg0, %arg0) : memref<?x?xf32>
- %2 = scf.if %0 -> (memref<?x?xf32>) {
- scf.yield %1 : memref<?x?xf32> // %2 will be an alias of %1
- } else {
- %3 = memref.alloc(%arg0, %arg1) : memref<?x?xf32> // nested allocation in a div.
- // branch
- use(%3)
- scf.yield %1 : memref<?x?xf32> // %2 will be an alias of %1
- }
- return %2 : memref<?x?xf32>
-}
-```
-
-In this example, a dealloc is inserted to release the buffer within the else
-block since it cannot be accessed by the remainder of the program. Accessing the
-`RegionBranchOpInterface`, allows us to infer that %2 is a non-critical alias of
-%1 which does not need to be tracked.
-
-```mlir
-func.func @nested_region_control_flow(%arg0: index, %arg1: index) -> memref<?x?xf32> {
- %0 = arith.cmpi "eq", %arg0, %arg1 : index
- %1 = memref.alloc(%arg0, %arg0) : memref<?x?xf32>
- %2 = scf.if %0 -> (memref<?x?xf32>) {
- scf.yield %1 : memref<?x?xf32>
- } else {
- %3 = memref.alloc(%arg0, %arg1) : memref<?x?xf32>
- use(%3)
- memref.dealloc %3 : memref<?x?xf32> // %3 can be safely freed here
- scf.yield %1 : memref<?x?xf32>
- }
- return %2 : memref<?x?xf32>
-}
-```
-
-Analogous to the previous case, we have to detect all terminator operations in
-all attached regions of “scf.if” that provides a value to its parent operation
-(in this sample via scf.yield). Querying the `RegionBranchOpInterface` allows us
-to determine the regions that “return” a result to their parent operation. Like
-before, we have to update all `ReturnLike` terminators as described above.
-Reconsider a slightly adapted version of the “custom.region_if” example from
-above that uses a nested allocation:
-
-```mlir
-func.func @inner_region_control_flow_div(
- %arg0 : index,
- %arg1 : index) -> memref<?x?xf32> {
- %0 = memref.alloc(%arg0, %arg0) : memref<?x?xf32>
- %1 = custom.region_if %0 : memref<?x?xf32> -> (memref<?x?xf32>)
- then(%arg2 : memref<?x?xf32>) { // aliases: %arg4, %1
- custom.region_if_yield %arg2 : memref<?x?xf32>
- } else(%arg3 : memref<?x?xf32>) {
- %2 = memref.alloc(%arg0, %arg1) : memref<?x?xf32> // aliases: %arg4, %1
- custom.region_if_yield %2 : memref<?x?xf32>
- } join(%arg4 : memref<?x?xf32>) { // aliases: %1
- custom.region_if_yield %arg4 : memref<?x?xf32>
- }
- return %1 : memref<?x?xf32>
-}
-```
-
-Since the allocation %2 happens in a divergent branch and cannot be safely
-deallocated in a post-dominator, %arg4 will be considered a critical alias.
-Furthermore, %arg4 is returned to its parent operation and has an alias %1. This
-causes BufferDeallocation to introduce additional copies:
-
-```mlir
-func.func @inner_region_control_flow_div(
- %arg0 : index,
- %arg1 : index) -> memref<?x?xf32> {
- %0 = memref.alloc(%arg0, %arg0) : memref<?x?xf32>
- %1 = custom.region_if %0 : memref<?x?xf32> -> (memref<?x?xf32>)
- then(%arg2 : memref<?x?xf32>) {
- %4 = bufferization.clone %arg2 : (memref<?x?xf32>) -> (memref<?x?xf32>)
- custom.region_if_yield %4 : memref<?x?xf32>
- } else(%arg3 : memref<?x?xf32>) {
- %2 = memref.alloc(%arg0, %arg1) : memref<?x?xf32>
- %5 = bufferization.clone %2 : (memref<?x?xf32>) -> (memref<?x?xf32>)
- memref.dealloc %2 : memref<?x?xf32>
- custom.region_if_yield %5 : memref<?x?xf32>
- } join(%arg4: memref<?x?xf32>) {
- %4 = bufferization.clone %arg4 : (memref<?x?xf32>) -> (memref<?x?xf32>)
- memref.dealloc %arg4 : memref<?x?xf32>
- custom.region_if_yield %4 : memref<?x?xf32>
- }
- memref.dealloc %0 : memref<?x?xf32> // %0 can be s...
[truncated]
``````````
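
The deleted internals document in the diff above marks a block argument as a *critical* alias when it is not dominated by the value it aliases. The following is a minimal sketch of that rule in Python, purely for illustration. It hard-codes the CFG of the `@condBranchDynamicTypeNested` example. The helper names (`dominators`, `is_critical`) and the per-argument reading of the rule (an argument is critical if any incoming value is defined in a block that does not dominate the argument's block) are assumptions. This is not the removed pass's implementation.

```python
def dominators(successors, entry):
    """Iterative dataflow: dom[b] is the set of blocks that dominate block b."""
    blocks = list(successors)
    preds = {b: [] for b in blocks}
    for block, succs in successors.items():
        for succ in succs:
            preds[succ].append(block)

    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            new = set(blocks)
            for p in preds[b]:
                new &= dom[p]
            new.add(b)
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


# CFG of @condBranchDynamicTypeNested from the deleted document.
cfg = {
    "bb0": ["bb1", "bb2"], "bb1": ["bb6"], "bb2": ["bb3", "bb4"],
    "bb3": ["bb5"], "bb4": ["bb5"], "bb5": ["bb6"], "bb6": ["bb7"], "bb7": [],
}
# Block in which each value is defined (function arguments live in bb0).
def_block = {"%arg1": "bb0", "%1": "bb2", "%2": "bb5", "%3": "bb6", "%4": "bb7"}
# Values forwarded to each block argument by the predecessor branches.
incoming = {"%2": ["%1", "%1"], "%3": ["%arg1", "%2"], "%4": ["%3"]}

dom = dominators(cfg, "bb0")


def is_critical(arg):
    """Critical if some incoming value's block does not dominate arg's block."""
    return any(def_block[v] not in dom[def_block[arg]] for v in incoming[arg])


result = {arg: is_critical(arg) for arg in incoming}
print(result)  # {'%2': False, '%3': True, '%4': False}
```

The result agrees with the comments in the example: only `%3` in `bb6` is critical, because `bb5` (which defines `%2`) does not dominate `bb6`.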
</details>
https://github.com/llvm/llvm-project/pull/126366