[Mlir-commits] [mlir] [mlir][memref] Add HoistCastPos pattern to castOp (PR #168337)
lonely eagle
llvmlistbot at llvm.org
Tue Nov 18 06:24:42 PST 2025
linuxlonelyeagle wrote:
> We could consider that "cast are canonicalized to be closest to their definition" or something like that, but:
>
> 1. I'm not entirely sure about this one: what makes cast "special" with this property? Why not other ops? This likely requires more opinions here.
You are right. This issue has evolved from the original problem into "How to use the CSE pass more efficiently?".
https://discourse.llvm.org/t/will-ops-without-side-effects-be-reordered-when-running-the-pass/85222/
As we discussed earlier, if an op is a Pure op, we have the opportunity to hoist it.
* How to use the CSE pass more efficiently?
CSE does not work on the following code, because the two identical casts sit in sibling blocks and neither dominates the other.
```
func.func @hoist_cast_pos(%arg: memref<10xf32>, %arg1: i1) -> (memref<?xf32>) {
cf.cond_br %arg1, ^bb1, ^bb2
^bb1:
%cast = memref.cast %arg : memref<10xf32> to memref<?xf32>
return %cast : memref<?xf32>
^bb2:
%cast1 = memref.cast %arg : memref<10xf32> to memref<?xf32>
return %cast1 : memref<?xf32>
}
```
CSE does work once both casts are hoisted into the entry block.
```
func.func @hoist_cast_pos(%arg: memref<10xf32>, %arg1: i1) -> (memref<?xf32>) {
%cast = memref.cast %arg : memref<10xf32> to memref<?xf32>
%cast1 = memref.cast %arg : memref<10xf32> to memref<?xf32>
cf.cond_br %arg1, ^bb1, ^bb2
^bb1:
return %cast : memref<?xf32>
^bb2:
return %cast1 : memref<?xf32>
}
// Output after running: mlir-opt hoist_cast_pos.mlir -cse
func.func @hoist_cast_pos(%arg: memref<10xf32>, %arg1: i1) -> (memref<?xf32>) {
%cast = memref.cast %arg : memref<10xf32> to memref<?xf32>
cf.cond_br %arg1, ^bb1, ^bb2
^bb1:
return %cast : memref<?xf32>
^bb2:
return %cast : memref<?xf32>
}
```
We could implement a more generic pass, based on SSA dominance, that hoists Pure ops. To be perfectly honest, I'm not entirely sure how difficult that would be. However, I'd be quite happy to make it happen. We do need to gather more people's suggestions on this issue.
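
To make the idea concrete, here is a rough toy model in Python. It is not MLIR code and not what this PR implements. The `Op` class, `hoist_pure_ops`, `cse_block`, and `run_example` are all hypothetical names made up for illustration. The model assumes the first block is the entry block, that the entry block dominates every other block, and that every block ends with a terminator. It only hoists pure ops whose operands are already available in the entry block. A real pass would have to consult actual dominance information.

```python
from dataclasses import dataclass, replace


@dataclass
class Op:
    result: str            # SSA name defined by the op ("" for terminators)
    name: str              # e.g. "memref.cast"
    operands: tuple        # SSA names used by the op
    result_type: str = ""  # printed result type, part of the CSE key
    pure: bool = False     # toy stand-in for MLIR's Pure trait


def hoist_pure_ops(entry_args, blocks):
    """Move pure ops into the entry block when all operands are available there.

    `blocks` maps block name -> list of Op; the first key is the entry block
    (an assumption of this toy model).
    """
    names = list(blocks)
    entry = names[0]
    available = set(entry_args) | {op.result for op in blocks[entry] if op.result}
    for block_name in names[1:]:
        kept = []
        for op in blocks[block_name]:
            if op.pure and all(o in available for o in op.operands):
                # Insert just before the entry block's terminator.
                blocks[entry].insert(len(blocks[entry]) - 1, op)
                available.add(op.result)
            else:
                kept.append(op)
        blocks[block_name] = kept
    return blocks


def cse_block(ops):
    """Deduplicate pure ops within one block; return (new ops, rename map)."""
    seen, rename, out = {}, {}, []
    for op in ops:
        operands = tuple(rename.get(o, o) for o in op.operands)
        key = (op.name, operands, op.result_type)
        if op.pure and key in seen:
            rename[op.result] = seen[key]  # reuse the earlier result
            continue
        if op.pure:
            seen[key] = op.result
        out.append(replace(op, operands=operands))
    return out, rename


def run_example():
    # Toy encoding of @hoist_cast_pos from the first snippet above.
    blocks = {
        "entry": [Op("", "cf.cond_br", ("%arg1",))],
        "bb1": [Op("%cast", "memref.cast", ("%arg",), "memref<?xf32>", True),
                Op("", "return", ("%cast",))],
        "bb2": [Op("%cast1", "memref.cast", ("%arg",), "memref<?xf32>", True),
                Op("", "return", ("%cast1",))],
    }
    hoist_pure_ops(["%arg", "%arg1"], blocks)
    blocks["entry"], rename = cse_block(blocks["entry"])
    # Propagate the renames into the remaining blocks.
    for block_name in ("bb1", "bb2"):
        blocks[block_name] = [
            replace(op, operands=tuple(rename.get(o, o) for o in op.operands))
            for op in blocks[block_name]
        ]
    return blocks
```

In this toy run, both casts are moved into the entry block, the duplicate is removed, and `^bb2` returns `%cast`. That matches the CSE output shown above.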
> 2. You're only achieving this partially (only if in a different block).
For me, this implementation is already sufficient. "Closest to their definition" means the casts end up in the same block, so I can use CSE, right?
https://github.com/llvm/llvm-project/pull/168337