[Mlir-commits] [mlir] Enable LICM for ops with only read side effects in scf.for (PR #120302)
Arda Unal
llvmlistbot at llvm.org
Mon Jan 6 16:26:47 PST 2025
ardaunal wrote:
I changed the approach as we discussed in [Speculative LICM?](https://discourse.llvm.org/t/speculative-licm/80977).
What is different:
- The loop is no longer wrapped in a guard.
- Ops with only read side effects are hoisted under a guard. The else branch of that guard yields a **ub.poison** op with the same result type(s) as the hoisted op.
- Pure ops are hoisted without a guard, unless some op was already hoisted with a guard; in that case the pure op is hoisted with a guard as well. This is needed to avoid interleaving branches such as:
```mlir
module {
  func.func @test_speculatable_op_with_read_side_effect_success_with_dependents(%arg0: index, %arg1: index, %arg2: index) -> i32 {
    %c0_i32 = arith.constant 0 : i32
    %cst = arith.constant dense<42> : tensor<64xi32>
    %c42 = arith.constant 42 : index
    %0 = "test.always_speculatable_op"() : () -> i32
    %1 = arith.cmpi ult, %arg0, %arg1 : index
    %2 = scf.if %1 -> (i32) {
      %8 = "test.speculatable_op_with_memread"(%cst, %c42) : (tensor<64xi32>, index) -> i32
      scf.yield %8 : i32
    } else {
      %8 = ub.poison : i32
      scf.yield %8 : i32
    }
    %3 = arith.addi %0, %2 : i32
    %4 = arith.cmpi ult, %arg0, %arg1 : index
    %5 = scf.if %4 -> (i32) {
      %8 = "test.speculatable_op_with_memread"(%cst, %c42) : (tensor<64xi32>, index) -> i32
      scf.yield %8 : i32
    } else {
      %8 = ub.poison : i32
      scf.yield %8 : i32
    }
    %6 = arith.addi %3, %5 : i32
    %7 = scf.for %arg3 = %arg0 to %arg1 step %arg2 iter_args(%arg4 = %c0_i32) -> (i32) {
      %8 = arith.index_cast %arg3 : index to i32
      %9 = arith.addi %6, %8 : i32
      scf.yield %9 : i32
    }
    return %7 : i32
  }
}
```
so that CSE and the canonicalizer can do their job and produce the following instead:
```mlir
module {
  func.func @test_speculatable_op_with_read_side_effect_success_with_dependents(%arg0: index, %arg1: index, %arg2: index) -> i32 {
    %0 = ub.poison : i32
    %c0_i32 = arith.constant 0 : i32
    %cst = arith.constant dense<42> : tensor<64xi32>
    %c42 = arith.constant 42 : index
    %1 = "test.always_speculatable_op"() : () -> i32
    %2 = arith.cmpi ult, %arg0, %arg1 : index
    %3 = scf.if %2 -> (i32) {
      %5 = "test.speculatable_op_with_memread"(%cst, %c42) : (tensor<64xi32>, index) -> i32
      %6 = arith.addi %1, %5 : i32
      %7 = "test.speculatable_op_with_memread"(%cst, %c42) : (tensor<64xi32>, index) -> i32
      %8 = arith.addi %6, %7 : i32
      scf.yield %8 : i32
    } else {
      scf.yield %0 : i32
    }
    %4 = scf.for %arg3 = %arg0 to %arg1 step %arg2 iter_args(%arg4 = %c0_i32) -> (i32) {
      %5 = arith.index_cast %arg3 : index to i32
      %6 = arith.addi %3, %5 : i32
      scf.yield %6 : i32
    }
    return %4 : i32
  }
}
```
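For reference, the second form is roughly what running CSE followed by the canonicalizer (e.g. `mlir-opt -cse -canonicalize`) produces from the first: once CSE deduplicates the repeated `arith.cmpi`, the two adjacent `scf.if` ops share a condition and the canonicalizer can merge them into one guard.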
- There is only one new interface function, `moveOutOfLoopWithGuard`, which is implemented by **scf.for** for now. The implementation for **affine.for** should be similar.
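
Since this comment only names the hook, here is a minimal C++ sketch of how an `scf.for` implementation of such a guard-building method could look. The signature, the free-function form, and the name `moveOutOfLoopWithGuardSketch` are assumptions for illustration; they are not the code in the PR.

```cpp
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/Dialect/SCF/IR/SCF.h"
#include "mlir/Dialect/UB/IR/UBOps.h"
#include "mlir/IR/Builders.h"
#include "llvm/ADT/STLExtras.h"

using namespace mlir;

// Illustrative only: hoists `op` (already known to be loop-invariant and to
// have only read side effects) out of `forOp`, guarded by the loop's
// "executes at least once" condition.
static void moveOutOfLoopWithGuardSketch(scf::ForOp forOp, Operation *op) {
  OpBuilder builder(forOp);
  Location loc = op->getLoc();

  // An scf.for body runs at least once iff lb < ub, so this comparison is a
  // safe guard under which the read-only op may execute.
  Value guard = builder.create<arith::CmpIOp>(loc, arith::CmpIPredicate::ult,
                                              forOp.getLowerBound(),
                                              forOp.getUpperBound());

  // then: yield the hoisted op's results; else: yield ub.poison values of the
  // same types. The scf.if result types are inferred from the yields. The
  // then-yield transiently references `op` while it still sits in the loop;
  // the moveBefore below restores dominance.
  auto ifOp = builder.create<scf::IfOp>(
      loc, guard,
      /*thenBuilder=*/
      [&](OpBuilder &b, Location l) {
        b.create<scf::YieldOp>(l, op->getResults());
      },
      /*elseBuilder=*/
      [&](OpBuilder &b, Location l) {
        SmallVector<Value> poisons;
        for (Type t : op->getResultTypes())
          poisons.push_back(b.create<ub::PoisonOp>(l, t));
        b.create<scf::YieldOp>(l, poisons);
      });

  // Move the op in front of the then-branch yield, then redirect every use
  // outside the guard (e.g. those left in the loop body) to the scf.if
  // results.
  op->moveBefore(ifOp.thenYield());
  for (auto [oldRes, newRes] : llvm::zip(op->getResults(), ifOp.getResults()))
    oldRes.replaceUsesWithIf(newRes, [&](OpOperand &use) {
      return !ifOp->isAncestor(use.getOwner());
    });
}
```

The poison yields in the else branch are what make the hoist sound: the guarded value is only meaningful when the loop body would have executed at least once, which is exactly when its results are used.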
https://github.com/llvm/llvm-project/pull/120302