[llvm] [StructurizeCFG] Order IF Else block using Heuristics (PR #139605)
via llvm-commits
llvm-commits at lists.llvm.org
Wed May 14 07:26:46 PDT 2025
ruiling wrote:
It feels like we are hitting a limitation in the register coalescer: it cannot coalesce optimally here, and that is hard to fix there. But reordering the blocks does not feel like the right way to solve the problem either. Consider what happens if you have two simultaneously live values, one modified in `then` and the other modified in `else`. Either order would help only one value and leave a copy for the other, right?
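To make the two-value case concrete, here is a minimal sketch in pre-structurization IR. The function `@two_values` and all value names in it are hypothetical and do not come from the PR; they only illustrate one value changing in `then` and another in `else`.

```llvm
; Hypothetical example: %p is modified only in %then, %q only in %else.
define void @two_values(i1 %cond, i32 %p.in, i32 %q.in, ptr %out) {
entry:
  br i1 %cond, label %then, label %else

then:                                   ; preds = %entry
  %p.then = add i32 %p.in, 1            ; only %p changes on this path
  br label %merge

else:                                   ; preds = %entry
  %q.else = add i32 %q.in, 1            ; only %q changes on this path
  br label %merge

merge:                                  ; preds = %else, %then
  %p = phi i32 [ %p.then, %then ], [ %p.in, %else ]
  %q = phi i32 [ %q.in, %then ], [ %q.else, %else ]
  %sum = add i32 %p, %q
  store i32 %sum, ptr %out
  ret void
}
```

Whichever of `then` and `else` is placed first after structurization, both `%p` and `%q` stay live across both blocks, so a block-ordering heuristic can favor at most one of them.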
If you look at the IR before structurization, there is only one value alive at any program point. But after structurization, we get something like:
```
entry:
  ...
  br i1 %cond, label %then, label %Flow

then:                                   ; preds = %entry
  %x = extractelement <4 x i32> %vec.load2, i32 0
  %z = add i32 %x, 1
  br label %Flow

Flow:                                   ; preds = %then, %entry
  %3 = phi i32 [ %z, %then ], [ poison, %entry ]
  %4 = phi i1 [ false, %then ], [ true, %entry ]
  br i1 %4, label %else, label %merge

else:                                   ; preds = %Flow
  %a = extractelement <4 x i32> %vec.load2, i32 1
  br label %merge

merge:                                  ; preds = %else, %Flow
  %phi = phi i32 [ %3, %Flow ], [ %a, %else ]
```
This is quite like the problem I solved in #101301, but more challenging. Can we try to simplify the phi by moving the extractelement into the dominator? Like:
```
entry:
  ...
  %a = extractelement <4 x i32> %vec.load2, i32 1
  br i1 %cond, label %then, label %Flow

then:                                   ; preds = %entry
  %x = extractelement <4 x i32> %vec.load2, i32 0
  %z = add i32 %x, 1
  br label %Flow

Flow:                                   ; preds = %then, %entry
  %3 = phi i32 [ %z, %then ], [ %a, %entry ]
  %4 = phi i1 [ false, %then ], [ true, %entry ]
  br i1 %4, label %else, label %merge

else:                                   ; preds = %Flow
  ; the extractelement was hoisted into the dominator.
  br label %merge

merge:                                  ; preds = %else, %Flow
  ; this phi is actually just %3; it would be better to avoid it, but leaving it does no harm.
  %phi = phi i32 [ %3, %Flow ], [ %3, %else ]
```
By doing this, we still have only one copy of the value alive in the post-structurization CFG.
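As a rough illustration of the "avoid the phi" remark, the hoisted form can be written as a complete function in which the trivial phi in `merge` is replaced by its single incoming value. This is only a sketch: the function name `@hoisted_no_phi`, its signature, and the names `%v` and `%take.else` are assumptions made for the example, not output of the pass.

```llvm
; Hypothetical, self-contained version of the hoisted form.
define i32 @hoisted_no_phi(i1 %cond, <4 x i32> %vec.load2) {
entry:
  ; hoisted from %else into the dominating block
  %a = extractelement <4 x i32> %vec.load2, i32 1
  br i1 %cond, label %then, label %Flow

then:                                   ; preds = %entry
  %x = extractelement <4 x i32> %vec.load2, i32 0
  %z = add i32 %x, 1
  br label %Flow

Flow:                                   ; preds = %then, %entry
  %v = phi i32 [ %z, %then ], [ %a, %entry ]
  %take.else = phi i1 [ false, %then ], [ true, %entry ]
  br i1 %take.else, label %else, label %merge

else:                                   ; preds = %Flow
  br label %merge

merge:                                  ; preds = %else, %Flow
  ; both incoming values of the old phi were %v, so %v is used directly
  ret i32 %v
}
```

Here `%v` is the only merged value that reaches `merge`, which matches the single live value in the original, unstructurized IR.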
If we allocated VGPRs on the thread CFG, this would be trivially solved, but that is a large project.
https://github.com/llvm/llvm-project/pull/139605