[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors (PR #108596)

Mon Sep 16 10:41:27 PDT 2024

alex-t wrote:

Honestly, I don't like the way this change affects code generation quality. I would never have proposed it if we had another way to achieve correctness. Currently, we have a compiler that either produces incorrect code or fails to compile valid input.
As for a long-term solution, I would look to avoid any code insertion after the CF pseudos are lowered to exec mask manipulation. The LLVM register allocation framework was designed for CPU targets and assumes a CF model based on branches. The spill/split placement mechanisms make their decisions using edge bundles as the boundaries for in/out values. In our case, for flow blocks, the input boundary is in fact inside the block, right after the instruction that restores the exec mask. So it seems we are going to keep running into more and more trouble with spill/split insertion until we move this logic out of there.
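
To make the boundary problem concrete, here is a toy model, not actual backend code. It assumes a 4-lane wave and a vector reload that writes only the lanes enabled in exec. Every name in it (`masked_reload`, `run_flow_block`, the lane values) is hypothetical and for illustration only.

```python
# Toy model of a 4-lane wave. All names and values are hypothetical;
# this only illustrates why the reload position inside a flow block matters.
LANES = 4


def masked_reload(vgpr, stack_slot, exec_mask):
    """Reload a spilled vector value: only lanes enabled in exec are written."""
    return [stack_slot[i] if exec_mask[i] else vgpr[i] for i in range(LANES)]


def run_flow_block(reload_before_restore):
    spilled = [10, 11, 12, 13]                # value spilled while all lanes were active
    vgpr = [0, 0, 0, 0]                       # register was reused; contents are stale
    exec_mask = [True, True, False, False]    # exec on entry: only some lanes active
    saved_mask = [True, True, True, True]     # mask saved before the divergent branch

    if reload_before_restore:
        # Reload placed at the block top, i.e. at the edge-bundle boundary.
        vgpr = masked_reload(vgpr, spilled, exec_mask)
        exec_mask = [a or b for a, b in zip(exec_mask, saved_mask)]
    else:
        # Reload placed after the exec restore, i.e. at the real input boundary.
        exec_mask = [a or b for a, b in zip(exec_mask, saved_mask)]
        vgpr = masked_reload(vgpr, spilled, exec_mask)
    return vgpr


if __name__ == "__main__":
    print(run_flow_block(reload_before_restore=True))
    print(run_flow_block(reload_before_restore=False))
```

In this model, the reload placed at the block top leaves the re-enabled lanes with stale contents, while the reload placed after the exec restore fills every lane.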

https://github.com/llvm/llvm-project/pull/108596