[PATCH] D149651: [UnreachableBlockElim] Don't remove LCSSA phi nodes

Fri May 5 00:30:18 PDT 2023

ruiling added a comment.

I guess you are observing code generation bug for AMDGPU? Is it replacing a phi like `%1:vgpr = phi %0:sgpr` with a `%1:vgpr = COPY %0:sgpr`? If that is the case, I think this does not sound like a root-fix.

If we have some MachineIR like below (bb1 is a self-loop):

   entry
    |
  [ bb1  bb2
     \    /
       bb3

The unreachable block `bb2` will  be eliminated and the phi in `bb3` will then be further simplified the same way as in the LCSSA case you are seeing.

I think the problem is for AMDGPU, we depends on the sgpr to vgpr copy lowered from phi should be in predecessor block, which is the way PHIElimination lowers phi. This is mainly because when the predecessor block is inside a loop, the COPY in predecessor block would be executed totally different from a COPY in the successor block.
For this specific issue, I think we can teach the pass here to insert COPY in the predecessor block as PHIElimination. This should not hurt other target as this is the standard way to lower phi and the COPY for such case would be coalesced away later.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149651/new/

https://reviews.llvm.org/D149651