[PATCH] D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 30 06:30:57 PDT 2021


arsenm added a comment.

In D99507#2658404 <https://reviews.llvm.org/D99507#2658404>, @hliao wrote:

> It seems to me that we may need to revise CFG lowering to avoid updating EXEC directly and later revise it based on whether the restoring mask needs reloading or not. Here is a brief outline of what I have in mind:
>
> - Instead of lowering CFG early before RA, lower it after RA. As a byproduct, it also removes the need for the "terminator" versions of the exec mask manipulation instructions.
> - When CFG is being lowered, it could update EXEC eagerly if the merge point doesn't need to reload the mask; otherwise, it just needs to translate as we currently do.
>
> Any suggestions and comments?

I think this requires a lot more thought, and I think we need deeper IR changes. I believe MachineBasicBlock needs to start tracking both uniform and divergent predecessors/successors, but I'm not sure what this would end up looking like. I think the terminators should still reflect the hardware instructions, and the exec issues would be tracked through the divergent pred/succ.
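
As a rough illustration only, the "uniform and divergent predecessors/successors" idea could be pictured as a block that keeps two separate edge lists per direction. Nothing below comes from the patch or from LLVM's actual MachineBasicBlock; `Block`, `EdgeKind`, `addEdge`, and `hasDivergentPred` are hypothetical names invented for this sketch, and it says nothing about how terminators or EXEC updates would be represented.

```cpp
#include <string>
#include <vector>

// Hypothetical edge classification; not an LLVM type.
enum class EdgeKind { Uniform, Divergent };

// Hypothetical stand-in for a machine basic block that tracks uniform and
// divergent edges separately. This is NOT llvm::MachineBasicBlock.
struct Block {
  std::string Name;
  std::vector<Block *> UniformSuccs, DivergentSuccs;
  std::vector<Block *> UniformPreds, DivergentPreds;
};

// Record an edge in both directions, keeping the two kinds apart.
inline void addEdge(Block &From, Block &To, EdgeKind Kind) {
  if (Kind == EdgeKind::Uniform) {
    From.UniformSuccs.push_back(&To);
    To.UniformPreds.push_back(&From);
  } else {
    From.DivergentSuccs.push_back(&To);
    To.DivergentPreds.push_back(&From);
  }
}

// Query a later pass might ask: is this block reached along any divergent edge?
inline bool hasDivergentPred(const Block &B) {
  return !B.DivergentPreds.empty();
}
```

For a simple if-then shape, one would add divergent edges from the entry block to both the "then" block and the join block, plus a uniform edge from "then" to the join; the join block would then report a divergent predecessor while the entry block would not. Whether this split is the right granularity is exactly the open question.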


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99507/new/

https://reviews.llvm.org/D99507


