[PATCH] D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask.

Mon Mar 29 09:37:30 PDT 2021

foad added a comment.

In general I am uncomfortable with generating code that does not work (i.e. expanding spills the way we do when exec might be 0) and then running yet another pass, required for correctness, to fix it up later. Is there a way to make the expansion correct by default and then, if necessary, run an extra pass that only optimizes it for efficiency?

Anyway I am not an expert in this area. I am happy to be overruled by people who know more about it.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:1231
   addPass(&BranchRelaxationPassID);
+  addPass(&SIAvoidZeroExecMaskID);
 }
----------------
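I don't think you can put anything that inserts extra instructions after BranchRelaxation, because by then branch distances have already been checked and new instructions could push a branch out of range. Putting it after the hazard recognizer might be risky too, since newly inserted instructions would not be checked for hazards. Am I right in thinking that you just want to run this after spills are lowered to real code, i.e. after prologue-epilogue insertion?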
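
The ordering concern about BranchRelaxation can be illustrated with a toy model. This is only a sketch and not LLVM code: the types and functions below (`BlockSizes`, `Branch`, `branchDistance`, `fitsInRange`, `insertInstructions`) are hypothetical names, and the block sizes and branch range used with them are made-up numbers rather than real AMDGPU encoding limits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model, not LLVM code. Each basic block is reduced to its size in bytes.
using BlockSizes = std::vector<std::int64_t>;

// Hypothetical forward branch from the end of block `from` to the start of
// block `to`.
struct Branch {
  std::size_t from;
  std::size_t to;
};

// Bytes the branch has to skip: the sizes of all blocks strictly between
// `from` and `to`.
std::int64_t branchDistance(const BlockSizes &sizes, const Branch &br) {
  std::int64_t distance = 0;
  for (std::size_t i = br.from + 1; i < br.to; ++i)
    distance += sizes[i];
  return distance;
}

// The property a relaxation-style check establishes: the offset fits the
// (assumed) maximum range of the branch encoding.
bool fitsInRange(const BlockSizes &sizes, const Branch &br,
                 std::int64_t maxRange) {
  return branchDistance(sizes, br) <= maxRange;
}

// Stand-in for any later pass that adds `extraBytes` of code to one block.
void insertInstructions(BlockSizes &sizes, std::size_t block,
                        std::int64_t extraBytes) {
  sizes[block] += extraBytes;
}
```

With block sizes {8, 120, 8}, a branch from block 0 to block 2, and an assumed range of 124 bytes, `fitsInRange` holds at the point where relaxation would run. If a later pass then calls `insertInstructions(sizes, 1, 8)`, the distance grows to 128 bytes and the branch no longer fits, but nothing runs afterwards to notice.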

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99507/new/

https://reviews.llvm.org/D99507