[PATCH] D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask.
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 29 09:37:30 PDT 2021
foad added a comment.
In general I am uncomfortable with generating code that does not work (i.e. expanding spills the way we do when exec might be 0) and then running yet another pass, needed for correctness, to fix it up later. Is there a way this can be made correct by default, with an extra pass run only if necessary to optimize it for efficiency?
Anyway I am not an expert in this area. I am happy to be overruled by people who know more about it.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:1231
addPass(&BranchRelaxationPassID);
+ addPass(&SIAvoidZeroExecMaskID);
}
----------------
I don't think you can put anything that inserts extra instructions after BranchRelaxation. Putting it after the hazard recognizer might be risky too. Am I right in thinking that you just want to run this after spills are lowered to real code (i.e. after prologue-epilogue insertion)?
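The ordering concern can be illustrated with a toy model. This is only a sketch and not LLVM code: `Block`, `forwardDistance`, `branchFits` and `maxReach` are hypothetical names invented for the illustration, and the byte sizes are made up. The idea is that a relaxation step decides whether each branch can reach its target based on the code layout at that moment, so any instruction inserted afterwards can push a target out of range without anything re-checking it.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Toy model only: each element is the encoded size, in bytes, of one
// instruction in a straight-line sequence.
using Block = std::vector<std::uint32_t>;

// Byte distance from the end of the branch at index `branch` to the start of
// the instruction at index `target`. Forward branches only (target > branch).
std::uint64_t forwardDistance(const Block &code, std::size_t branch,
                              std::size_t target) {
  return std::accumulate(code.begin() + branch + 1, code.begin() + target,
                         std::uint64_t{0});
}

// True if the branch can still reach its target, given a hypothetical maximum
// forward reach in bytes.
bool branchFits(const Block &code, std::size_t branch, std::size_t target,
                std::uint64_t maxReach) {
  return forwardDistance(code, branch, target) <= maxReach;
}
```

With four 4-byte instructions, a branch at index 0 targeting index 3 spans 8 bytes and fits a hypothetical 8-byte reach. Inserting one more 4-byte instruction between them moves the target to index 4 and the span to 12 bytes, so the earlier "fits" decision no longer holds.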
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99507/new/
https://reviews.llvm.org/D99507