[PATCH] D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3.
Carl Ritson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Mar 20 19:54:47 PDT 2022
critson accepted this revision.
critson added a comment.
This revision is now accepted and ready to land.
LGTM, with a few minor nits.
Also please double check all comments in the code are up to date, as there has been a lot revisions.
Thanks!
================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:298
+// condition based on the current MachineInstr, for instance an target
+// instruction. Breaks prematurely by returning nullptr if DisallowDefBetween is
+// true and one of the registers given in NonModifiableRegs is modified by the
----------------
Comment mentions code that no longer exists `DisallowDefBetween`?
================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:653
+ // to reduce pipeline stalls.
+ if (ST.hasGFX10_3Insts()) {
+ DenseMap<MachineInstr *, MachineInstr *> SaveExecVCmpMapping;
----------------
Could this be more future proof?
e.g.
`if (AMDGPU::isGFX10Plus(ST) && !ST.hasVcmpxExecWARHazard())`
================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:659
+ for (MachineBasicBlock &MBB : MF) {
+ for (MachineInstr &MI : MBB) {
+ // Try to record existing s_and_saveexec instructions, iff
----------------
It's unfortunate that we have to scan all instructions to find these.
Normally they should be within an instruction or two of the terminators.
However, I can see that the waterfall code violates the principle of EXEC manipulation at the end of block right now, so we don't have a choice.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119696/new/
https://reviews.llvm.org/D119696
More information about the llvm-commits
mailing list