[PATCH] D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask.
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 29 09:58:57 PDT 2021
rampitec added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIAvoidZeroExecMask.cpp:220
+ RS->enterBasicBlockEnd(MBB);
+ unsigned Scav = RS->scavengeRegisterBackwards(
+ AMDGPU::SReg_64RegClass, MachineBasicBlock::iterator(Cloned), false, 0);
----------------
hliao wrote:
> foad wrote:
> > What happens if there are no free sgprs so the scavenger has to spill something? That sounds like yet another case that won't work correctly when exec is zero.
> Yeah, that's true. If we have a null-like reg pair, we could save the searching for that tmp SGPR as well. We have a null reg definition but we need an SGPR pair for the WAVE64 case. As we only duplicate that evaluation to update SCC, a null reg-pair would be sufficient.
NULL works for 64 bit pair too. Although it is not available on every target.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99507/new/
https://reviews.llvm.org/D99507
More information about the llvm-commits
mailing list