[PATCH] D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask.

Michael Liao via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 29 09:56:06 PDT 2021


hliao added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIAvoidZeroExecMask.cpp:220
+  RS->enterBasicBlockEnd(MBB);
+  unsigned Scav = RS->scavengeRegisterBackwards(
+      AMDGPU::SReg_64RegClass, MachineBasicBlock::iterator(Cloned), false, 0);
----------------
foad wrote:
> What happens if there are no free sgprs so the scavenger has to spill something? That sounds like yet another case that won't work correctly when exec is zero.
Yeah, that's true. If we had a null-like register pair, we could skip the search for that temporary SGPR altogether. We do have a null register definition, but the wave64 case needs an SGPR pair. Since we only duplicate that evaluation to update SCC and its result value is never read, a null register pair would be sufficient as the destination.
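
To make the point about SCC concrete, here is a minimal toy model (plain C++, not LLVM code). It assumes the duplicated evaluation behaves like a 64-bit scalar AND that sets SCC when the result is non-zero; the names `scalarAnd64` and `recomputeSCC` are hypothetical and exist only for illustration. Passing a null destination stands in for the null register pair discussed above.

```cpp
#include <cstdint>

// Toy model only: none of these names exist in LLVM.
struct AndResult {
  uint64_t Value; // what would be written to the destination SGPR pair
  bool SCC;       // scalar condition code: set when the result is non-zero
};

// Models a 64-bit scalar AND that also updates SCC (assumed instruction shape).
AndResult scalarAnd64(uint64_t A, uint64_t B) {
  uint64_t V = A & B;
  return {V, V != 0};
}

// Re-evaluates the AND purely for its SCC side effect.
// Dst == nullptr models a null register pair: the value is discarded,
// so no temporary register has to be found (or spilled) to hold it.
bool recomputeSCC(uint64_t A, uint64_t B, uint64_t *Dst) {
  AndResult R = scalarAnd64(A, B);
  if (Dst)
    *Dst = R.Value;
  return R.SCC;
}
```

In this sketch the SCC result is the same whether or not a destination is supplied, which is why a discardable destination would remove the need for the scavenger in this spot.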


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99507/new/

https://reviews.llvm.org/D99507


