[PATCH] D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask.
Carl Ritson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 29 17:15:32 PDT 2021
critson added a comment.
In D99507#2656780 <https://reviews.llvm.org/D99507#2656780>, @arsenm wrote:
> Instead of having a fixup patch to avoid cases where this happens, we should have the infrastructure to stop this from happening in the first place
This is my position as well.
================
Comment at: llvm/lib/Target/AMDGPU/SIAvoidZeroExecMask.cpp:201
+
+ Register Tmp = MRI.createVirtualRegister(&AMDGPU::SReg_64RegClass);
+ MachineInstr *Cloned = nullptr;
----------------
Presumably the register class here needs to depend on IsWave32 rather than being hard-coded to SReg_64.
================
Comment at: llvm/lib/Target/AMDGPU/SIAvoidZeroExecMask.cpp:221
+ unsigned Scav = RS->scavengeRegisterBackwards(
+ AMDGPU::SReg_64RegClass, MachineBasicBlock::iterator(Cloned), false, 0);
+ MRI.replaceRegWith(Tmp, Scav);
----------------
This also needs to depend on IsWave32.
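To make the two inline comments concrete, here is a rough standalone sketch of the idea, not the patch's actual code. The enum and the helpers maskRegClassFor, maskWidthInBits and planMaskRegs are hypothetical stand-ins invented for illustration; in the pass itself the choice would be between the real AMDGPU::SReg_32RegClass and AMDGPU::SReg_64RegClass objects, keyed off however the pass learns the wave size. The point is only that the virtual register and the scavenged register should take their class from one wave-size-aware query instead of two hard-coded SReg_64 references.

```cpp
#include <cassert>

// Stand-ins for the AMDGPU register classes named in the patch. In LLVM these
// are TargetRegisterClass objects (e.g. AMDGPU::SReg_64RegClass), not an enum.
enum class MaskRegClass { SReg_32, SReg_64 };

// Hypothetical helper: pick the scalar register class that holds one exec-mask
// bit per lane (32 lanes in wave32 mode, 64 lanes in wave64 mode).
constexpr MaskRegClass maskRegClassFor(bool IsWave32) {
  return IsWave32 ? MaskRegClass::SReg_32 : MaskRegClass::SReg_64;
}

// Width in bits of a register from the given stand-in class.
constexpr unsigned maskWidthInBits(MaskRegClass RC) {
  return RC == MaskRegClass::SReg_32 ? 32u : 64u;
}

// The two sites flagged in the review: the class passed to
// createVirtualRegister and the class passed to scavengeRegisterBackwards.
struct MaskRegPlan {
  MaskRegClass VirtRC; // class for the temporary virtual register
  MaskRegClass ScavRC; // class requested from the register scavenger
};

// Hypothetical helper: derive both classes from a single query so the
// temporary and its scavenged replacement cannot disagree on width.
inline MaskRegPlan planMaskRegs(bool IsWave32) {
  const MaskRegClass RC = maskRegClassFor(IsWave32);
  assert(maskWidthInBits(RC) == (IsWave32 ? 32u : 64u));
  return MaskRegPlan{RC, RC};
}
```

With something like this, planMaskRegs(true) would request 32-bit registers at both sites and planMaskRegs(false) would keep the current 64-bit behaviour.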
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99507/new/
https://reviews.llvm.org/D99507