[PATCH] D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask.

Carl Ritson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 29 17:15:32 PDT 2021


critson added a comment.

In D99507#2656780 <https://reviews.llvm.org/D99507#2656780>, @arsenm wrote:

> Instead of having a fixup patch to avoid cases where this happens, we should have the infrastructure to stop this from happening in the first place

This is my position as well.



================
Comment at: llvm/lib/Target/AMDGPU/SIAvoidZeroExecMask.cpp:201
+
+  Register Tmp = MRI.createVirtualRegister(&AMDGPU::SReg_64RegClass);
+  MachineInstr *Cloned = nullptr;
----------------
Presumably this needs to depend on IsWave32.


================
Comment at: llvm/lib/Target/AMDGPU/SIAvoidZeroExecMask.cpp:221
+  unsigned Scav = RS->scavengeRegisterBackwards(
+      AMDGPU::SReg_64RegClass, MachineBasicBlock::iterator(Cloned), false, 0);
+  MRI.replaceRegWith(Tmp, Scav);
----------------
This also needs to depend on IsWave32.
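
To illustrate both comments: a minimal, self-contained sketch of choosing the register class from the wave size, so the same choice can feed both the virtual register creation and the scavenger call. `RegClass`, `Subtarget`, `execMaskRegClass` and `execMaskBits` are hypothetical stand-ins written for this example, not the LLVM API; in the patch the choice would be between `AMDGPU::SReg_32RegClass` and `AMDGPU::SReg_64RegClass`, assuming a 32-bit exec mask in wave32 mode.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real LLVM types, so the sketch compiles
// on its own. They only model the two scalar register classes involved.
enum class RegClass { SReg_32, SReg_64 };

struct Subtarget {
  bool IsWave32;
  bool isWave32() const { return IsWave32; }
};

// Pick the scalar register class wide enough to hold one copy of the
// exec mask: 32 bits in wave32 mode, 64 bits in wave64 mode.
RegClass execMaskRegClass(const Subtarget &ST) {
  return ST.isWave32() ? RegClass::SReg_32 : RegClass::SReg_64;
}

// Width in bits of a register from the given class.
unsigned execMaskBits(RegClass RC) {
  return RC == RegClass::SReg_32 ? 32u : 64u;
}
```

Computing the class once and passing it to both call sites keeps the temporary register and the scavenged register in the same class.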


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99507/new/

https://reviews.llvm.org/D99507


