[PATCH] D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask.

Michael Liao via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 29 07:38:09 PDT 2021


hliao created this revision.
hliao added reviewers: rampitec, arsenm, critson, tpr, foad, sameerds, sebastian-ne.
Herald added subscribers: kerbowa, hiraditya, t-tye, dstuttard, yaxunl, mgorny, nhaehnle, jvesely, kzhuravl, qcolombet, MatzeB.
hliao requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.

- For blocks where the exec mask is restored from a reloaded mask, a zero exec mask results in undefined behavior because the SGPR reload uses `v_readfirstlane`. Avoid such cases by transforming `s_cbranch_execz` and `s_cbranch_execnz` into equivalent branches without evaluating the exec mask too eagerly.
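
To illustrate why a zero exec mask is a problem for such a reload, here is a minimal toy model in C++. It is not code from this patch and not a hardware specification. The names `kWaveSize`, `Vgpr`, `firstActiveLane`, and `reloadSgpr` are hypothetical, and the sketch assumes that a reload ending in `v_readfirstlane` only yields a meaningful value when at least one lane is active, so the undefined case is modeled as an empty `std::optional`.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Toy model only: not LLVM code and not a hardware specification.
constexpr int kWaveSize = 64;                  // assumed wave size
using Vgpr = std::array<uint32_t, kWaveSize>;  // one 32-bit value per lane

// Index of the lowest set bit in the exec mask,
// or std::nullopt when no lane is active.
std::optional<int> firstActiveLane(uint64_t exec) {
  for (int lane = 0; lane < kWaveSize; ++lane)
    if (exec & (uint64_t{1} << lane))
      return lane;
  return std::nullopt;
}

// Hypothetical stand-in for an SGPR reload that ends in v_readfirstlane.
// Assumption: the result is only meaningful when some lane is active,
// so exec == 0 is modeled as "no defined value".
std::optional<uint32_t> reloadSgpr(const Vgpr &v, uint64_t exec) {
  if (auto lane = firstActiveLane(exec))
    return v[*lane];
  return std::nullopt;
}
```

In this model, a block that restores exec from `reloadSgpr` has nothing defined to restore if it is entered with `exec == 0`, which is the situation the pass is meant to avoid.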


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D99507

Files:
  llvm/lib/Target/AMDGPU/AMDGPU.h
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  llvm/lib/Target/AMDGPU/CMakeLists.txt
  llvm/lib/Target/AMDGPU/SIAvoidZeroExecMask.cpp
  llvm/test/CodeGen/AMDGPU/control-flow-fastregalloc.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D99507.333866.patch
Type: text/x-patch
Size: 11140 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210329/6f6c4d21/attachment.bin>

