[PATCH] D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3.

Thomas Symalla via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 14 01:37:04 PST 2022


tsymalla created this revision.
tsymalla added a reviewer: foad.
Herald added subscribers: kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl, arsenm.
tsymalla requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.

On GFX10.3 targets, the following instruction sequence

v_cmp_* SGPR, ...
s_and_saveexec ..., SGPR

leads to a fairly long stall caused by a VALU write to a SGPR and having the
following SALU wait for the SGPR.

An equivalent sequence is to save the exec mask manually instead of letting
s_and_saveexec do the work and use a v_cmpx instruction instead to do the
comparison.

This patch modifies the SIOptimizeExecMasking pass as this is the last position
where s_and_saveexec instructions are inserted. It does the transformation by
trying to find the pattern, extracting the operands and generating the new
instruction sequence.

It also changes some existing lit tests and introduces a few new tests to show
the changed behavior on GFX10.3 targets.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D119696

Files:
  llvm/lib/Target/AMDGPU/SIInstrInfo.h
  llvm/lib/Target/AMDGPU/SIInstrInfo.td
  llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
  llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
  llvm/lib/Target/AMDGPU/VOPCInstructions.td
  llvm/test/CodeGen/AMDGPU/cgp-addressing-modes-gfx1030.ll
  llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx.ll
  llvm/test/CodeGen/AMDGPU/wqm.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D119696.408354.patch
Type: text/x-patch
Size: 25054 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220214/e2e27bd9/attachment-0001.bin>


More information about the llvm-commits mailing list