[all-commits] [llvm/llvm-project] 011c64: [AMDGPU] Improve v_cmpx usage on GFX10.3.
Thomas Symalla via All-commits
all-commits at lists.llvm.org
Mon Mar 21 01:32:12 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 011c64191ef9ccc6538d52f4b57f98f37d4ea36e
https://github.com/llvm/llvm-project/commit/011c64191ef9ccc6538d52f4b57f98f37d4ea36e
Author: Thomas Symalla <thomas.symalla at amd.com>
Date: 2022-03-21 (Mon, 21 Mar 2022)
Changed paths:
M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
M llvm/lib/Target/AMDGPU/SIInstrInfo.h
M llvm/lib/Target/AMDGPU/SIInstrInfo.td
M llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
M llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
M llvm/lib/Target/AMDGPU/VOPCInstructions.td
M llvm/test/CodeGen/AMDGPU/branch-relaxation-gfx10-branch-offset-bug.ll
A llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx.ll
A llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx.mir
M llvm/test/CodeGen/AMDGPU/wqm.ll
Log Message:
-----------
[AMDGPU] Improve v_cmpx usage on GFX10.3.
On GFX10.3 targets, the following instruction sequence

  v_cmp_* SGPR, ...
  s_and_saveexec ..., SGPR

leads to a fairly long stall, because the VALU writes an SGPR and the
following SALU instruction has to wait for that SGPR.
An equivalent sequence saves the exec mask manually, instead of letting
s_and_saveexec do the work, and uses a v_cmpx instruction to do the
comparison.
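
The equivalence claimed above can be checked with a small bitmask model. This is only an illustrative sketch, not code from the patch: it assumes a 32-lane wave, a less-than comparison, and that nothing else reads the intermediate SGPR written by v_cmp. The function names (lane_compare, original_sequence, rewritten_sequence) are hypothetical.

```python
WAVE_SIZE = 32  # assumption: wave32 mode


def lane_compare(exec_mask, a, b):
    """Per-lane a[i] < b[i]; inactive lanes contribute a 0 bit."""
    mask = 0
    for lane in range(WAVE_SIZE):
        if (exec_mask >> lane) & 1 and a[lane] < b[lane]:
            mask |= 1 << lane
    return mask


def original_sequence(exec_mask, a, b):
    """Model of: v_cmp_* SGPR, ... ; s_and_saveexec saved, SGPR."""
    sgpr = lane_compare(exec_mask, a, b)  # VALU writes the SGPR
    saved = exec_mask                     # s_and_saveexec saves exec ...
    new_exec = sgpr & exec_mask           # ... then ANDs the SGPR into exec
    return saved, new_exec


def rewritten_sequence(exec_mask, a, b):
    """Model of: s_mov saved, exec ; v_cmpx_* ..."""
    saved = exec_mask                         # manual save of exec
    new_exec = lane_compare(exec_mask, a, b)  # v_cmpx writes exec directly
    return saved, new_exec


if __name__ == "__main__":
    a = list(range(WAVE_SIZE))
    b = [16] * WAVE_SIZE
    exec_mask = 0xFFFF00FF
    print(original_sequence(exec_mask, a, b) == rewritten_sequence(exec_mask, a, b))
```

In this model both sequences return the same saved mask and the same new exec mask; the difference is that the rewritten form never routes the comparison result through an SGPR, which is the dependency the commit message identifies as the cause of the stall.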
This patch modifies the SIOptimizeExecMasking pass as this is the last position
where s_and_saveexec instructions are inserted. It does the transformation by
trying to find the pattern, extracting the operands and generating the new
instruction sequence.
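
The find-extract-regenerate steps described above can be sketched as a toy peephole rewrite over a list of instruction tuples. This is not the actual SIOptimizeExecMasking code: the tuple encoding, the rewrite_saveexec helper, the wave32 opcode spellings (s_and_saveexec_b32, s_mov_b32, exec_lo), and the restriction to directly adjacent instructions are all assumptions made for illustration. The sketch also assumes the compare result SGPR has no other users; a real pass would have to verify that before rewriting.

```python
from typing import List, Tuple

Instr = Tuple[str, ...]  # (opcode, operand, operand, ...), a hypothetical encoding


def rewrite_saveexec(block: List[Instr]) -> List[Instr]:
    """Replace 'v_cmp_* sN, a, b ; s_and_saveexec_b32 sM, sN' pairs
    with 's_mov_b32 sM, exec_lo ; v_cmpx_* a, b'."""
    out: List[Instr] = []
    i = 0
    while i < len(block):
        cur = block[i]
        nxt = block[i + 1] if i + 1 < len(block) else None
        matches = (
            nxt is not None
            and cur[0].startswith("v_cmp_")
            and len(cur) == 4
            and nxt[0] == "s_and_saveexec_b32"
            and len(nxt) == 3
            and nxt[2] == cur[1]  # saveexec reads the SGPR the compare wrote
        )
        if matches:
            _, _cmp_dst, src0, src1 = cur
            save_dst = nxt[1]
            # Save exec manually, then let v_cmpx write exec directly.
            out.append(("s_mov_b32", save_dst, "exec_lo"))
            out.append(("v_cmpx_" + cur[0][len("v_cmp_"):], src0, src1))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


if __name__ == "__main__":
    block = [
        ("v_cmp_lt_i32", "s0", "v0", "v1"),
        ("s_and_saveexec_b32", "s1", "s0"),
        ("s_cbranch_execz", "BB0_2"),
    ]
    for instr in rewrite_saveexec(block):
        print(instr[0], ", ".join(instr[1:]))
```

Pairs where s_and_saveexec reads a different SGPR than the compare wrote are left untouched, which mirrors the "trying to find the pattern" wording: the rewrite applies only when the pattern is actually present.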
It also changes some existing lit tests and introduces a few new tests to show
the changed behavior on GFX10.3 targets.
Reviewed By: sebastian-ne, critson
Differential Revision: https://reviews.llvm.org/D119696