[PATCH] D118975: [AMDGPU] Allow hoisting of some VALU compare instructions
Carl Ritson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 4 01:18:36 PST 2022
critson created this revision.
critson added reviewers: foad, rampitec, vangthao, sebastian-ne.
Herald added subscribers: kerbowa, asbirlea, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl, arsenm.
critson requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.
Conservatively allow hoisting/sinking of VALU comparisons.
If the result of a comparison is masked with exec, narrowing the
set of active lanes, then it is safe to hoist it, as the masking
instruction will never be hoisted.
Heuristically this is also true for sinking, as we do not expect
the result of a sunk comparison that is masked with exec to be
used outside of the loop.
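
As a rough illustration of the rule described above, here is a minimal standalone sketch. It is not the patch itself: the Opcode enum, the Use struct and compareReadsExecAsData are hypothetical stand-ins for the real MachineInstr use walk, and only the 64-bit opcodes are modelled. A compare is treated as movable only if every use of its result is an exec-masking AND.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for the opcode of an instruction that
// uses the compare result. Only the cases relevant to the rule are modelled.
enum class Opcode { S_AND_SAVEEXEC_B64, S_AND_B64, S_OR_B64, Other };

// Hypothetical stand-in for one use of the compare's destination register.
struct Use {
  Opcode Op;
  bool ReadsExec; // true if the using instruction also reads exec
};

// Returns true if the compare must be treated as reading exec as data,
// i.e. it must not be hoisted or sunk. Returns false only when every use
// masks the result with exec (and so narrows the set of active lanes).
bool compareReadsExecAsData(const std::vector<Use> &Uses) {
  for (const Use &U : Uses) {
    switch (U.Op) {
    case Opcode::S_AND_SAVEEXEC_B64:
      // Always ANDs its operand with exec.
      break;
    case Opcode::S_AND_B64:
      // Only acceptable when the other operand is exec.
      if (!U.ReadsExec)
        return true;
      break;
    default:
      // Any other use may observe lanes that were inactive at the compare.
      return true;
    }
  }
  return false;
}
```

With this model, a single S_AND_B64 use that reads exec leaves the compare movable, while an S_OR_B64 use (as in the first test case of licm-valu.mir) keeps it pinned. A compare with no uses is also reported as movable, which mirrors the fall-through in the patch.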
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D118975
Files:
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
llvm/test/CodeGen/AMDGPU/licm-valu.mir
Index: llvm/test/CodeGen/AMDGPU/licm-valu.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/licm-valu.mir
+++ llvm/test/CodeGen/AMDGPU/licm-valu.mir
@@ -47,7 +47,7 @@
; GCN-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
; GCN-NEXT: {{ $}}
; GCN-NEXT: [[V_CMP_EQ_U32_e64_:%[0-9]+]]:sreg_64 = V_CMP_EQ_U32_e64 1, 2, implicit $exec
- ; GCN-NEXT: $exec = S_OR_B64 $exec, 1, implicit-def $scc
+ ; GCN-NEXT: $exec = S_OR_B64 $exec, [[V_CMP_EQ_U32_e64_]], implicit-def $scc
; GCN-NEXT: S_CBRANCH_EXECNZ %bb.1, implicit $exec
; GCN-NEXT: S_BRANCH %bb.2
; GCN-NEXT: {{ $}}
@@ -58,7 +58,39 @@
bb.1:
%0:sreg_64 = V_CMP_EQ_U32_e64 1, 2, implicit $exec
- $exec = S_OR_B64 $exec, 1, implicit-def $scc
+ $exec = S_OR_B64 $exec, %0:sreg_64, implicit-def $scc
+ S_CBRANCH_EXECNZ %bb.1, implicit $exec
+ S_BRANCH %bb.2
+
+ bb.2:
+ S_ENDPGM 0
+...
+---
+name: allowable_hoist_cmp
+tracksRegLiveness: true
+body: |
+ ; GCN-LABEL: name: allowable_hoist_cmp
+ ; GCN: bb.0:
+ ; GCN-NEXT: successors: %bb.1(0x80000000)
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: [[V_CMP_EQ_U32_e64_:%[0-9]+]]:sreg_64 = V_CMP_EQ_U32_e64 1, 2, implicit $exec
+ ; GCN-NEXT: S_BRANCH %bb.1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: bb.1:
+ ; GCN-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: $exec = S_AND_B64 $exec, [[V_CMP_EQ_U32_e64_]], implicit-def $scc
+ ; GCN-NEXT: S_CBRANCH_EXECNZ %bb.1, implicit $exec
+ ; GCN-NEXT: S_BRANCH %bb.2
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: bb.2:
+ ; GCN-NEXT: S_ENDPGM 0
+ bb.0:
+ S_BRANCH %bb.1
+
+ bb.1:
+ %0:sreg_64 = V_CMP_EQ_U32_e64 1, 2, implicit $exec
+ $exec = S_AND_B64 $exec, %0:sreg_64, implicit-def $scc
S_CBRANCH_EXECNZ %bb.1, implicit $exec
S_BRANCH %bb.2
Index: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -131,8 +131,25 @@
}
static bool readsExecAsData(const MachineInstr &MI) {
- if (MI.isCompare())
- return true;
+ if (MI.isCompare()) {
+ const MachineRegisterInfo &MRI = MI.getParent()->getParent()->getRegInfo();
+ Register DstReg = MI.getOperand(0).getReg();
+ for (MachineInstr &Use : MRI.use_instructions(DstReg)) {
+ switch (Use.getOpcode()) {
+ case AMDGPU::S_AND_SAVEEXEC_B32:
+ case AMDGPU::S_AND_SAVEEXEC_B64:
+ break;
+ case AMDGPU::S_AND_B32:
+ case AMDGPU::S_AND_B64:
+ if (!Use.readsRegister(AMDGPU::EXEC))
+ return true;
+ break;
+ default:
+ return true;
+ }
+ }
+ return false;
+ }
switch (MI.getOpcode()) {
default: