[PATCH] D101546: [AMDGPU] Fix v_swap_b32 formation on physical registers
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 29 09:55:54 PDT 2021
foad created this revision.
foad added reviewers: rampitec, arsenm.
Herald added subscribers: kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl.
foad requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.
As explained in the comments, matchSwap matches:
// mov t, x
// mov x, y
// mov y, t
and turns it into:
// mov t, x (t is potentially dead and move eliminated)
// v_swap_b32 x, y
On physical registers we don't have full use-def chains so the check
for T being live-out was not working properly with subregs/superregs.
Instead we have to look for uses of T and rely on kill flags.
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D101546
Files:
llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
llvm/test/CodeGen/AMDGPU/v_swap_b32.mir
Index: llvm/test/CodeGen/AMDGPU/v_swap_b32.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/v_swap_b32.mir
+++ llvm/test/CodeGen/AMDGPU/v_swap_b32.mir
@@ -46,13 +46,13 @@
liveins: $vgpr0, $vgpr1, $sgpr30_sgpr31
$vgpr2 = V_MOV_B32_e32 killed $vgpr0, implicit $exec
$vgpr0 = V_MOV_B32_e32 killed $vgpr1, implicit $exec
- $vgpr1 = V_MOV_B32_e32 killed $vgpr2, implicit $exec
+ $vgpr1 = V_MOV_B32_e32 $vgpr2, implicit $exec
S_SETPC_B64_return $sgpr30_sgpr31, implicit $vgpr0, implicit $vgpr2, implicit $vgpr1
...
-# FIXME: should not remove the def of $vgpr2 because $vgpr2_vgpr3 is live out
# GCN-LABEL: name: swap_phys_liveout_superreg
# GCN: bb.0:
+# GCN-NEXT: $vgpr2 = V_MOV_B32_e32 $vgpr0, implicit $exec
# GCN-NEXT: $vgpr0, $vgpr1 = V_SWAP_B32 $vgpr1, $vgpr0, implicit $exec
# GCN-NEXT: S_SETPC_B64_return
---
@@ -62,7 +62,7 @@
liveins: $vgpr0, $vgpr1, $sgpr30_sgpr31
$vgpr2 = V_MOV_B32_e32 killed $vgpr0, implicit $exec
$vgpr0 = V_MOV_B32_e32 killed $vgpr1, implicit $exec
- $vgpr1 = V_MOV_B32_e32 killed $vgpr2, implicit $exec
+ $vgpr1 = V_MOV_B32_e32 $vgpr2, implicit $exec
S_SETPC_B64_return $sgpr30_sgpr31, implicit $vgpr0, implicit $vgpr2_vgpr3, implicit $vgpr1
...
Index: llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
+++ llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
@@ -491,6 +491,7 @@
const unsigned SearchLimit = 16;
unsigned Count = 0;
bool KilledT = false;
+ bool UsedT = false;
for (auto Iter = std::next(MovT.getIterator()),
E = MovT.getParent()->instr_end();
Iter != E && Count < SearchLimit && !KilledT; ++Iter, ++Count) {
@@ -498,6 +499,9 @@
MachineInstr *MovY = &*Iter;
KilledT = MovY->killsRegister(T, &TRI);
+ bool PrevUsedT = UsedT;
+ UsedT = UsedT || instReadsReg(MovY, T, Tsub, TRI);
+
if ((MovY->getOpcode() != AMDGPU::V_MOV_B32_e32 &&
MovY->getOpcode() != AMDGPU::COPY) ||
!MovY->getOperand(1).isReg() ||
@@ -573,7 +577,10 @@
dropInstructionKeepingImpDefs(*MovY, TII);
MachineInstr *Next = &*std::next(MovT.getIterator());
- if (MRI.use_nodbg_empty(T)) {
+ if (T.isVirtual() ? MRI.use_nodbg_empty(T)
+ : (MovY->getOperand(1).isKill() && !PrevUsedT)) {
+ // For physical registers: the use of T in MovY is killed, and we didn't
+ // see any other uses of T before MovY.
dropInstructionKeepingImpDefs(MovT, TII);
} else {
Xop.setIsKill(false);
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D101546.341556.patch
Type: text/x-patch
Size: 2660 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210429/3a4788bf/attachment.bin>
More information about the llvm-commits
mailing list