[PATCH] D149873: [AMDGPU][GFX908] IndirectCopyToAGPR: Confirm modified register is dst reg of accvgpr_write
Jeffrey Byrnes via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri May 12 09:49:10 PDT 2023
jrbyrnes updated this revision to Diff 521694.
jrbyrnes added a comment.
Remove redundant check + rebase on top of precommit test.
The test shows that indirectCopyToAGPR has incorrectly determined that (at the time of COPY agpr1_agpr2) we can use vgpr0 as having the same bits as agpr1 AND agpr2 (in reality, of the two, it is only consistent with agpr1). We incorrectly make the determination by our assumption of what "definesRegister" means.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D149873/new/
https://reviews.llvm.org/D149873
Files:
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir
Index: llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir
+++ llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir
@@ -900,9 +900,11 @@
; GFX908-NEXT: $vgpr1 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec, implicit $agpr0_agpr1
; GFX908-NEXT: $agpr2 = V_ACCVGPR_WRITE_B32_e64 $vgpr1, implicit $exec, implicit-def $agpr1_agpr2
; GFX908-NEXT: $vgpr0 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit $agpr0_agpr1
- ; GFX908-NEXT: $agpr1 = V_ACCVGPR_WRITE_B32_e64 $vgpr0, implicit $exec, implicit $exec, implicit-def $agpr1_agpr2
- ; GFX908-NEXT: $agpr4 = V_ACCVGPR_WRITE_B32_e64 $vgpr0, implicit $exec, implicit-def $agpr3_agpr4, implicit $agpr1_agpr2
- ; GFX908-NEXT: $agpr3 = V_ACCVGPR_WRITE_B32_e64 $vgpr0, implicit $exec, implicit killed $agpr1_agpr2, implicit $exec
+ ; GFX908-NEXT: $agpr1 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr0, implicit $exec, implicit $exec, implicit-def $agpr1_agpr2
+ ; GFX908-NEXT: $vgpr0 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr1_agpr2
+ ; GFX908-NEXT: $agpr4 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr0, implicit $exec, implicit-def $agpr3_agpr4
+ ; GFX908-NEXT: $vgpr255 = V_ACCVGPR_READ_B32_e64 killed $agpr1, implicit $exec, implicit killed $agpr1_agpr2
+ ; GFX908-NEXT: $agpr3 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr255, implicit $exec, implicit $exec
; GFX90A-LABEL: name: a2_to_a2_implicit_defs
; GFX90A: liveins: $agpr0_agpr1
; GFX90A-NEXT: {{ $}}
@@ -1000,6 +1002,9 @@
S_ENDPGM 0, implicit $agpr0_agpr1_agpr2, implicit $vgpr1
...
+
+
+
---
name: a4_to_a4
tracksRegLiveness: true
Index: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -578,11 +578,20 @@
if (!RegsOverlap) {
for (auto Def = MI, E = MBB.begin(); Def != E; ) {
--Def;
- if (!Def->definesRegister(SrcReg, &RI))
+
+ if (!Def->modifiesRegister(SrcReg, &RI))
continue;
+
if (Def->getOpcode() != AMDGPU::V_ACCVGPR_WRITE_B32_e64)
break;
+ // The 0th operand of ACCVGPR_WRITE on gfx908 will always be the register
+ // that potentially contains the bits we are interested in
+ assert(Def->getOperand(0).isReg());
+
+ if (Def->getOperand(0).getReg() != SrcReg)
+ break;
+
MachineOperand &DefOp = Def->getOperand(1);
assert(DefOp.isReg() || DefOp.isImm());
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D149873.521694.patch
Type: text/x-patch
Size: 2589 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230512/64a097c6/attachment.bin>
More information about the llvm-commits
mailing list