[PATCH] D149873: [AMDGPU][GFX908] IndirectCopyToAGPR: Confirm modified register is dst reg of accvgpr_write

Jeffrey Byrnes via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 12 09:49:10 PDT 2023


jrbyrnes updated this revision to Diff 521694.
jrbyrnes added a comment.

Remove redundant check + rebase on top of precommit test.

The test shows that indirectCopyToAGPR has incorrectly determined that (at the time of COPY agpr1_agpr2) we can use vgpr0 as having the same bits as agpr1 AND agpr2 (in reality, of the two, it is only consistent with agpr1). We incorrectly make the determination by our assumption of what "definesRegister" means.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149873/new/

https://reviews.llvm.org/D149873

Files:
  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
  llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir


Index: llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir
+++ llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir
@@ -900,9 +900,11 @@
     ; GFX908-NEXT: $vgpr1 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec, implicit $agpr0_agpr1
     ; GFX908-NEXT: $agpr2 = V_ACCVGPR_WRITE_B32_e64 $vgpr1, implicit $exec, implicit-def $agpr1_agpr2
     ; GFX908-NEXT: $vgpr0 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit $agpr0_agpr1
-    ; GFX908-NEXT: $agpr1 = V_ACCVGPR_WRITE_B32_e64 $vgpr0, implicit $exec, implicit $exec, implicit-def $agpr1_agpr2
-    ; GFX908-NEXT: $agpr4 = V_ACCVGPR_WRITE_B32_e64 $vgpr0, implicit $exec, implicit-def $agpr3_agpr4, implicit $agpr1_agpr2
-    ; GFX908-NEXT: $agpr3 = V_ACCVGPR_WRITE_B32_e64 $vgpr0, implicit $exec, implicit killed $agpr1_agpr2, implicit $exec
+    ; GFX908-NEXT: $agpr1 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr0, implicit $exec, implicit $exec, implicit-def $agpr1_agpr2
+    ; GFX908-NEXT: $vgpr0 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr1_agpr2
+    ; GFX908-NEXT: $agpr4 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr0, implicit $exec, implicit-def $agpr3_agpr4
+    ; GFX908-NEXT: $vgpr255 = V_ACCVGPR_READ_B32_e64 killed $agpr1, implicit $exec, implicit killed $agpr1_agpr2
+    ; GFX908-NEXT: $agpr3 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr255, implicit $exec, implicit $exec
     ; GFX90A-LABEL: name: a2_to_a2_implicit_defs
     ; GFX90A: liveins: $agpr0_agpr1
     ; GFX90A-NEXT: {{  $}}
@@ -1000,6 +1002,9 @@
     S_ENDPGM 0, implicit $agpr0_agpr1_agpr2, implicit $vgpr1
 ...
 
+
+
+
 ---
 name:            a4_to_a4
 tracksRegLiveness: true
Index: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -578,11 +578,20 @@
   if (!RegsOverlap) {
     for (auto Def = MI, E = MBB.begin(); Def != E; ) {
       --Def;
-      if (!Def->definesRegister(SrcReg, &RI))
+
+      if (!Def->modifiesRegister(SrcReg, &RI))
         continue;
+
       if (Def->getOpcode() != AMDGPU::V_ACCVGPR_WRITE_B32_e64)
         break;
 
+      // The 0th operand of ACCVGPR_WRITE on gfx908 will always be the register
+      // that potentially contains the bits we are interested in
+      assert(Def->getOperand(0).isReg());
+
+      if (Def->getOperand(0).getReg() != SrcReg)
+        break;
+
       MachineOperand &DefOp = Def->getOperand(1);
       assert(DefOp.isReg() || DefOp.isImm());
 


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D149873.521694.patch
Type: text/x-patch
Size: 2589 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230512/64a097c6/attachment.bin>


More information about the llvm-commits mailing list