[PATCH] D130313: [AMDGPU] Avoid flushing the vmcnt counter in loop preheaders if not necessary

Baptiste Saleil via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 21 16:11:39 PDT 2022


bsaleil created this revision.
bsaleil added reviewers: foad, nhaehnle, AMDGPU.
bsaleil added projects: LLVM, AMDGPU.
Herald added subscribers: kosarev, jsilvanus, kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, jvesely, kzhuravl, arsenm.
Herald added a project: All.
bsaleil requested review of this revision.
Herald added subscribers: llvm-commits, wdng.

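One of the conditions for flushing the vmcnt counter in a loop preheader is that the loop contains a use of a vgpr that is defined outside the loop.
The code currently decides whether that use needs a waitcnt by checking only whether the vgpr has a nonzero VM_CNT score in the score brackets. This is not sufficient, because a nonzero score does not by itself mean a wait is still required, so an unnecessary vmcnt flush may be generated. This patch fixes that case by asking the score brackets (via determineWait) whether a vmcnt wait is actually needed for that score.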
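
To illustrate the difference between the two checks, here is a minimal standalone sketch. It is a toy model, not the code in SIInsertWaitcnts.cpp: `ToyBracket`, `oldCheck`, `newCheck` and `preheaderScenario` are hypothetical names, and the model assumes that a score needs a wait only while it lies in the pending range (LB, UB]. It ignores out-of-order counters and pending FLAT accesses.

```cpp
// Toy model of a single counter's score bracket (hypothetical, simplified).
struct ToyBracket {
  unsigned LB = 0; // events with score <= LB are assumed to have completed
  unsigned UB = 0; // score of the most recently issued event

  // Issue a new memory operation and return the score given to its result.
  unsigned issue() { return ++UB; }

  // Model a waitcnt that covers every event up to and including Score.
  void waitUntil(unsigned Score) {
    if (Score > LB && Score <= UB)
      LB = Score;
  }
};

// Old condition: any nonzero score is treated as "loaded outside the loop".
bool oldCheck(unsigned Score) { return Score > 0; }

// New condition (modelled): a wait is needed only if the score is still
// inside the pending range of the bracket.
bool newCheck(const ToyBracket &B, unsigned Score) {
  return Score > B.LB && Score <= B.UB;
}

struct ScenarioResult {
  bool OldWantsFlush;
  bool NewWantsFlush;
};

// Loosely mirrors the shape of the new test: a first load defines a vgpr,
// a wait covers it before a second load is issued, then the loop uses the
// first vgpr again.
ScenarioResult preheaderScenario() {
  ToyBracket B;
  unsigned FirstLoad = B.issue(); // score 1
  B.waitUntil(FirstLoad);         // wait already covers the first load
  B.issue();                      // second load, score 2, still pending
  return {oldCheck(FirstLoad), newCheck(B, FirstLoad)};
}
```

In this model the old check still requests a flush for a vgpr whose load has already been waited on, while the new check does not.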


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D130313

Files:
  llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
  llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
  llvm/test/CodeGen/AMDGPU/waitcnt-vmcnt-loop.mir
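
The new test checks for the raw immediates `S_WAITCNT 16240` (GFX10) and `S_WAITCNT 3952` (GFX9). As a reading aid, the sketch below decodes them. It is not part of the patch, and `DecodedWaitcnt`, `decodeGfx9` and `decodeGfx10` are hypothetical helpers. The field layout is an assumption based on my understanding of the encoding: vmcnt in bits [3:0] plus [15:14], expcnt in bits [6:4], and lgkmcnt in bits [11:8] on GFX9 or [13:8] on GFX10.

```cpp
// Decoded fields of an S_WAITCNT immediate (hypothetical helper type).
struct DecodedWaitcnt {
  unsigned VmCnt;
  unsigned ExpCnt;
  unsigned LgkmCnt;
};

// vmcnt is split: low 4 bits at [3:0], high 2 bits at [15:14] (assumed layout).
constexpr unsigned decodeVmCnt(unsigned Imm) {
  return (Imm & 0xF) | (((Imm >> 14) & 0x3) << 4);
}

// GFX9: lgkmcnt is assumed to be 4 bits wide at [11:8].
constexpr DecodedWaitcnt decodeGfx9(unsigned Imm) {
  return {decodeVmCnt(Imm), (Imm >> 4) & 0x7, (Imm >> 8) & 0xF};
}

// GFX10: lgkmcnt is assumed to be 6 bits wide at [13:8].
constexpr DecodedWaitcnt decodeGfx10(unsigned Imm) {
  return {decodeVmCnt(Imm), (Imm >> 4) & 0x7, (Imm >> 8) & 0x3F};
}
```

Under that layout both immediates decode to vmcnt = 0 with expcnt and lgkmcnt at their maximum values, which would mean a wait on vmcnt only.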


Index: llvm/test/CodeGen/AMDGPU/waitcnt-vmcnt-loop.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/waitcnt-vmcnt-loop.mir
+++ llvm/test/CodeGen/AMDGPU/waitcnt-vmcnt-loop.mir
@@ -535,3 +535,40 @@
     S_ENDPGM 0
 
 ...
+---
+
+# This test case checks that we flush the vmcnt counter only if necessary
+# (i.e. if a waitcnt is needed for the vgpr use we find in the loop)
+
+# GFX10-LABEL: waitcnt_vm_necessary
+# GFX10-LABEL: bb.0:
+# GFX10: S_WAITCNT 16240
+# GFX10: renamable $vgpr4
+# GFX10-NOT: S_WAITCNT 16240
+# GFX10-LABEL: bb.1:
+# GFX10-NOT: S_WAITCNT 16240
+
+# GFX9-LABEL: waitcnt_vm_necessary
+# GFX9-LABEL: bb.0:
+# GFX9: S_WAITCNT 3952
+# GFX9: renamable $vgpr4
+# GFX9-NOT: S_WAITCNT 3952
+# GFX9-LABEL: bb.1:
+# GFX9-NOT: S_WAITCNT 3952
+
+name:            waitcnt_vm_necessary
+body:             |
+  bb.0:
+    successors: %bb.1(0x80000000)
+
+    renamable $vgpr0_vgpr1_vgpr2_vgpr3 = GLOBAL_LOAD_DWORDX4 killed renamable $vgpr0_vgpr1, 0, 0, implicit $exec
+    renamable $vgpr4 = BUFFER_LOAD_DWORD_OFFEN undef renamable $vgpr0, undef renamable $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, 0, implicit $exec
+
+  bb.1:
+    successors: %bb.1(0x40000000)
+
+    renamable $vgpr5 = BUFFER_LOAD_DWORD_OFFEN undef renamable $vgpr0, renamable $sgpr4_sgpr5_sgpr6_sgpr7, 0, 0, 0, 0, 0, implicit $exec
+    S_CBRANCH_SCC1 %bb.1, implicit killed $scc
+    S_ENDPGM 0
+
+...
Index: llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
===================================================================
--- llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+++ llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
@@ -569,6 +569,10 @@
     return VsCnt != ~0u;
   }
 
+  bool hasWaitVmCnt() const {
+    return VmCnt != ~0u;
+  }
+
   bool dominates(const Waitcnt &Other) const {
     return VmCnt <= Other.VmCnt && ExpCnt <= Other.ExpCnt &&
            LgkmCnt <= Other.LgkmCnt && VsCnt <= Other.VsCnt;
Index: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -1739,7 +1739,10 @@
             VgprUse.insert(RegNo);
             // If at least one of Op's registers is in the score brackets, the
             // value is likely loaded outside of the loop.
-            if (Brackets.getRegScore(RegNo, VM_CNT) > 0) {
+            unsigned Score = Brackets.getRegScore(RegNo, VM_CNT);
+            AMDGPU::Waitcnt Wait;
+            Brackets.determineWait(VM_CNT, Score, Wait);
+            if (Wait.hasWaitVmCnt()) {
               UsesVgprLoadedOutside = true;
               break;
             }



