[llvm] [AMDGPU] Consider FLAT instructions for VMEM hazard detection (PR #137170)

Robert Imschweiler via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 24 05:41:14 PDT 2025


https://github.com/ro-i created https://github.com/llvm/llvm-project/pull/137170

In general, "Flat instructions look at the per-workitem address and determine for each work item if the target memory address is in global, private or scratch memory." (RDNA2 ISA) That means that FLAT instructions need to be considered for VMEM hazards even without "specific segment". It should not be needed for DMA VMEM/FLAT instructions, though.

See also #137148

>From d8e9dc672be4c9c83de4aef560e2ae6c6b01483f Mon Sep 17 00:00:00 2001
From: Robert Imschweiler <robert.imschweiler at amd.com>
Date: Thu, 24 Apr 2025 07:35:57 -0500
Subject: [PATCH] [AMDGPU] Consider FLAT instructions for VMEM hazard detection

In general, "Flat instructions look at the per-workitem address and
determine for each work item if the target memory address is in global,
private or scratch memory." (RDNA2 ISA) That means that FLAT
instructions need to be considered for VMEM hazards even without
"specific segment". It should not be needed for DMA VMEM/FLAT
instructions, though.
---
 llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp      | 11 ++++++-----
 llvm/test/CodeGen/AMDGPU/lds-branch-vmem-hazard.mir |  8 ++++++--
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp b/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
index aaefe27b1324f..fcafd6a978a4b 100644
--- a/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
@@ -1424,9 +1424,9 @@ static bool shouldRunLdsBranchVmemWARHazardFixup(const MachineFunction &MF,
   bool HasVmem = false;
   for (auto &MBB : MF) {
     for (auto &MI : MBB) {
-      HasLds |= SIInstrInfo::isDS(MI);
-      HasVmem |=
-          SIInstrInfo::isVMEM(MI) || SIInstrInfo::isSegmentSpecificFLAT(MI);
+      HasLds |= SIInstrInfo::isDS(MI) || SIInstrInfo::isLDSDMA(MI);
+      HasVmem |= (SIInstrInfo::isVMEM(MI) || SIInstrInfo::isFLAT(MI)) &&
+                 !SIInstrInfo::isLDSDMA(MI);
       if (HasLds && HasVmem)
         return true;
     }
@@ -1448,9 +1448,10 @@ bool GCNHazardRecognizer::fixLdsBranchVmemWARHazard(MachineInstr *MI) {
   assert(!ST.hasExtendedWaitCounts());
 
   auto IsHazardInst = [](const MachineInstr &MI) {
-    if (SIInstrInfo::isDS(MI))
+    if (SIInstrInfo::isDS(MI) || SIInstrInfo::isLDSDMA(MI))
       return 1;
-    if (SIInstrInfo::isVMEM(MI) || SIInstrInfo::isSegmentSpecificFLAT(MI))
+    if ((SIInstrInfo::isVMEM(MI) || SIInstrInfo::isFLAT(MI)) &&
+        !SIInstrInfo::isLDSDMA(MI))
       return 2;
     return 0;
   };
diff --git a/llvm/test/CodeGen/AMDGPU/lds-branch-vmem-hazard.mir b/llvm/test/CodeGen/AMDGPU/lds-branch-vmem-hazard.mir
index 86e657093b5b2..d3ee1c3c128b3 100644
--- a/llvm/test/CodeGen/AMDGPU/lds-branch-vmem-hazard.mir
+++ b/llvm/test/CodeGen/AMDGPU/lds-branch-vmem-hazard.mir
@@ -269,11 +269,15 @@ body:            |
     S_ENDPGM 0
 ...
 
-# GCN-LABEL: name: no_hazard_lds_branch_flat
+# FLAT_* instructions "look at the per-workitem address and determine for each
+# work item if the target memory address is in global, private or scratch
+# memory" (RDNA2 ISA)
+# GCN-LABEL: name: hazard_lds_branch_flat
 # GCN:      bb.1:
+# GFX10-NEXT: S_WAITCNT_VSCNT undef $sgpr_null, 0
 # GCN-NEXT: FLAT_LOAD_DWORD
 ---
-name:            no_hazard_lds_branch_flat
+name:            hazard_lds_branch_flat
 body:            |
   bb.0:
     successors: %bb.1



More information about the llvm-commits mailing list