[llvm-branch-commits] [llvm] [AMDGPU][SIMemoryLegalizer] Consider scratch operations as NV=1 if GAS is disabled (PR #189556)

via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Tue Mar 31 01:05:11 PDT 2026


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-backend-amdgpu

Author: Pierre van Houtryve (Pierre-vh)

<details>
<summary>Changes</summary>

- Clarify that the `thread-private` MMO flag is still useful.
- If GAS (globally addressable scratch) is not enabled, which is the default as of the previous patch, consider an op as `NV=1` if it uses a `scratch_` opcode or if its MMO is in the private AS.
- Add tests for the new cases.
- Update the GFX12.5 memory model in AMDGPUUsage.
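
As a rough illustration of the rule described in the list above, the following standalone sketch models the decision with plain structs. `MemOperand`, `MemInst` and `isNonVolatile` are hypothetical stand-ins invented for this example, not LLVM types; the real check is `isNonVolatileMemoryAccess` in `SIMemoryLegalizer.cpp`, shown in the diff below.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model of one memory operand (MMO); not an LLVM type.
struct MemOperand {
  bool InPrivateAS;   // MMO is in the private (scratch) address space
  bool ThreadPrivate; // models the MOThreadPrivate flag
  bool Invariant;     // models MachineMemOperand::MOInvariant
};

// Hypothetical model of a memory instruction; not an LLVM type.
struct MemInst {
  bool IsScratchOpcode;                // models a scratch_* opcode
  std::vector<MemOperand> MemOperands; // attached MMOs
};

// Returns true if the access may be treated as NV=1.
bool isNonVolatile(bool GASEnabled, const MemInst &MI) {
  // Without any MMO nothing is known about the access.
  if (MI.MemOperands.empty())
    return false;
  // With GAS disabled, a scratch opcode can only touch thread-local memory.
  if (!GASEnabled && MI.IsScratchOpcode)
    return true;
  // Otherwise every MMO has to qualify on its own.
  return std::all_of(MI.MemOperands.begin(), MI.MemOperands.end(),
                     [&](const MemOperand &MMO) {
                       return (!GASEnabled && MMO.InPrivateAS) ||
                              MMO.ThreadPrivate || MMO.Invariant;
                     });
}
```

In this model, a flat instruction with one private MMO and one generic MMO is never `NV=1`, because each MMO must qualify individually.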

---

Patch is 25.97 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/189556.diff


5 Files Affected:

- (modified) llvm/docs/AMDGPUUsage.rst (+13-6) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+3-1) 
- (modified) llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp (+14-3) 
- (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-non-volatile.ll (+75-36) 
- (added) llvm/test/CodeGen/AMDGPU/memory-legalizer-non-volatile.mir (+181) 


``````````diff
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 388e1b33fe12d..a380499edecfd 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -17350,8 +17350,7 @@ For GFX125x:
 
   This section is currently incomplete as work on the compiler is still ongoing.
   The following is a non-exhaustive list of unimplemented/undocumented features:
-  non-volatile bit code sequences, multicast loads, barriers (including split barriers)
-  and cooperative atomics.
+  multicast loads, barriers (including split barriers) and cooperative atomics.
   Scalar operations memory model needs more elaboration as well.
 
 * Vector memory operations are performed as wavefront wide operations, with the
@@ -17429,6 +17428,10 @@ For GFX125x:
   * When ``nv=0`` reads hit dirty ``$nv=1`` data in cache, the hardware will
     writeback the data to the next level in the hierarchy and then subsequently read
     it again, updating the cache line with a clean ``$nv=0`` copy of the data.
+  * ``nv=1`` is set on operations that are known to access read-only memory, or memory
+    that can only be modified by the current thread. For example, all scratch/private memory
+    (if ``globally-addressable-scratch`` is disabled), scratch memory used for spill/reloads,
+    loads marked as invariant, etc.
 
 * ``global_inv``, ``global_wb`` and ``global_wbinv`` are cache control instructions.
   The affected cache(s) are controlled by the ``SCOPE`` of the instruction.
@@ -17473,10 +17476,14 @@ may change between kernel dispatch executions. See
 
 Atomics in the scratch address space are handled as follows:
 
-* Data types <= 32 bits: The instruction is converted into an atomic in the
-  generic (``flat``) address space. All properties of the atomic
-  (atomic ordering, volatility, alignment, etc.) are preserved.
-  Refer to the generic address space code sequences for further information.
+* Data types <= 32 bits:
+
+  * If ``globally-addressable-scratch`` is used, the instruction is converted into
+    an atomic in the generic (``flat``) address space.  All properties of the atomic
+    (atomic ordering, volatility, alignment, etc.) are preserved.
+    Refer to the generic address space code sequences for further information.
+  * Otherwise, the operation is considered as non-atomic.
+
 * Data types >32 bits: unsupported and an error is emitted.
 
 The code sequences used to implement the memory model for GFX125x are defined in
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index cc0b0408bc09c..3f0ae14b84a02 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -53,7 +53,9 @@ static const MachineMemOperand::Flags MOCooperative =
     MachineMemOperand::MOTargetFlag3;
 
 /// Mark the MMO of accesses to memory locations that are
-/// never written to by other threads.
+/// known to never be written to by other threads, no matter the
+/// target and target features enabled
+/// (e.g. even with globally-addressable scratch enabled).
 static const MachineMemOperand::Flags MOThreadPrivate =
     MachineMemOperand::MOTargetFlag4;
 
diff --git a/llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp b/llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
index a7dcc6d8aa849..7990ac845e45a 100644
--- a/llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
+++ b/llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
@@ -909,11 +909,22 @@ SIMemOpAccess::getLDSDMAInfo(const MachineBasicBlock::iterator &MI) const {
 /// being marked as non-volatile. This means that either they are accessing the
 /// constant address space, are accessing a known invariant memory location, or
 /// that they are marked with the non-volatile metadata/MMO flag.
-static bool isNonVolatileMemoryAccess(const MachineInstr &MI) {
+static bool isNonVolatileMemoryAccess(const GCNSubtarget &ST,
+                                      const MachineInstr &MI) {
   if (MI.getNumMemOperands() == 0)
     return false;
+  // If globally addressable scratch is not in use, we can assume any scratch
+  // opcode accesses thread-local memory, thus is NV=1.
+  bool GASEnabled = ST.isGloballyAddressableScratchEnabled();
+  if (!GASEnabled && ST.getInstrInfo()->isFLATScratch(MI.getOpcode()))
+    return true;
   return all_of(MI.memoperands(), [&](const MachineMemOperand *MMO) {
-    return MMO->getFlags() & (MOThreadPrivate | MachineMemOperand::MOInvariant);
+    // If globally addressable scratch is enabled, we can only set NV=1 by
+    // checking for the thread-private or invariant memory. If it is disabled,
+    // we can additionally consider private memory.
+    return (!GASEnabled && MMO->getAddrSpace() == AMDGPUAS::PRIVATE_ADDRESS) ||
+           (MMO->getFlags() &
+            (MOThreadPrivate | MachineMemOperand::MOInvariant));
   });
 }
 
@@ -2497,7 +2508,7 @@ bool SIMemoryLegalizer::run(MachineFunction &MF) {
           Changed |= expandAtomicCmpxchgOrRmw(*MOI, MI);
       }
 
-      if (isNonVolatileMemoryAccess(*MI))
+      if (isNonVolatileMemoryAccess(ST, *MI))
         Changed |= CC->handleNonVolatile(*MI);
     }
   }
diff --git a/llvm/test/CodeGen/AMDGPU/memory-legalizer-non-volatile.ll b/llvm/test/CodeGen/AMDGPU/memory-legalizer-non-volatile.ll
index ab12e3c19992d..5e3680f18ad7c 100644
--- a/llvm/test/CodeGen/AMDGPU/memory-legalizer-non-volatile.ll
+++ b/llvm/test/CodeGen/AMDGPU/memory-legalizer-non-volatile.ll
@@ -2,8 +2,10 @@
 ; RUN: llc -global-isel=0 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -mattr=+cumode < %s | FileCheck --check-prefixes=GFX12-CU,GFX12-CU-DAGISEL %s
 ; RUN: llc -global-isel=1 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -mattr=+cumode < %s | FileCheck --check-prefixes=GFX12-CU,GFX12-CU-GISEL %s
 
-; RUN: llc -global-isel=0 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1250 < %s | FileCheck --check-prefixes=GFX1250,GFX1250-DAGISEL %s
-; RUN: llc -global-isel=1 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1250 < %s | FileCheck --check-prefixes=GFX1250,GFX1250-GISEL %s
+; RUN: llc -global-isel=0 -mtriple=amdgcn-amd-amdhsa -mattr=+globally-addressable-scratch -mcpu=gfx1250 < %s | FileCheck --check-prefixes=GFX1250,GFX1250-GAS,GFX1250-GAS-DAGISEL %s
+; RUN: llc -global-isel=0 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1250 < %s                                      | FileCheck --check-prefixes=GFX1250,GFX1250-NOGAS,GFX1250-NOGAS-DAGISEL %s
+; RUN: llc -global-isel=1 -mtriple=amdgcn-amd-amdhsa -mattr=+globally-addressable-scratch -mcpu=gfx1250 < %s | FileCheck --check-prefixes=GFX1250,GFX1250-GAS,GFX1250-GAS-GISEL %s
+; RUN: llc -global-isel=1 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1250 < %s                                      | FileCheck --check-prefixes=GFX1250,GFX1250-NOGAS,GFX1250-NOGAS-GISEL %s
 
 define void @flat_i32_nonatomic(ptr addrspace(0) %in, ptr addrspace(0) %out) {
 ; GFX12-CU-LABEL: flat_i32_nonatomic:
@@ -174,14 +176,23 @@ define void @scratch_i32_nonatomic(ptr addrspace(5) %in, ptr addrspace(5) %out)
 ; GFX12-CU-NEXT:    scratch_store_b32 v1, v0, off
 ; GFX12-CU-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX1250-LABEL: scratch_i32_nonatomic:
-; GFX1250:       ; %bb.0: ; %entry
-; GFX1250-NEXT:    s_wait_loadcnt_dscnt 0x0
-; GFX1250-NEXT:    s_wait_kmcnt 0x0
-; GFX1250-NEXT:    scratch_load_b32 v0, v0, off
-; GFX1250-NEXT:    s_wait_loadcnt 0x0
-; GFX1250-NEXT:    scratch_store_b32 v1, v0, off
-; GFX1250-NEXT:    s_set_pc_i64 s[30:31]
+; GFX1250-GAS-LABEL: scratch_i32_nonatomic:
+; GFX1250-GAS:       ; %bb.0: ; %entry
+; GFX1250-GAS-NEXT:    s_wait_loadcnt_dscnt 0x0
+; GFX1250-GAS-NEXT:    s_wait_kmcnt 0x0
+; GFX1250-GAS-NEXT:    scratch_load_b32 v0, v0, off
+; GFX1250-GAS-NEXT:    s_wait_loadcnt 0x0
+; GFX1250-GAS-NEXT:    scratch_store_b32 v1, v0, off
+; GFX1250-GAS-NEXT:    s_set_pc_i64 s[30:31]
+;
+; GFX1250-NOGAS-LABEL: scratch_i32_nonatomic:
+; GFX1250-NOGAS:       ; %bb.0: ; %entry
+; GFX1250-NOGAS-NEXT:    s_wait_loadcnt_dscnt 0x0
+; GFX1250-NOGAS-NEXT:    s_wait_kmcnt 0x0
+; GFX1250-NOGAS-NEXT:    scratch_load_b32 v0, v0, off nv
+; GFX1250-NOGAS-NEXT:    s_wait_loadcnt 0x0
+; GFX1250-NOGAS-NEXT:    scratch_store_b32 v1, v0, off nv
+; GFX1250-NOGAS-NEXT:    s_set_pc_i64 s[30:31]
 entry:
   %val = load i32, ptr addrspace(5) %in
   store i32 %val, ptr addrspace(5) %out
@@ -303,33 +314,61 @@ define void @buffer_i32_nonatomic(ptr addrspace(7) inreg %in, ptr addrspace(7) i
 ; GFX12-CU-GISEL-NEXT:    buffer_store_b32 v0, v1, s[4:7], null offen
 ; GFX12-CU-GISEL-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX1250-DAGISEL-LABEL: buffer_i32_nonatomic:
-; GFX1250-DAGISEL:       ; %bb.0: ; %entry
-; GFX1250-DAGISEL-NEXT:    s_wait_loadcnt_dscnt 0x0
-; GFX1250-DAGISEL-NEXT:    s_wait_kmcnt 0x0
-; GFX1250-DAGISEL-NEXT:    v_dual_mov_b32 v0, s16 :: v_dual_mov_b32 v1, s21
-; GFX1250-DAGISEL-NEXT:    s_mov_b32 s7, s20
-; GFX1250-DAGISEL-NEXT:    s_mov_b32 s6, s19
-; GFX1250-DAGISEL-NEXT:    s_mov_b32 s5, s18
-; GFX1250-DAGISEL-NEXT:    buffer_load_b32 v0, v0, s[0:3], null offen
-; GFX1250-DAGISEL-NEXT:    s_mov_b32 s4, s17
-; GFX1250-DAGISEL-NEXT:    s_wait_loadcnt 0x0
-; GFX1250-DAGISEL-NEXT:    buffer_store_b32 v0, v1, s[4:7], null offen
-; GFX1250-DAGISEL-NEXT:    s_set_pc_i64 s[30:31]
+; GFX1250-GAS-DAGISEL-LABEL: buffer_i32_nonatomic:
+; GFX1250-GAS-DAGISEL:       ; %bb.0: ; %entry
+; GFX1250-GAS-DAGISEL-NEXT:    s_wait_loadcnt_dscnt 0x0
+; GFX1250-GAS-DAGISEL-NEXT:    s_wait_kmcnt 0x0
+; GFX1250-GAS-DAGISEL-NEXT:    v_dual_mov_b32 v0, s16 :: v_dual_mov_b32 v1, s21
+; GFX1250-GAS-DAGISEL-NEXT:    s_mov_b32 s7, s20
+; GFX1250-GAS-DAGISEL-NEXT:    s_mov_b32 s6, s19
+; GFX1250-GAS-DAGISEL-NEXT:    s_mov_b32 s5, s18
+; GFX1250-GAS-DAGISEL-NEXT:    buffer_load_b32 v0, v0, s[0:3], null offen
+; GFX1250-GAS-DAGISEL-NEXT:    s_mov_b32 s4, s17
+; GFX1250-GAS-DAGISEL-NEXT:    s_wait_loadcnt 0x0
+; GFX1250-GAS-DAGISEL-NEXT:    buffer_store_b32 v0, v1, s[4:7], null offen
+; GFX1250-GAS-DAGISEL-NEXT:    s_set_pc_i64 s[30:31]
+;
+; GFX1250-NOGAS-DAGISEL-LABEL: buffer_i32_nonatomic:
+; GFX1250-NOGAS-DAGISEL:       ; %bb.0: ; %entry
+; GFX1250-NOGAS-DAGISEL-NEXT:    s_wait_loadcnt_dscnt 0x0
+; GFX1250-NOGAS-DAGISEL-NEXT:    s_wait_kmcnt 0x0
+; GFX1250-NOGAS-DAGISEL-NEXT:    v_dual_mov_b32 v0, s16 :: v_dual_mov_b32 v1, s21
+; GFX1250-NOGAS-DAGISEL-NEXT:    s_mov_b32 s7, s20
+; GFX1250-NOGAS-DAGISEL-NEXT:    s_mov_b32 s6, s19
+; GFX1250-NOGAS-DAGISEL-NEXT:    s_mov_b32 s5, s18
+; GFX1250-NOGAS-DAGISEL-NEXT:    buffer_load_b32 v0, v0, s[0:3], null offen
+; GFX1250-NOGAS-DAGISEL-NEXT:    s_mov_b32 s4, s17
+; GFX1250-NOGAS-DAGISEL-NEXT:    s_wait_loadcnt 0x0
+; GFX1250-NOGAS-DAGISEL-NEXT:    buffer_store_b32 v0, v1, s[4:7], null offen
+; GFX1250-NOGAS-DAGISEL-NEXT:    s_set_pc_i64 s[30:31]
+;
+; GFX1250-GAS-GISEL-LABEL: buffer_i32_nonatomic:
+; GFX1250-GAS-GISEL:       ; %bb.0: ; %entry
+; GFX1250-GAS-GISEL-NEXT:    s_wait_loadcnt_dscnt 0x0
+; GFX1250-GAS-GISEL-NEXT:    s_wait_kmcnt 0x0
+; GFX1250-GAS-GISEL-NEXT:    v_dual_mov_b32 v0, s16 :: v_dual_mov_b32 v1, s21
+; GFX1250-GAS-GISEL-NEXT:    s_mov_b32 s4, s17
+; GFX1250-GAS-GISEL-NEXT:    s_mov_b32 s5, s18
+; GFX1250-GAS-GISEL-NEXT:    s_mov_b32 s6, s19
+; GFX1250-GAS-GISEL-NEXT:    buffer_load_b32 v0, v0, s[0:3], null offen
+; GFX1250-GAS-GISEL-NEXT:    s_mov_b32 s7, s20
+; GFX1250-GAS-GISEL-NEXT:    s_wait_loadcnt 0x0
+; GFX1250-GAS-GISEL-NEXT:    buffer_store_b32 v0, v1, s[4:7], null offen
+; GFX1250-GAS-GISEL-NEXT:    s_set_pc_i64 s[30:31]
 ;
-; GFX1250-GISEL-LABEL: buffer_i32_nonatomic:
-; GFX1250-GISEL:       ; %bb.0: ; %entry
-; GFX1250-GISEL-NEXT:    s_wait_loadcnt_dscnt 0x0
-; GFX1250-GISEL-NEXT:    s_wait_kmcnt 0x0
-; GFX1250-GISEL-NEXT:    v_dual_mov_b32 v0, s16 :: v_dual_mov_b32 v1, s21
-; GFX1250-GISEL-NEXT:    s_mov_b32 s4, s17
-; GFX1250-GISEL-NEXT:    s_mov_b32 s5, s18
-; GFX1250-GISEL-NEXT:    s_mov_b32 s6, s19
-; GFX1250-GISEL-NEXT:    buffer_load_b32 v0, v0, s[0:3], null offen
-; GFX1250-GISEL-NEXT:    s_mov_b32 s7, s20
-; GFX1250-GISEL-NEXT:    s_wait_loadcnt 0x0
-; GFX1250-GISEL-NEXT:    buffer_store_b32 v0, v1, s[4:7], null offen
-; GFX1250-GISEL-NEXT:    s_set_pc_i64 s[30:31]
+; GFX1250-NOGAS-GISEL-LABEL: buffer_i32_nonatomic:
+; GFX1250-NOGAS-GISEL:       ; %bb.0: ; %entry
+; GFX1250-NOGAS-GISEL-NEXT:    s_wait_loadcnt_dscnt 0x0
+; GFX1250-NOGAS-GISEL-NEXT:    s_wait_kmcnt 0x0
+; GFX1250-NOGAS-GISEL-NEXT:    v_dual_mov_b32 v0, s16 :: v_dual_mov_b32 v1, s21
+; GFX1250-NOGAS-GISEL-NEXT:    s_mov_b32 s4, s17
+; GFX1250-NOGAS-GISEL-NEXT:    s_mov_b32 s5, s18
+; GFX1250-NOGAS-GISEL-NEXT:    s_mov_b32 s6, s19
+; GFX1250-NOGAS-GISEL-NEXT:    buffer_load_b32 v0, v0, s[0:3], null offen
+; GFX1250-NOGAS-GISEL-NEXT:    s_mov_b32 s7, s20
+; GFX1250-NOGAS-GISEL-NEXT:    s_wait_loadcnt 0x0
+; GFX1250-NOGAS-GISEL-NEXT:    buffer_store_b32 v0, v1, s[4:7], null offen
+; GFX1250-NOGAS-GISEL-NEXT:    s_set_pc_i64 s[30:31]
 entry:
   %val = load i32, ptr addrspace(7) %in
   store i32 %val, ptr addrspace(7) %out
diff --git a/llvm/test/CodeGen/AMDGPU/memory-legalizer-non-volatile.mir b/llvm/test/CodeGen/AMDGPU/memory-legalizer-non-volatile.mir
new file mode 100644
index 0000000000000..ddd234f7e30dc
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/memory-legalizer-non-volatile.mir
@@ -0,0 +1,181 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 6
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -run-pass si-memory-legalizer  %s -o -                                      | FileCheck --check-prefixes=GFX1200
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1250 -mattr=+globally-addressable-scratch -run-pass si-memory-legalizer  %s -o - | FileCheck --check-prefixes=GFX1250,GFX1250-GAS
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1250 -run-pass si-memory-legalizer  %s -o -                                      | FileCheck --check-prefixes=GFX1250,GFX1250-NOGAS
+
+# Additional test cases to accompany the LLVM IR tests.
+# NOTE: NV bit = 5th bit set on CPol operand (values >=32).
+
+---
+
+name:            private_through_flat_inst
+body:             |
+  bb.0:
+    liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+
+    ; GFX1200-LABEL: name: private_through_flat_inst
+    ; GFX1200: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+    ; GFX1200-NEXT: {{  $}}
+    ; GFX1200-NEXT: renamable $vgpr2 = FLAT_LOAD_DWORD killed renamable $vgpr0_vgpr1, 0, 0, implicit $exec, implicit $flat_scr :: (load (s32) from `ptr addrspace(5) poison`, addrspace 5)
+    ; GFX1200-NEXT: FLAT_STORE_DWORD killed renamable $vgpr0_vgpr1, killed renamable $vgpr2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into `ptr addrspace(5) poison`, addrspace 5)
+    ; GFX1200-NEXT: S_ENDPGM 0
+    ;
+    ; GFX1250-GAS-LABEL: name: private_through_flat_inst
+    ; GFX1250-GAS: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+    ; GFX1250-GAS-NEXT: {{  $}}
+    ; GFX1250-GAS-NEXT: renamable $vgpr2 = FLAT_LOAD_DWORD killed renamable $vgpr0_vgpr1, 0, 0, implicit $exec, implicit $flat_scr :: (load (s32) from `ptr addrspace(5) poison`, addrspace 5)
+    ; GFX1250-GAS-NEXT: FLAT_STORE_DWORD killed renamable $vgpr0_vgpr1, killed renamable $vgpr2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into `ptr addrspace(5) poison`, addrspace 5)
+    ; GFX1250-GAS-NEXT: S_ENDPGM 0
+    ;
+    ; GFX1250-NOGAS-LABEL: name: private_through_flat_inst
+    ; GFX1250-NOGAS: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+    ; GFX1250-NOGAS-NEXT: {{  $}}
+    ; GFX1250-NOGAS-NEXT: renamable $vgpr2 = FLAT_LOAD_DWORD killed renamable $vgpr0_vgpr1, 0, 32, implicit $exec, implicit $flat_scr :: (load (s32) from `ptr addrspace(5) poison`, addrspace 5)
+    ; GFX1250-NOGAS-NEXT: FLAT_STORE_DWORD killed renamable $vgpr0_vgpr1, killed renamable $vgpr2, 0, 32, implicit $exec, implicit $flat_scr :: (store (s32) into `ptr addrspace(5) poison`, addrspace 5)
+    ; GFX1250-NOGAS-NEXT: S_ENDPGM 0
+    renamable $vgpr2 = FLAT_LOAD_DWORD killed renamable $vgpr0_vgpr1, 0, 0, implicit $exec, implicit $flat_scr :: (load (s32) from `ptr addrspace(5) poison`)
+    FLAT_STORE_DWORD killed renamable $vgpr0_vgpr1, killed renamable $vgpr2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into `ptr addrspace(5) poison`)
+    S_ENDPGM 0
+...
+
+---
+
+name:            private_and_generic_mmo
+body:             |
+  bb.0:
+    liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+
+    ; GFX1200-LABEL: name: private_and_generic_mmo
+    ; GFX1200: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+    ; GFX1200-NEXT: {{  $}}
+    ; GFX1200-NEXT: renamable $vgpr2 = FLAT_LOAD_DWORD killed renamable $vgpr0_vgpr1, 0, 0, implicit $exec, implicit $flat_scr :: (load (s32) from `ptr addrspace(5) poison`, addrspace 5), (load (s32) from `ptr poison`)
+    ; GFX1200-NEXT: FLAT_STORE_DWORD killed renamable $vgpr0_vgpr1, killed renamable $vgpr2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into `ptr addrspace(5) poison`, addrspace 5), (store (s32) into `ptr poison`)
+    ; GFX1200-NEXT: S_ENDPGM 0
+    ;
+    ; GFX1250-LABEL: name: private_and_generic_mmo
+    ; GFX1250: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+    ; GFX1250-NEXT: {{  $}}
+    ; GFX1250-NEXT: renamable $vgpr2 = FLAT_LOAD_DWORD killed renamable $vgpr0_vgpr1, 0, 0, implicit $exec, implicit $flat_scr :: (load (s32) from `ptr addrspace(5) poison`, addrspace 5), (load (s32) from `ptr poison`)
+    ; GFX1250-NEXT: FLAT_STORE_DWORD killed renamable $vgpr0_vgpr1, killed renamable $vgpr2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into `ptr addrspace(5) poison`, addrspace 5), (store (s32) into `ptr poison`)
+    ; GFX1250-NEXT: S_ENDPGM 0
+    renamable $vgpr2 = FLAT_LOAD_DWORD killed renamable $vgpr0_vgpr1, 0, 0, implicit $exec, implicit $flat_scr :: (load (s32) from `ptr addrspace(5) poison`), (load (s32) from `ptr poison`)
+    FLAT_STORE_DWORD killed renamable $vgpr0_vgpr1, killed renamable $vgpr2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into `ptr addrspace(5) poison`), (store (s32) into `ptr poison`)
+    S_ENDPGM 0
+...
+
+---
+
+name:            multiple_private_mmos
+body:             |
+  bb.0:
+    liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+
+    ; GFX1200-LABEL: name: multiple_private_mmos
+    ; GFX1200: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+    ; GFX1200-NEXT: {{  $}}
+    ; GFX1200-NEXT: renamable $vgpr2 = FLAT_LOAD_DWORD killed renamable $vgpr0_vgpr1, 0, 0, implicit $exec, implicit $flat_scr :: (load (s32) from `ptr addrspace(5) poison`, addrspace 5), (load (s32) from `ptr addrspace(5) poison`, addrspace 5)
+    ; GFX1200-NEXT: FLAT_STORE_DWORD killed renamable $vgpr0_vgpr1, killed renamable $vgpr2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into `ptr addrspace(5) poison`, addrspace 5), (store (s32) into `ptr addrspace(5) poison`, addrspace 5)
+    ; GFX1200-NEXT: S_ENDPGM 0
+    ;
+    ; GFX1250-GAS-LABEL: name: multiple_private_mmos
+    ; GFX1250-GAS: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+    ; GFX1250-GAS-NEXT: {{  $}}
+    ; GFX1250-GAS-NEXT: renamable $vgpr2 = FLAT_LOAD_DWORD killed renamable $vgpr0_vgpr1, 0, 0, implicit $exec, implicit $flat_scr :: (load (s32) from `ptr addrspace(5) poison`, addrspace 5), (load (s32) from `ptr addrspace(5) poison`, addrspace 5)
+    ; GFX1250-GAS-NEXT: FLAT_STORE_DWORD killed renamable $vgpr0_vgpr1, killed renamable $vgpr2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into `ptr addrspace(5) poison`, addrspace 5), (store (s32) into `ptr addrspace(5) poison`, addrspace 5)
+    ; GFX1250-GAS-NEXT: S_ENDPGM 0
+    ;
+    ; GFX1250-NOGAS-LABEL: name: multiple_private_mmos
+    ; GFX1250-NOGAS: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+    ; GFX1250-NOGAS-NEXT: {{  $}}
+    ; GFX1250-NOGAS-NEXT: renamable $vgpr2 = FLAT_LOAD_DWORD killed renamable $vgpr0_vgpr1, 0, 32, implicit $exec, implicit $flat_scr :: (load (s32) from `ptr addrspace(5) poison`, addrspace 5), (load (s32) from `ptr addrspace(5) poison`, addrspace 5)
+    ; GFX1250-NOGAS-NEXT: FLAT_STORE_DWORD killed renamable $vgpr0_vgpr1, killed renamable $vgpr2, 0, 32, implicit $exec, implicit $flat_scr :: (store (s32) into `ptr addrspace(5) poison`, addrspace 5), (store (s32) into `ptr addrspace(5) poison`, addrspace 5)
+    ; GFX1250-NOGAS-NEXT: S_ENDPGM 0
+    renamable $vgpr2 = FLAT_LOAD_DWORD killed renamable $vgpr0_vgpr1, 0, 0, implicit $exec, implicit $flat_scr :: (load (s32) from `ptr addrspace(5) poison`), (load (s32) from `ptr addrspace(5) poison`)
+    FLAT_STORE_DWORD killed renamable $vgpr0_vgpr1, killed renamable $vgpr2, 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into `ptr addrspace(5) poison`), (store (s32) into `ptr addrspace(5) po...
[truncated]

``````````
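
The MIR test notes that the NV bit shows up as the value 32 in the CPol operand, which is why the `GFX1250-NOGAS` check lines change that operand from `0` to `32`. A minimal sketch of testing and setting that bit follows; the names `kNVBit`, `hasNV` and `withNV` are hypothetical and chosen for this example only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant for illustration: bit 5 of the CPol operand
// (value 32), as described by the note in the MIR test.
constexpr std::uint32_t kNVBit = 1u << 5;

// True if the NV bit is set in the given CPol value.
constexpr bool hasNV(std::uint32_t CPol) { return (CPol & kNVBit) != 0; }

// Returns CPol with the NV bit set and all other bits unchanged.
constexpr std::uint32_t withNV(std::uint32_t CPol) { return CPol | kNVBit; }

static_assert(!hasNV(0), "CPol 0 has no NV bit");
static_assert(hasNV(32), "CPol 32 has the NV bit set");
```

A mask test is used rather than a `>= 32` comparison so that the result would stay correct if higher bits were ever set in the operand.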

</details>
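
The updated AMDGPUUsage text on scratch atomics for GFX125x can be read as a three-way decision. The sketch below only restates that documentation; the enum and function names are hypothetical and do not correspond to code in the patch.

```cpp
#include <cassert>

// Hypothetical outcome of handling an atomic in the scratch address space.
enum class ScratchAtomicLowering {
  FlatAtomic, // converted into an atomic in the generic (flat) address space
  NonAtomic,  // treated as a non-atomic operation
  Error       // unsupported, an error is emitted
};

// Restates the documented rule: the size is checked first, then GAS.
ScratchAtomicLowering classifyScratchAtomic(unsigned SizeInBits,
                                            bool GASEnabled) {
  if (SizeInBits > 32)
    return ScratchAtomicLowering::Error;
  return GASEnabled ? ScratchAtomicLowering::FlatAtomic
                    : ScratchAtomicLowering::NonAtomic;
}
```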


https://github.com/llvm/llvm-project/pull/189556

