[llvm] [AMDGPU] Add test for VALU hoisting from WWM region. NFC. (PR #123234)

Stanislav Mekhanoshin via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 17 03:01:59 PST 2025


https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/123234

From 676cfcbf4b131436e4bdf47a6afa604e26d8936a Mon Sep 17 00:00:00 2001
From: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
Date: Thu, 16 Jan 2025 11:55:37 -0800
Subject: [PATCH 1/2] [AMDGPU] Fix printing hasInitWholeWave in mir

---
 .../lib/Target/AMDGPU/SIMachineFunctionInfo.cpp |  2 +-
 llvm/test/CodeGen/MIR/AMDGPU/init-whole.wave.ll | 17 +++++++++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/MIR/AMDGPU/init-whole.wave.ll

diff --git a/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp b/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
index 169f1369fb5433..7de64bddf78846 100644
--- a/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
@@ -715,7 +715,7 @@ yaml::SIMachineFunctionInfo::SIMachineFunctionInfo(
       ArgInfo(convertArgumentInfo(MFI.getArgInfo(), TRI)),
       PSInputAddr(MFI.getPSInputAddr()), PSInputEnable(MFI.getPSInputEnable()),
       MaxMemoryClusterDWords(MFI.getMaxMemoryClusterDWords()),
-      Mode(MFI.getMode()) {
+      Mode(MFI.getMode()), HasInitWholeWave(MFI.hasInitWholeWave()) {
   for (Register Reg : MFI.getSGPRSpillPhysVGPRs())
     SpillPhysVGPRS.push_back(regToString(Reg, TRI));
 
diff --git a/llvm/test/CodeGen/MIR/AMDGPU/init-whole.wave.ll b/llvm/test/CodeGen/MIR/AMDGPU/init-whole.wave.ll
new file mode 100644
index 00000000000000..f3b8deff619181
--- /dev/null
+++ b/llvm/test/CodeGen/MIR/AMDGPU/init-whole.wave.ll
@@ -0,0 +1,17 @@
+; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx1100 -stop-after=finalize-isel < %s | FileCheck --check-prefix=GCN %s
+; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx1100 -stop-after=finalize-isel < %s | FileCheck --check-prefix=GCN %s
+
+; GCN-LABEL: name: init_wwm
+; GCN: hasInitWholeWave: true
+define void @init_wwm(ptr addrspace(1) inreg %p) {
+entry:
+  %entry_exec = call i1 @llvm.amdgcn.init.whole.wave()
+  br i1 %entry_exec, label %bb.1, label %bb.2
+
+bb.1:
+  store i32 1, ptr addrspace(1) %p
+  br label %bb.2
+
+bb.2:
+  ret void
+}

From 7501423b29230f37273094e1b15e8bca0fcc90bd Mon Sep 17 00:00:00 2001
From: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
Date: Thu, 16 Jan 2025 10:49:05 -0800
Subject: [PATCH 2/2] [AMDGPU] Add test for VALU hoisting from WWM region.
 NFC.

The test demonstrates suboptimal hoisting of a VALU instruction out of a
WWM region. As a result, we get two WWM regions instead of one.
---
 llvm/test/CodeGen/AMDGPU/licm-wwm.mir | 46 +++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)
 create mode 100644 llvm/test/CodeGen/AMDGPU/licm-wwm.mir

diff --git a/llvm/test/CodeGen/AMDGPU/licm-wwm.mir b/llvm/test/CodeGen/AMDGPU/licm-wwm.mir
new file mode 100644
index 00000000000000..fc20674971a716
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/licm-wwm.mir
@@ -0,0 +1,46 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -run-pass=early-machinelicm,si-wqm -o - %s | FileCheck -check-prefix=GCN %s
+
+# Machine LICM may hoist an instruction out of a WWM region, which forces the SI-WQM pass
+# to create a second WWM region. This hoisting is unwanted.
+
+---
+name: licm_move_wwm
+tracksRegLiveness: true
+body:             |
+  ; GCN-LABEL: name: licm_move_wwm
+  ; GCN: bb.0:
+  ; GCN-NEXT:   successors: %bb.1(0x80000000)
+  ; GCN-NEXT: {{  $}}
+  ; GCN-NEXT:   [[ENTER_STRICT_WWM:%[0-9]+]]:sreg_32 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
+  ; GCN-NEXT:   [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
+  ; GCN-NEXT:   $exec_lo = EXIT_STRICT_WWM [[ENTER_STRICT_WWM]]
+  ; GCN-NEXT:   S_BRANCH %bb.1
+  ; GCN-NEXT: {{  $}}
+  ; GCN-NEXT: bb.1:
+  ; GCN-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; GCN-NEXT: {{  $}}
+  ; GCN-NEXT:   [[ENTER_STRICT_WWM1:%[0-9]+]]:sreg_32 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
+  ; GCN-NEXT:   [[V_READFIRSTLANE_B32_:%[0-9]+]]:sreg_32 = V_READFIRSTLANE_B32 [[V_MOV_B32_e32_]], implicit $exec
+  ; GCN-NEXT:   $exec_lo = EXIT_STRICT_WWM [[ENTER_STRICT_WWM1]]
+  ; GCN-NEXT:   [[COPY:%[0-9]+]]:sreg_32 = COPY [[V_READFIRSTLANE_B32_]]
+  ; GCN-NEXT:   $exec_lo = S_OR_B32 $exec_lo, [[COPY]], implicit-def $scc
+  ; GCN-NEXT:   S_CBRANCH_EXECNZ %bb.1, implicit $exec
+  ; GCN-NEXT:   S_BRANCH %bb.2
+  ; GCN-NEXT: {{  $}}
+  ; GCN-NEXT: bb.2:
+  ; GCN-NEXT:   S_ENDPGM 0
+  bb.0:
+    S_BRANCH %bb.1
+
+  bb.1:
+    %0:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
+    %1:sreg_32 = V_READFIRSTLANE_B32 killed %0:vgpr_32, implicit $exec
+    early-clobber %2:sreg_32 = STRICT_WWM killed %1:sreg_32, implicit $exec
+    $exec_lo = S_OR_B32 $exec_lo, %2, implicit-def $scc
+    S_CBRANCH_EXECNZ %bb.1, implicit $exec
+    S_BRANCH %bb.2
+
+  bb.2:
+    S_ENDPGM 0
+...


