[llvm] [AMDGPU] Add test for VALU hoisting from WWM region. NFC. (PR #123234)
Stanislav Mekhanoshin via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 17 03:01:59 PST 2025
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/123234
From 676cfcbf4b131436e4bdf47a6afa604e26d8936a Mon Sep 17 00:00:00 2001
From: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
Date: Thu, 16 Jan 2025 11:55:37 -0800
Subject: [PATCH 1/2] [AMDGPU] Fix printing hasInitWholeWave in mir
---
.../lib/Target/AMDGPU/SIMachineFunctionInfo.cpp | 2 +-
llvm/test/CodeGen/MIR/AMDGPU/init-whole.wave.ll | 17 +++++++++++++++++
2 files changed, 18 insertions(+), 1 deletion(-)
create mode 100644 llvm/test/CodeGen/MIR/AMDGPU/init-whole.wave.ll
diff --git a/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp b/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
index 169f1369fb5433..7de64bddf78846 100644
--- a/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
@@ -715,7 +715,7 @@ yaml::SIMachineFunctionInfo::SIMachineFunctionInfo(
ArgInfo(convertArgumentInfo(MFI.getArgInfo(), TRI)),
PSInputAddr(MFI.getPSInputAddr()), PSInputEnable(MFI.getPSInputEnable()),
MaxMemoryClusterDWords(MFI.getMaxMemoryClusterDWords()),
- Mode(MFI.getMode()) {
+ Mode(MFI.getMode()), HasInitWholeWave(MFI.hasInitWholeWave()) {
for (Register Reg : MFI.getSGPRSpillPhysVGPRs())
SpillPhysVGPRS.push_back(regToString(Reg, TRI));
diff --git a/llvm/test/CodeGen/MIR/AMDGPU/init-whole.wave.ll b/llvm/test/CodeGen/MIR/AMDGPU/init-whole.wave.ll
new file mode 100644
index 00000000000000..f3b8deff619181
--- /dev/null
+++ b/llvm/test/CodeGen/MIR/AMDGPU/init-whole.wave.ll
@@ -0,0 +1,17 @@
+; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx1100 -stop-after=finalize-isel < %s | FileCheck --check-prefix=GCN %s
+; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx1100 -stop-after=finalize-isel < %s | FileCheck --check-prefix=GCN %s
+
+; GCN-LABEL: name: init_wwm
+; GCN: hasInitWholeWave: true
+define void @init_wwm(ptr addrspace(1) inreg %p) {
+entry:
+ %entry_exec = call i1 @llvm.amdgcn.init.whole.wave()
+ br i1 %entry_exec, label %bb.1, label %bb.2
+
+bb.1:
+ store i32 1, ptr addrspace(1) %p
+ br label %bb.2
+
+bb.2:
+ ret void
+}
From 7501423b29230f37273094e1b15e8bca0fcc90bd Mon Sep 17 00:00:00 2001
From: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
Date: Thu, 16 Jan 2025 10:49:05 -0800
Subject: [PATCH 2/2] [AMDGPU] Add test for VALU hoisting from WWM region.
NFC.
The test demonstrates suboptimal VALU hoisting out of a WWM
region. As a result we get two WWM regions instead of one.
---
llvm/test/CodeGen/AMDGPU/licm-wwm.mir | 46 +++++++++++++++++++++++++++
1 file changed, 46 insertions(+)
create mode 100644 llvm/test/CodeGen/AMDGPU/licm-wwm.mir
diff --git a/llvm/test/CodeGen/AMDGPU/licm-wwm.mir b/llvm/test/CodeGen/AMDGPU/licm-wwm.mir
new file mode 100644
index 00000000000000..fc20674971a716
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/licm-wwm.mir
@@ -0,0 +1,46 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -run-pass=early-machinelicm,si-wqm -o - %s | FileCheck -check-prefix=GCN %s
+
+# Machine LICM may hoist an instruction out of a WWM region, which forces the SI-WQM
+# pass to create a second WWM region. This hoisting is unwanted.
+
+---
+name: licm_move_wwm
+tracksRegLiveness: true
+body: |
+ ; GCN-LABEL: name: licm_move_wwm
+ ; GCN: bb.0:
+ ; GCN-NEXT: successors: %bb.1(0x80000000)
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: [[ENTER_STRICT_WWM:%[0-9]+]]:sreg_32 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
+ ; GCN-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
+ ; GCN-NEXT: $exec_lo = EXIT_STRICT_WWM [[ENTER_STRICT_WWM]]
+ ; GCN-NEXT: S_BRANCH %bb.1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: bb.1:
+ ; GCN-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: [[ENTER_STRICT_WWM1:%[0-9]+]]:sreg_32 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
+ ; GCN-NEXT: [[V_READFIRSTLANE_B32_:%[0-9]+]]:sreg_32 = V_READFIRSTLANE_B32 [[V_MOV_B32_e32_]], implicit $exec
+ ; GCN-NEXT: $exec_lo = EXIT_STRICT_WWM [[ENTER_STRICT_WWM1]]
+ ; GCN-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY [[V_READFIRSTLANE_B32_]]
+ ; GCN-NEXT: $exec_lo = S_OR_B32 $exec_lo, [[COPY]], implicit-def $scc
+ ; GCN-NEXT: S_CBRANCH_EXECNZ %bb.1, implicit $exec
+ ; GCN-NEXT: S_BRANCH %bb.2
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: bb.2:
+ ; GCN-NEXT: S_ENDPGM 0
+ bb.0:
+ S_BRANCH %bb.1
+
+ bb.1:
+ %0:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
+ %1:sreg_32 = V_READFIRSTLANE_B32 killed %0:vgpr_32, implicit $exec
+ early-clobber %2:sreg_32 = STRICT_WWM killed %1:sreg_32, implicit $exec
+ $exec_lo = S_OR_B32 $exec_lo, %2, implicit-def $scc
+ S_CBRANCH_EXECNZ %bb.1, implicit $exec
+ S_BRANCH %bb.2
+
+ bb.2:
+ S_ENDPGM 0
+...
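
To make the "2 WWM regions instead of one" point concrete, here is a small standalone C++ sketch. It is a toy model only, not code from LLVM or from this patch: ToyInst, ToyBlock, countRegionsInBlock, countRegionsInFunction and hoist are hypothetical names invented for illustration. It assumes each maximal run of consecutive WWM instructions within a block needs its own ENTER_STRICT_WWM / EXIT_STRICT_WWM pair, which is what the CHECK lines above suggest.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction (hypothetical): records only whether the instruction
// has to execute in whole wave mode.
struct ToyInst {
  bool NeedsWWM;
};

using ToyBlock = std::vector<ToyInst>;

// Count maximal runs of consecutive WWM instructions in one block.
// Assumption: each run needs its own enter/exit pair.
std::size_t countRegionsInBlock(const ToyBlock &Block) {
  std::size_t Regions = 0;
  bool InRegion = false;
  for (const ToyInst &I : Block) {
    if (I.NeedsWWM && !InRegion)
      ++Regions;
    InRegion = I.NeedsWWM;
  }
  return Regions;
}

// Sum the per-block region counts. Regions never span blocks in this model.
std::size_t countRegionsInFunction(const std::vector<ToyBlock> &Blocks) {
  std::size_t Regions = 0;
  for (const ToyBlock &B : Blocks)
    Regions += countRegionsInBlock(B);
  return Regions;
}

// Move the instruction at Index out of From and append it to To, loosely
// mimicking a loop-invariant instruction being hoisted into a preheader.
void hoist(ToyBlock &From, std::size_t Index, ToyBlock &To) {
  assert(Index < From.size() && "instruction index out of range");
  To.push_back(From[Index]);
  From.erase(From.begin() + static_cast<std::ptrdiff_t>(Index));
}
```

Modelling bb.0 as one non-WWM instruction and bb.1 as two WWM instructions (the V_MOV_B32 and the V_READFIRSTLANE_B32) followed by three non-WWM ones, countRegionsInFunction gives 1. After hoist(Loop, 0, Preheader) it gives 2, which matches the two ENTER_STRICT_WWM instructions in the autogenerated checks.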