[llvm] [AMDGPU] Set TGID_EN_X/Y/Z when cluster ID intrinsics are used (PR #159120)
Shilei Tian via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 16 09:27:00 PDT 2025
https://github.com/shiltian created https://github.com/llvm/llvm-project/pull/159120
Hardware initializes a single value in ttmp9 which is either the workgroup ID X or cluster ID X. Most of this patch is a refactoring to use a single `PreloadedValue` enumerator for this value, instead of two enumerators `WORKGROUP_ID_X` and `CLUSTER_ID_X` referring to the same value.
This makes it simpler to have a single attribute `amdgpu-no-workgroup-id-x` indicating that this value is not used, which in turn sets the TGID_EN_X bit appropriately to tell the hardware whether to initialize it.
All of the above applies to Y and Z similarly.
Fixes: LWPSCGFX13-568
Co-authored-by: Jay Foad <jay.foad at amd.com>
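
For readers skimming the patch, the attribute check it introduces can be sketched outside of LLVM as below. This is only an illustration under stated assumptions: `AttrSet`, `canSkipInput`, and `needsValue` are hypothetical names standing in for the `hasFnAttr`/`hasFnAttribute` queries in the diff, and a plain `std::set` replaces the real attribute storage. The idea is that each preloaded value carries up to two "no-use" attribute names, an empty name means "no paired attribute", and the value may be skipped only when every non-empty name is present.

```cpp
#include <algorithm>
#include <array>
#include <set>
#include <string_view>

// Hypothetical stand-in for the attribute set of a call site or function.
using AttrSet = std::set<std::string_view>;

// One row per preloaded value. An empty second entry means the value has no
// paired cluster ID attribute (mirrors the "" entries in the patch's tables).
using AttrPair = std::array<std::string_view, 2>;

constexpr AttrPair WorkGroupIDXAttrs = {"amdgpu-no-workgroup-id-x",
                                        "amdgpu-no-cluster-id-x"};
constexpr AttrPair DispatchPtrAttrs = {"amdgpu-no-dispatch-ptr", ""};

// True when every non-empty attribute name in the row is present, i.e. the
// callee is known to use neither view of the value and the copy can be skipped.
bool canSkipInput(const AttrSet &Attrs, const AttrPair &Names) {
  return std::all_of(Names.begin(), Names.end(), [&](std::string_view Name) {
    return Name.empty() || Attrs.count(Name) != 0;
  });
}

// The function-level decision is the negation: the value must be initialized
// if either attribute is missing. AlwaysNeeded models the IsKernel condition
// that the patch applies to the X component only.
bool needsValue(const AttrSet &Attrs, const AttrPair &Names,
                bool AlwaysNeeded = false) {
  return AlwaysNeeded || !canSkipInput(Attrs, Names);
}
```

With this sketch, a function carrying only `amdgpu-no-workgroup-id-x` still needs the X value because `amdgpu-no-cluster-id-x` is absent, which corresponds to the `||` of the two negated `hasFnAttribute` checks in SIMachineFunctionInfo.cpp.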
From 22a3dbd660001442a9691f3a93fd122f40d9972f Mon Sep 17 00:00:00 2001
From: Shilei Tian <i at tianshilei.me>
Date: Tue, 16 Sep 2025 12:20:39 -0400
Subject: [PATCH] [AMDGPU] Set TGID_EN_X/Y/Z when cluster ID intrinsics are
used
Hardware initializes a single value in ttmp9 which is either the workgroup ID X or cluster ID X. Most of this patch is a refactoring to use a single `PreloadedValue` enumerator for this value, instead of two enumerators `WORKGROUP_ID_X` and `CLUSTER_ID_X` referring to the same value.
This makes it simpler to have a single attribute `amdgpu-no-workgroup-id-x` indicating that this value is not used, which in turn sets the TGID_EN_X bit appropriately to tell the hardware whether to initialize it.
All of the above applies to Y and Z similarly.
Fixes: LWPSCGFX13-568
Co-authored-by: Jay Foad <jay.foad at amd.com>
---
llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp | 22 +-
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 26 +-
.../Target/AMDGPU/SIMachineFunctionInfo.cpp | 9 +-
.../GlobalISel/call-outgoing-stack-args.ll | 2 +-
.../GlobalISel/dereferenceable-declaration.ll | 2 +-
.../irtranslator-call-abi-attribute-hints.ll | 8 +-
.../GlobalISel/irtranslator-sibling-call.ll | 2 +-
.../amdhsa-kernarg-preload-num-sgprs.ll | 2 +-
.../CodeGen/AMDGPU/call-argument-types.ll | 2 +-
.../CodeGen/AMDGPU/call-defs-mode-register.ll | 2 +-
.../AMDGPU/call-preserved-registers.ll | 2 +-
.../CodeGen/AMDGPU/call-reqd-group-size.ll | 2 +-
llvm/test/CodeGen/AMDGPU/call-waitcnt.ll | 2 +-
llvm/test/CodeGen/AMDGPU/elf-notes.ll | 2 +-
.../AMDGPU/gfx11-user-sgpr-init16-bug.ll | 6 +-
llvm/test/CodeGen/AMDGPU/hsa.ll | 4 +-
.../CodeGen/AMDGPU/llvm.amdgcn.cluster.id.ll | 554 ++++++++++++++++++
.../CodeGen/AMDGPU/mul24-pass-ordering.ll | 2 +-
llvm/test/CodeGen/AMDGPU/non-entry-alloca.ll | 2 +-
...al-regcopy-and-spill-missed-at-regalloc.ll | 2 +-
.../AMDGPU/promote-constOffset-to-imm.ll | 2 +-
llvm/test/CodeGen/AMDGPU/sibling-call.ll | 2 +-
llvm/test/CodeGen/AMDGPU/spill-agpr.ll | 62 +-
...tack-pointer-offset-relative-frameindex.ll | 2 +-
llvm/test/CodeGen/AMDGPU/tid-kd-xnack-any.ll | 2 +-
llvm/test/CodeGen/AMDGPU/tid-kd-xnack-off.ll | 2 +-
llvm/test/CodeGen/AMDGPU/tid-kd-xnack-on.ll | 2 +-
.../AMDGPU/long-branch-reg-all-sgpr-used.ll | 2 +-
.../AMDGPU/machine-function-info-after-pei.ll | 2 +-
...ine-function-info-long-branch-reg-debug.ll | 2 +-
.../machine-function-info-long-branch-reg.ll | 2 +-
31 files changed, 652 insertions(+), 85 deletions(-)
create mode 100644 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cluster.id.ll
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
index 21255f691e4ad..7afaddea164f8 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
@@ -809,15 +809,15 @@ bool AMDGPUCallLowering::passSpecialInputs(MachineIRBuilder &MIRBuilder,
AMDGPUFunctionArgInfo::LDS_KERNEL_ID,
};
- static constexpr StringLiteral ImplicitAttrNames[] = {
- "amdgpu-no-dispatch-ptr",
- "amdgpu-no-queue-ptr",
- "amdgpu-no-implicitarg-ptr",
- "amdgpu-no-dispatch-id",
- "amdgpu-no-workgroup-id-x",
- "amdgpu-no-workgroup-id-y",
- "amdgpu-no-workgroup-id-z",
- "amdgpu-no-lds-kernel-id",
+ static constexpr StringLiteral ImplicitAttrNames[][2] = {
+ {"amdgpu-no-dispatch-ptr", ""},
+ {"amdgpu-no-queue-ptr", ""},
+ {"amdgpu-no-implicitarg-ptr", ""},
+ {"amdgpu-no-dispatch-id", ""},
+ {"amdgpu-no-workgroup-id-x", "amdgpu-no-cluster-id-x"},
+ {"amdgpu-no-workgroup-id-y", "amdgpu-no-cluster-id-y"},
+ {"amdgpu-no-workgroup-id-z", "amdgpu-no-cluster-id-z"},
+ {"amdgpu-no-lds-kernel-id", ""},
};
MachineRegisterInfo &MRI = MF.getRegInfo();
@@ -833,7 +833,9 @@ bool AMDGPUCallLowering::passSpecialInputs(MachineIRBuilder &MIRBuilder,
LLT ArgTy;
// If the callee does not use the attribute value, skip copying the value.
- if (Info.CB->hasFnAttr(ImplicitAttrNames[I++]))
+ if (all_of(ImplicitAttrNames[I++], [&](StringRef AttrName) {
+ return AttrName.empty() || Info.CB->hasFnAttr(AttrName);
+ }))
continue;
std::tie(OutgoingArg, ArgRC, ArgTy) =
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 015f8fe49ebcf..363717b017ef0 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -3793,21 +3793,23 @@ void SITargetLowering::passSpecialInputs(
// in the same location as the input.
// clang-format off
static constexpr std::pair<AMDGPUFunctionArgInfo::PreloadedValue,
- StringLiteral> ImplicitAttrs[] = {
- {AMDGPUFunctionArgInfo::DISPATCH_PTR, "amdgpu-no-dispatch-ptr"},
- {AMDGPUFunctionArgInfo::QUEUE_PTR, "amdgpu-no-queue-ptr" },
- {AMDGPUFunctionArgInfo::IMPLICIT_ARG_PTR, "amdgpu-no-implicitarg-ptr"},
- {AMDGPUFunctionArgInfo::DISPATCH_ID, "amdgpu-no-dispatch-id"},
- {AMDGPUFunctionArgInfo::WORKGROUP_ID_X, "amdgpu-no-workgroup-id-x"},
- {AMDGPUFunctionArgInfo::WORKGROUP_ID_Y,"amdgpu-no-workgroup-id-y"},
- {AMDGPUFunctionArgInfo::WORKGROUP_ID_Z,"amdgpu-no-workgroup-id-z"},
- {AMDGPUFunctionArgInfo::LDS_KERNEL_ID,"amdgpu-no-lds-kernel-id"},
- };
+ std::array<StringLiteral, 2>> ImplicitAttrs[] = {
+ {AMDGPUFunctionArgInfo::DISPATCH_PTR, {"amdgpu-no-dispatch-ptr", ""}},
+ {AMDGPUFunctionArgInfo::QUEUE_PTR, {"amdgpu-no-queue-ptr", ""}},
+ {AMDGPUFunctionArgInfo::IMPLICIT_ARG_PTR, {"amdgpu-no-implicitarg-ptr", ""}},
+ {AMDGPUFunctionArgInfo::DISPATCH_ID, {"amdgpu-no-dispatch-id", ""}},
+ {AMDGPUFunctionArgInfo::WORKGROUP_ID_X, {"amdgpu-no-workgroup-id-x", "amdgpu-no-cluster-id-x"}},
+ {AMDGPUFunctionArgInfo::WORKGROUP_ID_Y, {"amdgpu-no-workgroup-id-y", "amdgpu-no-cluster-id-y"}},
+ {AMDGPUFunctionArgInfo::WORKGROUP_ID_Z, {"amdgpu-no-workgroup-id-z", "amdgpu-no-cluster-id-z"}},
+ {AMDGPUFunctionArgInfo::LDS_KERNEL_ID, {"amdgpu-no-lds-kernel-id", ""}},
+ };
// clang-format on
- for (auto [InputID, Attr] : ImplicitAttrs) {
+ for (auto [InputID, Attrs] : ImplicitAttrs) {
// If the callee does not use the attribute value, skip copying the value.
- if (CLI.CB->hasFnAttr(Attr))
+ if (all_of(Attrs, [&](StringRef Attr) {
+ return Attr.empty() || CLI.CB->hasFnAttr(Attr);
+ }))
continue;
const auto [OutgoingArg, ArgRC, ArgTy] =
diff --git a/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp b/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
index 1f11be475e9f8..908d856d386f5 100644
--- a/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
@@ -132,13 +132,16 @@ SIMachineFunctionInfo::SIMachineFunctionInfo(const Function &F,
if (!AMDGPU::isGraphics(CC) ||
((CC == CallingConv::AMDGPU_CS || CC == CallingConv::AMDGPU_Gfx) &&
ST.hasArchitectedSGPRs())) {
- if (IsKernel || !F.hasFnAttribute("amdgpu-no-workgroup-id-x"))
+ if (IsKernel || !F.hasFnAttribute("amdgpu-no-workgroup-id-x") ||
+ !F.hasFnAttribute("amdgpu-no-cluster-id-x"))
WorkGroupIDX = true;
- if (!F.hasFnAttribute("amdgpu-no-workgroup-id-y"))
+ if (!F.hasFnAttribute("amdgpu-no-workgroup-id-y") ||
+ !F.hasFnAttribute("amdgpu-no-cluster-id-y"))
WorkGroupIDY = true;
- if (!F.hasFnAttribute("amdgpu-no-workgroup-id-z"))
+ if (!F.hasFnAttribute("amdgpu-no-workgroup-id-z") ||
+ !F.hasFnAttribute("amdgpu-no-cluster-id-z"))
WorkGroupIDZ = true;
}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll
index 679d4a26d26b2..c16c8e2128c72 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll
@@ -430,5 +430,5 @@ define void @func_caller_byval(ptr addrspace(5) %argptr) {
declare void @llvm.memset.p5.i32(ptr addrspace(5) nocapture writeonly, i8, i32, i1 immarg) #1
-attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
attributes #1 = { argmemonly nofree nounwind willreturn writeonly }
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/dereferenceable-declaration.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/dereferenceable-declaration.ll
index 13828c2d8a6a0..c92e5c5f8bfff 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/dereferenceable-declaration.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/dereferenceable-declaration.ll
@@ -232,4 +232,4 @@ define i64 @load_deref_or_null_maxmimum_callsite_declaration_only() {
ret i64 %add
}
-attributes #0 = { "amdgpu-no-completion-action" "amdgpu-no-default-queue" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { "amdgpu-no-completion-action" "amdgpu-no-default-queue" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call-abi-attribute-hints.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call-abi-attribute-hints.ll
index bbbce9a0719ab..1bf2a589cb597 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call-abi-attribute-hints.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call-abi-attribute-hints.ll
@@ -88,7 +88,7 @@ define amdgpu_kernel void @kernel_call_no_workgroup_ids() {
; CHECK-NEXT: $sgpr30_sgpr31 = noconvergent G_SI_CALL [[GV]](p0), @extern, csr_amdgpu, implicit $sgpr0_sgpr1_sgpr2_sgpr3, implicit $sgpr4_sgpr5, implicit $sgpr6_sgpr7, implicit $sgpr8_sgpr9, implicit $sgpr10_sgpr11, implicit $sgpr15, implicit $vgpr31
; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $scc
; CHECK-NEXT: S_ENDPGM 0
- call void @extern() "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z"
+ call void @extern() "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z"
ret void
}
@@ -124,7 +124,7 @@ define amdgpu_kernel void @kernel_call_no_other_sgprs() {
; CHECK-NEXT: $sgpr30_sgpr31 = noconvergent G_SI_CALL [[GV]](p0), @extern, csr_amdgpu, implicit $sgpr0_sgpr1_sgpr2_sgpr3, implicit $sgpr8_sgpr9, implicit $sgpr15, implicit $vgpr31
; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $scc
; CHECK-NEXT: S_ENDPGM 0
- call void @extern() "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z"
+ call void @extern() "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z"
ret void
}
@@ -198,7 +198,7 @@ define void @func_call_no_workgroup_ids() {
; CHECK-NEXT: $sgpr30_sgpr31 = noconvergent G_SI_CALL [[GV]](p0), @extern, csr_amdgpu, implicit $sgpr0_sgpr1_sgpr2_sgpr3, implicit $sgpr4_sgpr5, implicit $sgpr6_sgpr7, implicit $sgpr8_sgpr9, implicit $sgpr10_sgpr11, implicit $sgpr15, implicit $vgpr31
; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $scc
; CHECK-NEXT: SI_RETURN
- call void @extern() "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z"
+ call void @extern() "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z"
ret void
}
@@ -223,7 +223,7 @@ define void @func_call_no_other_sgprs() {
; CHECK-NEXT: $sgpr30_sgpr31 = noconvergent G_SI_CALL [[GV]](p0), @extern, csr_amdgpu, implicit $sgpr0_sgpr1_sgpr2_sgpr3, implicit $sgpr8_sgpr9, implicit $sgpr15, implicit $vgpr31
; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def $scc
; CHECK-NEXT: SI_RETURN
- call void @extern() "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z"
+ call void @extern() "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z"
ret void
}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-sibling-call.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-sibling-call.ll
index 97c3e903c9aec..7b2e3bf13c368 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-sibling-call.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-sibling-call.ll
@@ -1486,5 +1486,5 @@ entry:
}
attributes #0 = { nounwind }
-attributes #1 = { nounwind noinline "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #1 = { nounwind noinline "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
diff --git a/llvm/test/CodeGen/AMDGPU/amdhsa-kernarg-preload-num-sgprs.ll b/llvm/test/CodeGen/AMDGPU/amdhsa-kernarg-preload-num-sgprs.ll
index dd760c2a215ca..a160cdc950eb5 100644
--- a/llvm/test/CodeGen/AMDGPU/amdhsa-kernarg-preload-num-sgprs.ll
+++ b/llvm/test/CodeGen/AMDGPU/amdhsa-kernarg-preload-num-sgprs.ll
@@ -70,4 +70,4 @@ define amdgpu_kernel void @amdhsa_kernarg_preload_1_implicit_2(i32 inreg) #0 { r
define amdgpu_kernel void @amdhsa_kernarg_preload_0_implicit_2(i32) #0 { ret void }
-attributes #0 = { "amdgpu-agpr-alloc"="0" "amdgpu-no-completion-action" "amdgpu-no-default-queue" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" "uniform-work-group-size"="false" }
+attributes #0 = { "amdgpu-agpr-alloc"="0" "amdgpu-no-completion-action" "amdgpu-no-default-queue" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" "uniform-work-group-size"="false" }
diff --git a/llvm/test/CodeGen/AMDGPU/call-argument-types.ll b/llvm/test/CodeGen/AMDGPU/call-argument-types.ll
index b8dd377377dab..306fe33bfb7ac 100644
--- a/llvm/test/CodeGen/AMDGPU/call-argument-types.ll
+++ b/llvm/test/CodeGen/AMDGPU/call-argument-types.ll
@@ -7283,7 +7283,7 @@ declare hidden void @external_void_func_12xv3f32(<3 x float>, <3 x float>, <3 x
declare hidden void @external_void_func_8xv5f32(<5 x float>, <5 x float>, <5 x float>, <5 x float>,
<5 x float>, <5 x float>, <5 x float>, <5 x float>) #0
-attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
attributes #1 = { nounwind readnone }
attributes #2 = { nounwind noinline }
diff --git a/llvm/test/CodeGen/AMDGPU/call-defs-mode-register.ll b/llvm/test/CodeGen/AMDGPU/call-defs-mode-register.ll
index 0c4974f347a8f..ffe536d347c53 100644
--- a/llvm/test/CodeGen/AMDGPU/call-defs-mode-register.ll
+++ b/llvm/test/CodeGen/AMDGPU/call-defs-mode-register.ll
@@ -54,4 +54,4 @@ define float @asm_changes_mode(float %x, float %y) #0 {
declare float @llvm.experimental.constrained.fadd.f32(float, float, metadata, metadata)
-attributes #0 = { strictfp "amdgpu-no-completion-action" "amdgpu-no-default-queue" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { strictfp "amdgpu-no-completion-action" "amdgpu-no-default-queue" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
diff --git a/llvm/test/CodeGen/AMDGPU/call-preserved-registers.ll b/llvm/test/CodeGen/AMDGPU/call-preserved-registers.ll
index 69ad8e96c7c5d..61a195f9c314f 100644
--- a/llvm/test/CodeGen/AMDGPU/call-preserved-registers.ll
+++ b/llvm/test/CodeGen/AMDGPU/call-preserved-registers.ll
@@ -747,4 +747,4 @@ define amdgpu_kernel void @callee_saved_sgpr_vgpr_kernel() #2 {
attributes #0 = { nounwind }
attributes #1 = { nounwind readnone }
attributes #2 = { nounwind noinline }
-attributes #3 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #3 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
diff --git a/llvm/test/CodeGen/AMDGPU/call-reqd-group-size.ll b/llvm/test/CodeGen/AMDGPU/call-reqd-group-size.ll
index 093ca55698fe3..33eb8c1e8f4f5 100644
--- a/llvm/test/CodeGen/AMDGPU/call-reqd-group-size.ll
+++ b/llvm/test/CodeGen/AMDGPU/call-reqd-group-size.ll
@@ -208,7 +208,7 @@ define amdgpu_kernel void @known_xyz_0(ptr addrspace(1) %out) !reqd_work_group_s
}
; CHECK: .amdhsa_system_vgpr_workitem_id 0
-attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" }
+attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" }
!0 = !{i32 1, i32 64, i32 64}
!1 = !{i32 64, i32 1, i32 64}
diff --git a/llvm/test/CodeGen/AMDGPU/call-waitcnt.ll b/llvm/test/CodeGen/AMDGPU/call-waitcnt.ll
index 9abb50651146a..675acd0eedfc5 100644
--- a/llvm/test/CodeGen/AMDGPU/call-waitcnt.ll
+++ b/llvm/test/CodeGen/AMDGPU/call-waitcnt.ll
@@ -154,5 +154,5 @@ declare hidden void @func(i32) #0
declare hidden i32 @func.return(i32) #0
declare void @got.func(i32) #0
-attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
diff --git a/llvm/test/CodeGen/AMDGPU/elf-notes.ll b/llvm/test/CodeGen/AMDGPU/elf-notes.ll
index 22d699a8f4809..d0dec1f1fe7e4 100644
--- a/llvm/test/CodeGen/AMDGPU/elf-notes.ll
+++ b/llvm/test/CodeGen/AMDGPU/elf-notes.ll
@@ -85,7 +85,7 @@ define amdgpu_kernel void @elf_notes() #0 {
ret void
}
-attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
!llvm.module.flags = !{!0}
!0 = !{i32 1, !"amdhsa_code_object_version", i32 400}
diff --git a/llvm/test/CodeGen/AMDGPU/gfx11-user-sgpr-init16-bug.ll b/llvm/test/CodeGen/AMDGPU/gfx11-user-sgpr-init16-bug.ll
index 63376def3d7e1..fa8fdbaeacf41 100644
--- a/llvm/test/CodeGen/AMDGPU/gfx11-user-sgpr-init16-bug.ll
+++ b/llvm/test/CodeGen/AMDGPU/gfx11-user-sgpr-init16-bug.ll
@@ -200,8 +200,8 @@ declare align 4 ptr addrspace(4) @llvm.amdgcn.queue.ptr() #3
declare align 4 ptr addrspace(4) @llvm.amdgcn.kernarg.segment.ptr() #3
declare i64 @llvm.amdgcn.dispatch.id() #3
-attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
-attributes #1 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
-attributes #2 = { "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-workgroup-id-x" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #1 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #2 = { "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
attributes #3 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
diff --git a/llvm/test/CodeGen/AMDGPU/hsa.ll b/llvm/test/CodeGen/AMDGPU/hsa.ll
index 024593c49dba1..e3c383c697030 100644
--- a/llvm/test/CodeGen/AMDGPU/hsa.ll
+++ b/llvm/test/CodeGen/AMDGPU/hsa.ll
@@ -43,7 +43,7 @@
; ELF: 00E0: 6E616D65 A673696D 706C65BB 2E707269
; ELF: 00F0: 76617465 5F736567 6D656E74 5F666978
; ELF: 0100: 65645F73 697A6500 AB2E7367 70725F63
-; ELF: 0110: 6F756E74 0EB12E73 6770725F 7370696C
+; ELF: 0110: 6F756E74 10B12E73 6770725F 7370696C
; ELF: 0120: 6C5F636F 756E7400 A72E7379 6D626F6C
; ELF: 0130: A973696D 706C652E 6B64AB2E 76677072
; ELF: 0140: 5F636F75 6E7403B1 2E766770 725F7370
@@ -59,7 +59,7 @@
; ELF: 01E0: 73696D70 6C655F6E 6F5F6B65 726E6172
; ELF: 01F0: 6773BB2E 70726976 6174655F 7365676D
; ELF: 0200: 656E745F 66697865 645F7369 7A6500AB
-; ELF: 0210: 2E736770 725F636F 756E740C B12E7367
+; ELF: 0210: 2E736770 725F636F 756E740E B12E7367
; ELF: 0220: 70725F73 70696C6C 5F636F75 6E7400A7
; ELF: 0230: 2E73796D 626F6CB5 73696D70 6C655F6E
; ELF: 0240: 6F5F6B65 726E6172 67732E6B 64AB2E76
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cluster.id.ll b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cluster.id.ll
new file mode 100644
index 0000000000000..90fcb5191c353
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cluster.id.ll
@@ -0,0 +1,554 @@
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -passes=amdgpu-attributor %s -o %t.bc
+; RUN: llc -mtriple=amdgcn -mcpu=gfx1250 %t.bc -o - | FileCheck --check-prefixes=CHECK-UNKNOWN %s
+; RUN: llc -mtriple=amdgcn-unknown-mesa3d -mcpu=gfx1250 %t.bc -o - | FileCheck -check-prefixes=CHECK-MESA3D %s
+; RUN: llc -global-isel -mtriple=amdgcn -mcpu=gfx1250 %t.bc -o - | FileCheck --check-prefixes=CHECK-G-UNKNOWN %s
+; RUN: llc -global-isel -mtriple=amdgcn-unknown-mesa3d -mcpu=gfx1250 %t.bc -o - | FileCheck -check-prefixes=CHECK-G-MESA3D %s
+
+declare i32 @llvm.amdgcn.cluster.id.x() #0
+declare i32 @llvm.amdgcn.cluster.id.y() #0
+declare i32 @llvm.amdgcn.cluster.id.z() #0
+
+define amdgpu_kernel void @test_cluster_id_x(ptr addrspace(1) %out) {
+; CHECK-UNKNOWN-LABEL: test_cluster_id_x:
+; CHECK-UNKNOWN: ; %bb.0:
+; CHECK-UNKNOWN-NEXT: s_load_b64 s[0:1], s[0:1], 0x24
+; CHECK-UNKNOWN-NEXT: v_dual_mov_b32 v0, ttmp9 :: v_dual_mov_b32 v1, 0
+; CHECK-UNKNOWN-NEXT: s_wait_kmcnt 0x0
+; CHECK-UNKNOWN-NEXT: global_store_b32 v1, v0, s[0:1]
+; CHECK-UNKNOWN-NEXT: s_endpgm
+; CHECK-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_X_EN: 1
+; CHECK-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
+; CHECK-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
+;
+; CHECK-MESA3D-LABEL: test_cluster_id_x:
+; CHECK-MESA3D: .amd_kernel_code_t
+; CHECK-MESA3D-NEXT: amd_code_version_major = 1
+; CHECK-MESA3D-NEXT: amd_code_version_minor = 2
+; CHECK-MESA3D-NEXT: amd_machine_kind = 1
+; CHECK-MESA3D-NEXT: amd_machine_version_major = 12
+; CHECK-MESA3D-NEXT: amd_machine_version_minor = 5
+; CHECK-MESA3D-NEXT: amd_machine_version_stepping = 0
+; CHECK-MESA3D-NEXT: kernel_code_entry_byte_offset = 256
+; CHECK-MESA3D-NEXT: kernel_code_prefetch_byte_size = 0
+; CHECK-MESA3D-NEXT: granulated_workitem_vgpr_count = 0
+; CHECK-MESA3D-NEXT: granulated_wavefront_sgpr_count = 0
+; CHECK-MESA3D-NEXT: priority = 0
+; CHECK-MESA3D-NEXT: float_mode = 240
+; CHECK-MESA3D-NEXT: priv = 0
+; CHECK-MESA3D-NEXT: enable_dx10_clamp = 0
+; CHECK-MESA3D-NEXT: debug_mode = 0
+; CHECK-MESA3D-NEXT: enable_ieee_mode = 0
+; CHECK-MESA3D-NEXT: enable_wgp_mode = 0
+; CHECK-MESA3D-NEXT: enable_mem_ordered = 1
+; CHECK-MESA3D-NEXT: enable_fwd_progress = 1
+; CHECK-MESA3D-NEXT: enable_sgpr_private_segment_wave_byte_offset = 0
+; CHECK-MESA3D-NEXT: user_sgpr_count = 2
+; CHECK-MESA3D-NEXT: enable_trap_handler = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_workgroup_id_x = 1
+; CHECK-MESA3D-NEXT: enable_sgpr_workgroup_id_y = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_workgroup_id_z = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_workgroup_info = 0
+; CHECK-MESA3D-NEXT: enable_vgpr_workitem_id = 0
+; CHECK-MESA3D-NEXT: enable_exception_msb = 0
+; CHECK-MESA3D-NEXT: granulated_lds_size = 0
+; CHECK-MESA3D-NEXT: enable_exception = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_private_segment_buffer = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_dispatch_ptr = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_queue_ptr = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_kernarg_segment_ptr = 1
+; CHECK-MESA3D-NEXT: enable_sgpr_dispatch_id = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_flat_scratch_init = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_private_segment_size = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_x = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_y = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_z = 0
+; CHECK-MESA3D-NEXT: enable_wavefront_size32 = 1
+; CHECK-MESA3D-NEXT: enable_ordered_append_gds = 0
+; CHECK-MESA3D-NEXT: private_element_size = 1
+; CHECK-MESA3D-NEXT: is_ptr64 = 1
+; CHECK-MESA3D-NEXT: is_dynamic_callstack = 0
+; CHECK-MESA3D-NEXT: is_debug_enabled = 0
+; CHECK-MESA3D-NEXT: is_xnack_enabled = 0
+; CHECK-MESA3D-NEXT: workitem_private_segment_byte_size = 0
+; CHECK-MESA3D-NEXT: workgroup_group_segment_byte_size = 0
+; CHECK-MESA3D-NEXT: gds_segment_byte_size = 0
+; CHECK-MESA3D-NEXT: kernarg_segment_byte_size = 8
+; CHECK-MESA3D-NEXT: workgroup_fbarrier_count = 0
+; CHECK-MESA3D-NEXT: wavefront_sgpr_count = 2
+; CHECK-MESA3D-NEXT: workitem_vgpr_count = 2
+; CHECK-MESA3D-NEXT: reserved_vgpr_first = 0
+; CHECK-MESA3D-NEXT: reserved_vgpr_count = 0
+; CHECK-MESA3D-NEXT: reserved_sgpr_first = 0
+; CHECK-MESA3D-NEXT: reserved_sgpr_count = 0
+; CHECK-MESA3D-NEXT: debug_wavefront_private_segment_offset_sgpr = 0
+; CHECK-MESA3D-NEXT: debug_private_segment_buffer_sgpr = 0
+; CHECK-MESA3D-NEXT: kernarg_segment_alignment = 4
+; CHECK-MESA3D-NEXT: group_segment_alignment = 4
+; CHECK-MESA3D-NEXT: private_segment_alignment = 4
+; CHECK-MESA3D-NEXT: wavefront_size = 5
+; CHECK-MESA3D-NEXT: call_convention = -1
+; CHECK-MESA3D-NEXT: runtime_loader_kernel_symbol = 0
+; CHECK-MESA3D-NEXT: .end_amd_kernel_code_t
+; CHECK-MESA3D-NEXT: ; %bb.0:
+; CHECK-MESA3D-NEXT: s_load_b64 s[0:1], s[0:1], 0x0
+; CHECK-MESA3D-NEXT: v_dual_mov_b32 v0, ttmp9 :: v_dual_mov_b32 v1, 0
+; CHECK-MESA3D-NEXT: s_wait_kmcnt 0x0
+; CHECK-MESA3D-NEXT: global_store_b32 v1, v0, s[0:1]
+; CHECK-MESA3D-NEXT: s_endpgm
+;
+; CHECK-G-UNKNOWN-LABEL: test_cluster_id_x:
+; CHECK-G-UNKNOWN: ; %bb.0:
+; CHECK-G-UNKNOWN-NEXT: s_load_b64 s[0:1], s[0:1], 0x24
+; CHECK-G-UNKNOWN-NEXT: v_dual_mov_b32 v0, ttmp9 :: v_dual_mov_b32 v1, 0
+; CHECK-G-UNKNOWN-NEXT: s_wait_kmcnt 0x0
+; CHECK-G-UNKNOWN-NEXT: global_store_b32 v1, v0, s[0:1]
+; CHECK-G-UNKNOWN-NEXT: s_endpgm
+; CHECK-G-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_X_EN: 1
+; CHECK-G-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
+; CHECK-G-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
+;
+; CHECK-G-MESA3D-LABEL: test_cluster_id_x:
+; CHECK-G-MESA3D: .amd_kernel_code_t
+; CHECK-G-MESA3D-NEXT: amd_code_version_major = 1
+; CHECK-G-MESA3D-NEXT: amd_code_version_minor = 2
+; CHECK-G-MESA3D-NEXT: amd_machine_kind = 1
+; CHECK-G-MESA3D-NEXT: amd_machine_version_major = 12
+; CHECK-G-MESA3D-NEXT: amd_machine_version_minor = 5
+; CHECK-G-MESA3D-NEXT: amd_machine_version_stepping = 0
+; CHECK-G-MESA3D-NEXT: kernel_code_entry_byte_offset = 256
+; CHECK-G-MESA3D-NEXT: kernel_code_prefetch_byte_size = 0
+; CHECK-G-MESA3D-NEXT: granulated_workitem_vgpr_count = 0
+; CHECK-G-MESA3D-NEXT: granulated_wavefront_sgpr_count = 0
+; CHECK-G-MESA3D-NEXT: priority = 0
+; CHECK-G-MESA3D-NEXT: float_mode = 240
+; CHECK-G-MESA3D-NEXT: priv = 0
+; CHECK-G-MESA3D-NEXT: enable_dx10_clamp = 0
+; CHECK-G-MESA3D-NEXT: debug_mode = 0
+; CHECK-G-MESA3D-NEXT: enable_ieee_mode = 0
+; CHECK-G-MESA3D-NEXT: enable_wgp_mode = 0
+; CHECK-G-MESA3D-NEXT: enable_mem_ordered = 1
+; CHECK-G-MESA3D-NEXT: enable_fwd_progress = 1
+; CHECK-G-MESA3D-NEXT: enable_sgpr_private_segment_wave_byte_offset = 0
+; CHECK-G-MESA3D-NEXT: user_sgpr_count = 2
+; CHECK-G-MESA3D-NEXT: enable_trap_handler = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_workgroup_id_x = 1
+; CHECK-G-MESA3D-NEXT: enable_sgpr_workgroup_id_y = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_workgroup_id_z = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_workgroup_info = 0
+; CHECK-G-MESA3D-NEXT: enable_vgpr_workitem_id = 0
+; CHECK-G-MESA3D-NEXT: enable_exception_msb = 0
+; CHECK-G-MESA3D-NEXT: granulated_lds_size = 0
+; CHECK-G-MESA3D-NEXT: enable_exception = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_private_segment_buffer = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_dispatch_ptr = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_queue_ptr = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_kernarg_segment_ptr = 1
+; CHECK-G-MESA3D-NEXT: enable_sgpr_dispatch_id = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_flat_scratch_init = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_private_segment_size = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_x = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_y = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_z = 0
+; CHECK-G-MESA3D-NEXT: enable_wavefront_size32 = 1
+; CHECK-G-MESA3D-NEXT: enable_ordered_append_gds = 0
+; CHECK-G-MESA3D-NEXT: private_element_size = 1
+; CHECK-G-MESA3D-NEXT: is_ptr64 = 1
+; CHECK-G-MESA3D-NEXT: is_dynamic_callstack = 0
+; CHECK-G-MESA3D-NEXT: is_debug_enabled = 0
+; CHECK-G-MESA3D-NEXT: is_xnack_enabled = 0
+; CHECK-G-MESA3D-NEXT: workitem_private_segment_byte_size = 0
+; CHECK-G-MESA3D-NEXT: workgroup_group_segment_byte_size = 0
+; CHECK-G-MESA3D-NEXT: gds_segment_byte_size = 0
+; CHECK-G-MESA3D-NEXT: kernarg_segment_byte_size = 8
+; CHECK-G-MESA3D-NEXT: workgroup_fbarrier_count = 0
+; CHECK-G-MESA3D-NEXT: wavefront_sgpr_count = 2
+; CHECK-G-MESA3D-NEXT: workitem_vgpr_count = 2
+; CHECK-G-MESA3D-NEXT: reserved_vgpr_first = 0
+; CHECK-G-MESA3D-NEXT: reserved_vgpr_count = 0
+; CHECK-G-MESA3D-NEXT: reserved_sgpr_first = 0
+; CHECK-G-MESA3D-NEXT: reserved_sgpr_count = 0
+; CHECK-G-MESA3D-NEXT: debug_wavefront_private_segment_offset_sgpr = 0
+; CHECK-G-MESA3D-NEXT: debug_private_segment_buffer_sgpr = 0
+; CHECK-G-MESA3D-NEXT: kernarg_segment_alignment = 4
+; CHECK-G-MESA3D-NEXT: group_segment_alignment = 4
+; CHECK-G-MESA3D-NEXT: private_segment_alignment = 4
+; CHECK-G-MESA3D-NEXT: wavefront_size = 5
+; CHECK-G-MESA3D-NEXT: call_convention = -1
+; CHECK-G-MESA3D-NEXT: runtime_loader_kernel_symbol = 0
+; CHECK-G-MESA3D-NEXT: .end_amd_kernel_code_t
+; CHECK-G-MESA3D-NEXT: ; %bb.0:
+; CHECK-G-MESA3D-NEXT: s_load_b64 s[0:1], s[0:1], 0x0
+; CHECK-G-MESA3D-NEXT: v_dual_mov_b32 v0, ttmp9 :: v_dual_mov_b32 v1, 0
+; CHECK-G-MESA3D-NEXT: s_wait_kmcnt 0x0
+; CHECK-G-MESA3D-NEXT: global_store_b32 v1, v0, s[0:1]
+; CHECK-G-MESA3D-NEXT: s_endpgm
+ %id = call i32 @llvm.amdgcn.cluster.id.x()
+ store i32 %id, ptr addrspace(1) %out
+ ret void
+}
+
+define amdgpu_kernel void @test_cluster_id_y(ptr addrspace(1) %out) #1 {
+; CHECK-UNKNOWN-LABEL: test_cluster_id_y:
+; CHECK-UNKNOWN: ; %bb.0:
+; CHECK-UNKNOWN-NEXT: s_load_b64 s[0:1], s[0:1], 0x24
+; CHECK-UNKNOWN-NEXT: v_dual_mov_b32 v0, ttmp7 :: v_dual_mov_b32 v1, 0
+; CHECK-UNKNOWN-NEXT: s_wait_kmcnt 0x0
+; CHECK-UNKNOWN-NEXT: global_store_b32 v1, v0, s[0:1]
+; CHECK-UNKNOWN-NEXT: s_endpgm
+; CHECK-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_X_EN: 1
+; CHECK-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_Y_EN: 1
+; CHECK-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
+;
+; CHECK-MESA3D-LABEL: test_cluster_id_y:
+; CHECK-MESA3D: .amd_kernel_code_t
+; CHECK-MESA3D-NEXT: amd_code_version_major = 1
+; CHECK-MESA3D-NEXT: amd_code_version_minor = 2
+; CHECK-MESA3D-NEXT: amd_machine_kind = 1
+; CHECK-MESA3D-NEXT: amd_machine_version_major = 12
+; CHECK-MESA3D-NEXT: amd_machine_version_minor = 5
+; CHECK-MESA3D-NEXT: amd_machine_version_stepping = 0
+; CHECK-MESA3D-NEXT: kernel_code_entry_byte_offset = 256
+; CHECK-MESA3D-NEXT: kernel_code_prefetch_byte_size = 0
+; CHECK-MESA3D-NEXT: granulated_workitem_vgpr_count = 0
+; CHECK-MESA3D-NEXT: granulated_wavefront_sgpr_count = 0
+; CHECK-MESA3D-NEXT: priority = 0
+; CHECK-MESA3D-NEXT: float_mode = 240
+; CHECK-MESA3D-NEXT: priv = 0
+; CHECK-MESA3D-NEXT: enable_dx10_clamp = 0
+; CHECK-MESA3D-NEXT: debug_mode = 0
+; CHECK-MESA3D-NEXT: enable_ieee_mode = 0
+; CHECK-MESA3D-NEXT: enable_wgp_mode = 0
+; CHECK-MESA3D-NEXT: enable_mem_ordered = 1
+; CHECK-MESA3D-NEXT: enable_fwd_progress = 1
+; CHECK-MESA3D-NEXT: enable_sgpr_private_segment_wave_byte_offset = 0
+; CHECK-MESA3D-NEXT: user_sgpr_count = 2
+; CHECK-MESA3D-NEXT: enable_trap_handler = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_workgroup_id_x = 1
+; CHECK-MESA3D-NEXT: enable_sgpr_workgroup_id_y = 1
+; CHECK-MESA3D-NEXT: enable_sgpr_workgroup_id_z = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_workgroup_info = 0
+; CHECK-MESA3D-NEXT: enable_vgpr_workitem_id = 0
+; CHECK-MESA3D-NEXT: enable_exception_msb = 0
+; CHECK-MESA3D-NEXT: granulated_lds_size = 0
+; CHECK-MESA3D-NEXT: enable_exception = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_private_segment_buffer = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_dispatch_ptr = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_queue_ptr = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_kernarg_segment_ptr = 1
+; CHECK-MESA3D-NEXT: enable_sgpr_dispatch_id = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_flat_scratch_init = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_private_segment_size = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_x = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_y = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_z = 0
+; CHECK-MESA3D-NEXT: enable_wavefront_size32 = 1
+; CHECK-MESA3D-NEXT: enable_ordered_append_gds = 0
+; CHECK-MESA3D-NEXT: private_element_size = 1
+; CHECK-MESA3D-NEXT: is_ptr64 = 1
+; CHECK-MESA3D-NEXT: is_dynamic_callstack = 0
+; CHECK-MESA3D-NEXT: is_debug_enabled = 0
+; CHECK-MESA3D-NEXT: is_xnack_enabled = 0
+; CHECK-MESA3D-NEXT: workitem_private_segment_byte_size = 0
+; CHECK-MESA3D-NEXT: workgroup_group_segment_byte_size = 0
+; CHECK-MESA3D-NEXT: gds_segment_byte_size = 0
+; CHECK-MESA3D-NEXT: kernarg_segment_byte_size = 8
+; CHECK-MESA3D-NEXT: workgroup_fbarrier_count = 0
+; CHECK-MESA3D-NEXT: wavefront_sgpr_count = 2
+; CHECK-MESA3D-NEXT: workitem_vgpr_count = 2
+; CHECK-MESA3D-NEXT: reserved_vgpr_first = 0
+; CHECK-MESA3D-NEXT: reserved_vgpr_count = 0
+; CHECK-MESA3D-NEXT: reserved_sgpr_first = 0
+; CHECK-MESA3D-NEXT: reserved_sgpr_count = 0
+; CHECK-MESA3D-NEXT: debug_wavefront_private_segment_offset_sgpr = 0
+; CHECK-MESA3D-NEXT: debug_private_segment_buffer_sgpr = 0
+; CHECK-MESA3D-NEXT: kernarg_segment_alignment = 4
+; CHECK-MESA3D-NEXT: group_segment_alignment = 4
+; CHECK-MESA3D-NEXT: private_segment_alignment = 4
+; CHECK-MESA3D-NEXT: wavefront_size = 5
+; CHECK-MESA3D-NEXT: call_convention = -1
+; CHECK-MESA3D-NEXT: runtime_loader_kernel_symbol = 0
+; CHECK-MESA3D-NEXT: .end_amd_kernel_code_t
+; CHECK-MESA3D-NEXT: ; %bb.0:
+; CHECK-MESA3D-NEXT: s_load_b64 s[0:1], s[0:1], 0x0
+; CHECK-MESA3D-NEXT: v_dual_mov_b32 v0, ttmp7 :: v_dual_mov_b32 v1, 0
+; CHECK-MESA3D-NEXT: s_wait_kmcnt 0x0
+; CHECK-MESA3D-NEXT: global_store_b32 v1, v0, s[0:1]
+; CHECK-MESA3D-NEXT: s_endpgm
+;
+; CHECK-G-UNKNOWN-LABEL: test_cluster_id_y:
+; CHECK-G-UNKNOWN: ; %bb.0:
+; CHECK-G-UNKNOWN-NEXT: s_load_b64 s[0:1], s[0:1], 0x24
+; CHECK-G-UNKNOWN-NEXT: v_dual_mov_b32 v0, ttmp7 :: v_dual_mov_b32 v1, 0
+; CHECK-G-UNKNOWN-NEXT: s_wait_kmcnt 0x0
+; CHECK-G-UNKNOWN-NEXT: global_store_b32 v1, v0, s[0:1]
+; CHECK-G-UNKNOWN-NEXT: s_endpgm
+; CHECK-G-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_X_EN: 1
+; CHECK-G-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_Y_EN: 1
+; CHECK-G-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
+;
+; CHECK-G-MESA3D-LABEL: test_cluster_id_y:
+; CHECK-G-MESA3D: .amd_kernel_code_t
+; CHECK-G-MESA3D-NEXT: amd_code_version_major = 1
+; CHECK-G-MESA3D-NEXT: amd_code_version_minor = 2
+; CHECK-G-MESA3D-NEXT: amd_machine_kind = 1
+; CHECK-G-MESA3D-NEXT: amd_machine_version_major = 12
+; CHECK-G-MESA3D-NEXT: amd_machine_version_minor = 5
+; CHECK-G-MESA3D-NEXT: amd_machine_version_stepping = 0
+; CHECK-G-MESA3D-NEXT: kernel_code_entry_byte_offset = 256
+; CHECK-G-MESA3D-NEXT: kernel_code_prefetch_byte_size = 0
+; CHECK-G-MESA3D-NEXT: granulated_workitem_vgpr_count = 0
+; CHECK-G-MESA3D-NEXT: granulated_wavefront_sgpr_count = 0
+; CHECK-G-MESA3D-NEXT: priority = 0
+; CHECK-G-MESA3D-NEXT: float_mode = 240
+; CHECK-G-MESA3D-NEXT: priv = 0
+; CHECK-G-MESA3D-NEXT: enable_dx10_clamp = 0
+; CHECK-G-MESA3D-NEXT: debug_mode = 0
+; CHECK-G-MESA3D-NEXT: enable_ieee_mode = 0
+; CHECK-G-MESA3D-NEXT: enable_wgp_mode = 0
+; CHECK-G-MESA3D-NEXT: enable_mem_ordered = 1
+; CHECK-G-MESA3D-NEXT: enable_fwd_progress = 1
+; CHECK-G-MESA3D-NEXT: enable_sgpr_private_segment_wave_byte_offset = 0
+; CHECK-G-MESA3D-NEXT: user_sgpr_count = 2
+; CHECK-G-MESA3D-NEXT: enable_trap_handler = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_workgroup_id_x = 1
+; CHECK-G-MESA3D-NEXT: enable_sgpr_workgroup_id_y = 1
+; CHECK-G-MESA3D-NEXT: enable_sgpr_workgroup_id_z = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_workgroup_info = 0
+; CHECK-G-MESA3D-NEXT: enable_vgpr_workitem_id = 0
+; CHECK-G-MESA3D-NEXT: enable_exception_msb = 0
+; CHECK-G-MESA3D-NEXT: granulated_lds_size = 0
+; CHECK-G-MESA3D-NEXT: enable_exception = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_private_segment_buffer = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_dispatch_ptr = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_queue_ptr = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_kernarg_segment_ptr = 1
+; CHECK-G-MESA3D-NEXT: enable_sgpr_dispatch_id = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_flat_scratch_init = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_private_segment_size = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_x = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_y = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_z = 0
+; CHECK-G-MESA3D-NEXT: enable_wavefront_size32 = 1
+; CHECK-G-MESA3D-NEXT: enable_ordered_append_gds = 0
+; CHECK-G-MESA3D-NEXT: private_element_size = 1
+; CHECK-G-MESA3D-NEXT: is_ptr64 = 1
+; CHECK-G-MESA3D-NEXT: is_dynamic_callstack = 0
+; CHECK-G-MESA3D-NEXT: is_debug_enabled = 0
+; CHECK-G-MESA3D-NEXT: is_xnack_enabled = 0
+; CHECK-G-MESA3D-NEXT: workitem_private_segment_byte_size = 0
+; CHECK-G-MESA3D-NEXT: workgroup_group_segment_byte_size = 0
+; CHECK-G-MESA3D-NEXT: gds_segment_byte_size = 0
+; CHECK-G-MESA3D-NEXT: kernarg_segment_byte_size = 8
+; CHECK-G-MESA3D-NEXT: workgroup_fbarrier_count = 0
+; CHECK-G-MESA3D-NEXT: wavefront_sgpr_count = 2
+; CHECK-G-MESA3D-NEXT: workitem_vgpr_count = 2
+; CHECK-G-MESA3D-NEXT: reserved_vgpr_first = 0
+; CHECK-G-MESA3D-NEXT: reserved_vgpr_count = 0
+; CHECK-G-MESA3D-NEXT: reserved_sgpr_first = 0
+; CHECK-G-MESA3D-NEXT: reserved_sgpr_count = 0
+; CHECK-G-MESA3D-NEXT: debug_wavefront_private_segment_offset_sgpr = 0
+; CHECK-G-MESA3D-NEXT: debug_private_segment_buffer_sgpr = 0
+; CHECK-G-MESA3D-NEXT: kernarg_segment_alignment = 4
+; CHECK-G-MESA3D-NEXT: group_segment_alignment = 4
+; CHECK-G-MESA3D-NEXT: private_segment_alignment = 4
+; CHECK-G-MESA3D-NEXT: wavefront_size = 5
+; CHECK-G-MESA3D-NEXT: call_convention = -1
+; CHECK-G-MESA3D-NEXT: runtime_loader_kernel_symbol = 0
+; CHECK-G-MESA3D-NEXT: .end_amd_kernel_code_t
+; CHECK-G-MESA3D-NEXT: ; %bb.0:
+; CHECK-G-MESA3D-NEXT: s_load_b64 s[0:1], s[0:1], 0x0
+; CHECK-G-MESA3D-NEXT: v_dual_mov_b32 v0, ttmp7 :: v_dual_mov_b32 v1, 0
+; CHECK-G-MESA3D-NEXT: s_wait_kmcnt 0x0
+; CHECK-G-MESA3D-NEXT: global_store_b32 v1, v0, s[0:1]
+; CHECK-G-MESA3D-NEXT: s_endpgm
+ %id = call i32 @llvm.amdgcn.cluster.id.y()
+ store i32 %id, ptr addrspace(1) %out
+ ret void
+}
+
+define amdgpu_kernel void @test_cluster_id_z(ptr addrspace(1) %out) #1 {
+; CHECK-UNKNOWN-LABEL: test_cluster_id_z:
+; CHECK-UNKNOWN: ; %bb.0:
+; CHECK-UNKNOWN-NEXT: s_load_b64 s[0:1], s[0:1], 0x24
+; CHECK-UNKNOWN-NEXT: s_lshr_b32 s2, ttmp7, 16
+; CHECK-UNKNOWN-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; CHECK-UNKNOWN-NEXT: v_dual_mov_b32 v0, 0 :: v_dual_mov_b32 v1, s2
+; CHECK-UNKNOWN-NEXT: s_wait_kmcnt 0x0
+; CHECK-UNKNOWN-NEXT: global_store_b32 v0, v1, s[0:1]
+; CHECK-UNKNOWN-NEXT: s_endpgm
+; CHECK-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_X_EN: 1
+; CHECK-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
+; CHECK-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_Z_EN: 1
+;
+; CHECK-MESA3D-LABEL: test_cluster_id_z:
+; CHECK-MESA3D: .amd_kernel_code_t
+; CHECK-MESA3D-NEXT: amd_code_version_major = 1
+; CHECK-MESA3D-NEXT: amd_code_version_minor = 2
+; CHECK-MESA3D-NEXT: amd_machine_kind = 1
+; CHECK-MESA3D-NEXT: amd_machine_version_major = 12
+; CHECK-MESA3D-NEXT: amd_machine_version_minor = 5
+; CHECK-MESA3D-NEXT: amd_machine_version_stepping = 0
+; CHECK-MESA3D-NEXT: kernel_code_entry_byte_offset = 256
+; CHECK-MESA3D-NEXT: kernel_code_prefetch_byte_size = 0
+; CHECK-MESA3D-NEXT: granulated_workitem_vgpr_count = 0
+; CHECK-MESA3D-NEXT: granulated_wavefront_sgpr_count = 0
+; CHECK-MESA3D-NEXT: priority = 0
+; CHECK-MESA3D-NEXT: float_mode = 240
+; CHECK-MESA3D-NEXT: priv = 0
+; CHECK-MESA3D-NEXT: enable_dx10_clamp = 0
+; CHECK-MESA3D-NEXT: debug_mode = 0
+; CHECK-MESA3D-NEXT: enable_ieee_mode = 0
+; CHECK-MESA3D-NEXT: enable_wgp_mode = 0
+; CHECK-MESA3D-NEXT: enable_mem_ordered = 1
+; CHECK-MESA3D-NEXT: enable_fwd_progress = 1
+; CHECK-MESA3D-NEXT: enable_sgpr_private_segment_wave_byte_offset = 0
+; CHECK-MESA3D-NEXT: user_sgpr_count = 2
+; CHECK-MESA3D-NEXT: enable_trap_handler = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_workgroup_id_x = 1
+; CHECK-MESA3D-NEXT: enable_sgpr_workgroup_id_y = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_workgroup_id_z = 1
+; CHECK-MESA3D-NEXT: enable_sgpr_workgroup_info = 0
+; CHECK-MESA3D-NEXT: enable_vgpr_workitem_id = 0
+; CHECK-MESA3D-NEXT: enable_exception_msb = 0
+; CHECK-MESA3D-NEXT: granulated_lds_size = 0
+; CHECK-MESA3D-NEXT: enable_exception = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_private_segment_buffer = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_dispatch_ptr = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_queue_ptr = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_kernarg_segment_ptr = 1
+; CHECK-MESA3D-NEXT: enable_sgpr_dispatch_id = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_flat_scratch_init = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_private_segment_size = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_x = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_y = 0
+; CHECK-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_z = 0
+; CHECK-MESA3D-NEXT: enable_wavefront_size32 = 1
+; CHECK-MESA3D-NEXT: enable_ordered_append_gds = 0
+; CHECK-MESA3D-NEXT: private_element_size = 1
+; CHECK-MESA3D-NEXT: is_ptr64 = 1
+; CHECK-MESA3D-NEXT: is_dynamic_callstack = 0
+; CHECK-MESA3D-NEXT: is_debug_enabled = 0
+; CHECK-MESA3D-NEXT: is_xnack_enabled = 0
+; CHECK-MESA3D-NEXT: workitem_private_segment_byte_size = 0
+; CHECK-MESA3D-NEXT: workgroup_group_segment_byte_size = 0
+; CHECK-MESA3D-NEXT: gds_segment_byte_size = 0
+; CHECK-MESA3D-NEXT: kernarg_segment_byte_size = 8
+; CHECK-MESA3D-NEXT: workgroup_fbarrier_count = 0
+; CHECK-MESA3D-NEXT: wavefront_sgpr_count = 3
+; CHECK-MESA3D-NEXT: workitem_vgpr_count = 2
+; CHECK-MESA3D-NEXT: reserved_vgpr_first = 0
+; CHECK-MESA3D-NEXT: reserved_vgpr_count = 0
+; CHECK-MESA3D-NEXT: reserved_sgpr_first = 0
+; CHECK-MESA3D-NEXT: reserved_sgpr_count = 0
+; CHECK-MESA3D-NEXT: debug_wavefront_private_segment_offset_sgpr = 0
+; CHECK-MESA3D-NEXT: debug_private_segment_buffer_sgpr = 0
+; CHECK-MESA3D-NEXT: kernarg_segment_alignment = 4
+; CHECK-MESA3D-NEXT: group_segment_alignment = 4
+; CHECK-MESA3D-NEXT: private_segment_alignment = 4
+; CHECK-MESA3D-NEXT: wavefront_size = 5
+; CHECK-MESA3D-NEXT: call_convention = -1
+; CHECK-MESA3D-NEXT: runtime_loader_kernel_symbol = 0
+; CHECK-MESA3D-NEXT: .end_amd_kernel_code_t
+; CHECK-MESA3D-NEXT: ; %bb.0:
+; CHECK-MESA3D-NEXT: s_load_b64 s[0:1], s[0:1], 0x0
+; CHECK-MESA3D-NEXT: s_lshr_b32 s2, ttmp7, 16
+; CHECK-MESA3D-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; CHECK-MESA3D-NEXT: v_dual_mov_b32 v0, 0 :: v_dual_mov_b32 v1, s2
+; CHECK-MESA3D-NEXT: s_wait_kmcnt 0x0
+; CHECK-MESA3D-NEXT: global_store_b32 v0, v1, s[0:1]
+; CHECK-MESA3D-NEXT: s_endpgm
+;
+; CHECK-G-UNKNOWN-LABEL: test_cluster_id_z:
+; CHECK-G-UNKNOWN: ; %bb.0:
+; CHECK-G-UNKNOWN-NEXT: s_load_b64 s[0:1], s[0:1], 0x24
+; CHECK-G-UNKNOWN-NEXT: s_lshr_b32 s2, ttmp7, 16
+; CHECK-G-UNKNOWN-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; CHECK-G-UNKNOWN-NEXT: v_dual_mov_b32 v1, 0 :: v_dual_mov_b32 v0, s2
+; CHECK-G-UNKNOWN-NEXT: s_wait_kmcnt 0x0
+; CHECK-G-UNKNOWN-NEXT: global_store_b32 v1, v0, s[0:1]
+; CHECK-G-UNKNOWN-NEXT: s_endpgm
+; CHECK-G-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_X_EN: 1
+; CHECK-G-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
+; CHECK-G-UNKNOWN: COMPUTE_PGM_RSRC2:TGID_Z_EN: 1
+;
+; CHECK-G-MESA3D-LABEL: test_cluster_id_z:
+; CHECK-G-MESA3D: .amd_kernel_code_t
+; CHECK-G-MESA3D-NEXT: amd_code_version_major = 1
+; CHECK-G-MESA3D-NEXT: amd_code_version_minor = 2
+; CHECK-G-MESA3D-NEXT: amd_machine_kind = 1
+; CHECK-G-MESA3D-NEXT: amd_machine_version_major = 12
+; CHECK-G-MESA3D-NEXT: amd_machine_version_minor = 5
+; CHECK-G-MESA3D-NEXT: amd_machine_version_stepping = 0
+; CHECK-G-MESA3D-NEXT: kernel_code_entry_byte_offset = 256
+; CHECK-G-MESA3D-NEXT: kernel_code_prefetch_byte_size = 0
+; CHECK-G-MESA3D-NEXT: granulated_workitem_vgpr_count = 0
+; CHECK-G-MESA3D-NEXT: granulated_wavefront_sgpr_count = 0
+; CHECK-G-MESA3D-NEXT: priority = 0
+; CHECK-G-MESA3D-NEXT: float_mode = 240
+; CHECK-G-MESA3D-NEXT: priv = 0
+; CHECK-G-MESA3D-NEXT: enable_dx10_clamp = 0
+; CHECK-G-MESA3D-NEXT: debug_mode = 0
+; CHECK-G-MESA3D-NEXT: enable_ieee_mode = 0
+; CHECK-G-MESA3D-NEXT: enable_wgp_mode = 0
+; CHECK-G-MESA3D-NEXT: enable_mem_ordered = 1
+; CHECK-G-MESA3D-NEXT: enable_fwd_progress = 1
+; CHECK-G-MESA3D-NEXT: enable_sgpr_private_segment_wave_byte_offset = 0
+; CHECK-G-MESA3D-NEXT: user_sgpr_count = 2
+; CHECK-G-MESA3D-NEXT: enable_trap_handler = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_workgroup_id_x = 1
+; CHECK-G-MESA3D-NEXT: enable_sgpr_workgroup_id_y = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_workgroup_id_z = 1
+; CHECK-G-MESA3D-NEXT: enable_sgpr_workgroup_info = 0
+; CHECK-G-MESA3D-NEXT: enable_vgpr_workitem_id = 0
+; CHECK-G-MESA3D-NEXT: enable_exception_msb = 0
+; CHECK-G-MESA3D-NEXT: granulated_lds_size = 0
+; CHECK-G-MESA3D-NEXT: enable_exception = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_private_segment_buffer = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_dispatch_ptr = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_queue_ptr = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_kernarg_segment_ptr = 1
+; CHECK-G-MESA3D-NEXT: enable_sgpr_dispatch_id = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_flat_scratch_init = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_private_segment_size = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_x = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_y = 0
+; CHECK-G-MESA3D-NEXT: enable_sgpr_grid_workgroup_count_z = 0
+; CHECK-G-MESA3D-NEXT: enable_wavefront_size32 = 1
+; CHECK-G-MESA3D-NEXT: enable_ordered_append_gds = 0
+; CHECK-G-MESA3D-NEXT: private_element_size = 1
+; CHECK-G-MESA3D-NEXT: is_ptr64 = 1
+; CHECK-G-MESA3D-NEXT: is_dynamic_callstack = 0
+; CHECK-G-MESA3D-NEXT: is_debug_enabled = 0
+; CHECK-G-MESA3D-NEXT: is_xnack_enabled = 0
+; CHECK-G-MESA3D-NEXT: workitem_private_segment_byte_size = 0
+; CHECK-G-MESA3D-NEXT: workgroup_group_segment_byte_size = 0
+; CHECK-G-MESA3D-NEXT: gds_segment_byte_size = 0
+; CHECK-G-MESA3D-NEXT: kernarg_segment_byte_size = 8
+; CHECK-G-MESA3D-NEXT: workgroup_fbarrier_count = 0
+; CHECK-G-MESA3D-NEXT: wavefront_sgpr_count = 3
+; CHECK-G-MESA3D-NEXT: workitem_vgpr_count = 2
+; CHECK-G-MESA3D-NEXT: reserved_vgpr_first = 0
+; CHECK-G-MESA3D-NEXT: reserved_vgpr_count = 0
+; CHECK-G-MESA3D-NEXT: reserved_sgpr_first = 0
+; CHECK-G-MESA3D-NEXT: reserved_sgpr_count = 0
+; CHECK-G-MESA3D-NEXT: debug_wavefront_private_segment_offset_sgpr = 0
+; CHECK-G-MESA3D-NEXT: debug_private_segment_buffer_sgpr = 0
+; CHECK-G-MESA3D-NEXT: kernarg_segment_alignment = 4
+; CHECK-G-MESA3D-NEXT: group_segment_alignment = 4
+; CHECK-G-MESA3D-NEXT: private_segment_alignment = 4
+; CHECK-G-MESA3D-NEXT: wavefront_size = 5
+; CHECK-G-MESA3D-NEXT: call_convention = -1
+; CHECK-G-MESA3D-NEXT: runtime_loader_kernel_symbol = 0
+; CHECK-G-MESA3D-NEXT: .end_amd_kernel_code_t
+; CHECK-G-MESA3D-NEXT: ; %bb.0:
+; CHECK-G-MESA3D-NEXT: s_load_b64 s[0:1], s[0:1], 0x0
+; CHECK-G-MESA3D-NEXT: s_lshr_b32 s2, ttmp7, 16
+; CHECK-G-MESA3D-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; CHECK-G-MESA3D-NEXT: v_dual_mov_b32 v1, 0 :: v_dual_mov_b32 v0, s2
+; CHECK-G-MESA3D-NEXT: s_wait_kmcnt 0x0
+; CHECK-G-MESA3D-NEXT: global_store_b32 v1, v0, s[0:1]
+; CHECK-G-MESA3D-NEXT: s_endpgm
+ %id = call i32 @llvm.amdgcn.cluster.id.z()
+ store i32 %id, ptr addrspace(1) %out
+ ret void
+}
diff --git a/llvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll b/llvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll
index f4e5c276b8b75..33cd598aae9b5 100644
--- a/llvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll
+++ b/llvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll
@@ -265,6 +265,6 @@ declare float @llvm.fmuladd.f32(float, float, float) #1
attributes #0 = { nounwind willreturn "denormal-fp-math-f32"="preserve-sign,preserve-sign" }
attributes #1 = { nounwind readnone speculatable }
-attributes #2 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" "uniform-work-group-size"="false" "denormal-fp-math-f32"="preserve-sign,preserve-sign" }
+attributes #2 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" "uniform-work-group-size"="false" "denormal-fp-math-f32"="preserve-sign,preserve-sign" }
!0 = !{float 2.500000e+00}
diff --git a/llvm/test/CodeGen/AMDGPU/non-entry-alloca.ll b/llvm/test/CodeGen/AMDGPU/non-entry-alloca.ll
index cfe7315e20ff7..627f4ada95dba 100644
--- a/llvm/test/CodeGen/AMDGPU/non-entry-alloca.ll
+++ b/llvm/test/CodeGen/AMDGPU/non-entry-alloca.ll
@@ -406,7 +406,7 @@ bb.1:
declare i32 @llvm.amdgcn.workitem.id.x() #0
attributes #0 = { nounwind readnone speculatable }
-attributes #1 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" "uniform-work-group-size"="false" }
+attributes #1 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" "uniform-work-group-size"="false" }
!llvm.module.flags = !{!0}
!0 = !{i32 1, !"amdhsa_code_object_version", i32 CODE_OBJECT_VERSION}
diff --git a/llvm/test/CodeGen/AMDGPU/partial-regcopy-and-spill-missed-at-regalloc.ll b/llvm/test/CodeGen/AMDGPU/partial-regcopy-and-spill-missed-at-regalloc.ll
index d65c6d950058e..0e886c594b370 100644
--- a/llvm/test/CodeGen/AMDGPU/partial-regcopy-and-spill-missed-at-regalloc.ll
+++ b/llvm/test/CodeGen/AMDGPU/partial-regcopy-and-spill-missed-at-regalloc.ll
@@ -104,4 +104,4 @@ define amdgpu_kernel void @partial_copy(<4 x i32> %arg) #0 {
declare <4 x i32> @llvm.amdgcn.mfma.i32.4x4x4i8(i32, i32, <4 x i32>, i32, i32, i32)
-attributes #0 = { nounwind "amdgpu-num-vgpr"="5" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { nounwind "amdgpu-num-vgpr"="5" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
diff --git a/llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll b/llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
index b1e05158b6212..83c521043025c 100644
--- a/llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
+++ b/llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
@@ -2688,4 +2688,4 @@ end:
}
-attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" "target-cpu"="fiji" "uniform-work-group-size"="false" }
+attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" "target-cpu"="fiji" "uniform-work-group-size"="false" }
diff --git a/llvm/test/CodeGen/AMDGPU/sibling-call.ll b/llvm/test/CodeGen/AMDGPU/sibling-call.ll
index 308d87ba79052..00214ef36e1f0 100644
--- a/llvm/test/CodeGen/AMDGPU/sibling-call.ll
+++ b/llvm/test/CodeGen/AMDGPU/sibling-call.ll
@@ -1072,4 +1072,4 @@ entry:
}
attributes #0 = { nounwind }
-attributes #1 = { nounwind noinline "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #1 = { nounwind noinline "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
diff --git a/llvm/test/CodeGen/AMDGPU/spill-agpr.ll b/llvm/test/CodeGen/AMDGPU/spill-agpr.ll
index 290d9c5401154..da48af100d27b 100644
--- a/llvm/test/CodeGen/AMDGPU/spill-agpr.ll
+++ b/llvm/test/CodeGen/AMDGPU/spill-agpr.ll
@@ -363,11 +363,6 @@ use:
define amdgpu_kernel void @max_6regs_used_8a(ptr addrspace(1) %arg) #4 {
; GFX908-LABEL: max_6regs_used_8a:
; GFX908: ; %bb.0:
-; GFX908-NEXT: s_mov_b32 s4, SCRATCH_RSRC_DWORD0
-; GFX908-NEXT: s_mov_b32 s5, SCRATCH_RSRC_DWORD1
-; GFX908-NEXT: s_mov_b32 s6, -1
-; GFX908-NEXT: s_mov_b32 s7, 0xe00000
-; GFX908-NEXT: s_add_u32 s4, s4, s3
; GFX908-NEXT: s_load_dwordx2 s[2:3], s[0:1], 0x24
; GFX908-NEXT: ;;#ASMSTART
; GFX908-NEXT: ; def v1
@@ -376,22 +371,27 @@ define amdgpu_kernel void @max_6regs_used_8a(ptr addrspace(1) %arg) #4 {
; GFX908-NEXT: ;;#ASMSTART
; GFX908-NEXT: ; def a[0:3]
; GFX908-NEXT: ;;#ASMEND
-; GFX908-NEXT: s_addc_u32 s5, s5, 0
+; GFX908-NEXT: s_mov_b32 s8, SCRATCH_RSRC_DWORD0
; GFX908-NEXT: v_accvgpr_write_b32 a4, v1
; GFX908-NEXT: s_waitcnt lgkmcnt(0)
; GFX908-NEXT: global_load_dwordx4 v[0:3], v4, s[2:3]
+; GFX908-NEXT: s_mov_b32 s9, SCRATCH_RSRC_DWORD1
+; GFX908-NEXT: s_mov_b32 s10, -1
+; GFX908-NEXT: s_mov_b32 s11, 0xe00000
+; GFX908-NEXT: s_add_u32 s8, s8, s5
+; GFX908-NEXT: s_addc_u32 s9, s9, 0
; GFX908-NEXT: v_accvgpr_read_b32 v5, a0 ; Reload Reuse
; GFX908-NEXT: s_nop 1
-; GFX908-NEXT: buffer_store_dword v5, off, s[4:7], 0 ; 4-byte Folded Spill
+; GFX908-NEXT: buffer_store_dword v5, off, s[8:11], 0 ; 4-byte Folded Spill
; GFX908-NEXT: v_accvgpr_read_b32 v5, a1 ; Reload Reuse
; GFX908-NEXT: s_nop 1
-; GFX908-NEXT: buffer_store_dword v5, off, s[4:7], 0 offset:4 ; 4-byte Folded Spill
+; GFX908-NEXT: buffer_store_dword v5, off, s[8:11], 0 offset:4 ; 4-byte Folded Spill
; GFX908-NEXT: v_accvgpr_read_b32 v5, a2 ; Reload Reuse
; GFX908-NEXT: s_nop 1
-; GFX908-NEXT: buffer_store_dword v5, off, s[4:7], 0 offset:8 ; 4-byte Folded Spill
+; GFX908-NEXT: buffer_store_dword v5, off, s[8:11], 0 offset:8 ; 4-byte Folded Spill
; GFX908-NEXT: v_accvgpr_read_b32 v5, a3 ; Reload Reuse
; GFX908-NEXT: s_nop 1
-; GFX908-NEXT: buffer_store_dword v5, off, s[4:7], 0 offset:12 ; 4-byte Folded Spill
+; GFX908-NEXT: buffer_store_dword v5, off, s[8:11], 0 offset:12 ; 4-byte Folded Spill
; GFX908-NEXT: s_waitcnt vmcnt(4)
; GFX908-NEXT: v_accvgpr_write_b32 a0, v0
; GFX908-NEXT: v_accvgpr_write_b32 a1, v1
@@ -407,11 +407,11 @@ define amdgpu_kernel void @max_6regs_used_8a(ptr addrspace(1) %arg) #4 {
; GFX908-NEXT: v_accvgpr_read_b32 v3, a3
; GFX908-NEXT: s_nop 1
; GFX908-NEXT: global_store_dwordx4 v4, v[0:3], s[2:3]
-; GFX908-NEXT: buffer_load_dword v0, off, s[4:7], 0 ; 4-byte Folded Reload
+; GFX908-NEXT: buffer_load_dword v0, off, s[8:11], 0 ; 4-byte Folded Reload
; GFX908-NEXT: s_nop 0
-; GFX908-NEXT: buffer_load_dword v1, off, s[4:7], 0 offset:4 ; 4-byte Folded Reload
-; GFX908-NEXT: buffer_load_dword v2, off, s[4:7], 0 offset:8 ; 4-byte Folded Reload
-; GFX908-NEXT: buffer_load_dword v3, off, s[4:7], 0 offset:12 ; 4-byte Folded Reload
+; GFX908-NEXT: buffer_load_dword v1, off, s[8:11], 0 offset:4 ; 4-byte Folded Reload
+; GFX908-NEXT: buffer_load_dword v2, off, s[8:11], 0 offset:8 ; 4-byte Folded Reload
+; GFX908-NEXT: buffer_load_dword v3, off, s[8:11], 0 offset:12 ; 4-byte Folded Reload
; GFX908-NEXT: s_waitcnt vmcnt(0)
; GFX908-NEXT: global_store_dwordx4 v[0:1], v[0:3], off
; GFX908-NEXT: s_waitcnt vmcnt(0)
@@ -423,24 +423,24 @@ define amdgpu_kernel void @max_6regs_used_8a(ptr addrspace(1) %arg) #4 {
;
; GFX90A-LABEL: max_6regs_used_8a:
; GFX90A: ; %bb.0:
-; GFX90A-NEXT: s_mov_b32 s4, SCRATCH_RSRC_DWORD0
-; GFX90A-NEXT: s_mov_b32 s5, SCRATCH_RSRC_DWORD1
-; GFX90A-NEXT: s_mov_b32 s6, -1
-; GFX90A-NEXT: s_mov_b32 s7, 0xe00000
-; GFX90A-NEXT: s_add_u32 s4, s4, s3
+; GFX90A-NEXT: s_mov_b32 s8, SCRATCH_RSRC_DWORD0
+; GFX90A-NEXT: s_mov_b32 s9, SCRATCH_RSRC_DWORD1
+; GFX90A-NEXT: s_mov_b32 s10, -1
; GFX90A-NEXT: s_load_dwordx2 s[2:3], s[0:1], 0x24
-; GFX90A-NEXT: s_addc_u32 s5, s5, 0
+; GFX90A-NEXT: s_mov_b32 s11, 0xe00000
+; GFX90A-NEXT: s_add_u32 s8, s8, s5
+; GFX90A-NEXT: s_addc_u32 s9, s9, 0
; GFX90A-NEXT: ;;#ASMSTART
; GFX90A-NEXT: ; def v1
; GFX90A-NEXT: ;;#ASMEND
; GFX90A-NEXT: ;;#ASMSTART
; GFX90A-NEXT: ; def a[0:3]
; GFX90A-NEXT: ;;#ASMEND
-; GFX90A-NEXT: buffer_store_dword a0, off, s[4:7], 0 ; 4-byte Folded Spill
+; GFX90A-NEXT: buffer_store_dword a0, off, s[8:11], 0 ; 4-byte Folded Spill
; GFX90A-NEXT: s_nop 0
-; GFX90A-NEXT: buffer_store_dword a1, off, s[4:7], 0 offset:4 ; 4-byte Folded Spill
-; GFX90A-NEXT: buffer_store_dword a2, off, s[4:7], 0 offset:8 ; 4-byte Folded Spill
-; GFX90A-NEXT: buffer_store_dword a3, off, s[4:7], 0 offset:12 ; 4-byte Folded Spill
+; GFX90A-NEXT: buffer_store_dword a1, off, s[8:11], 0 offset:4 ; 4-byte Folded Spill
+; GFX90A-NEXT: buffer_store_dword a2, off, s[8:11], 0 offset:8 ; 4-byte Folded Spill
+; GFX90A-NEXT: buffer_store_dword a3, off, s[8:11], 0 offset:12 ; 4-byte Folded Spill
; GFX90A-NEXT: v_lshlrev_b32_e32 v0, 4, v0
; GFX90A-NEXT: s_waitcnt lgkmcnt(0)
; GFX90A-NEXT: global_load_dwordx4 a[0:3], v0, s[2:3]
@@ -450,10 +450,10 @@ define amdgpu_kernel void @max_6regs_used_8a(ptr addrspace(1) %arg) #4 {
; GFX90A-NEXT: v_mfma_f32_4x4x1f32 a[0:3], v2, v2, a[0:3]
; GFX90A-NEXT: s_nop 4
; GFX90A-NEXT: global_store_dwordx4 v0, a[0:3], s[2:3]
-; GFX90A-NEXT: buffer_load_dword v2, off, s[4:7], 0 ; 4-byte Folded Reload
-; GFX90A-NEXT: buffer_load_dword v3, off, s[4:7], 0 offset:4 ; 4-byte Folded Reload
-; GFX90A-NEXT: buffer_load_dword v4, off, s[4:7], 0 offset:8 ; 4-byte Folded Reload
-; GFX90A-NEXT: buffer_load_dword v5, off, s[4:7], 0 offset:12 ; 4-byte Folded Reload
+; GFX90A-NEXT: buffer_load_dword v2, off, s[8:11], 0 ; 4-byte Folded Reload
+; GFX90A-NEXT: buffer_load_dword v3, off, s[8:11], 0 offset:4 ; 4-byte Folded Reload
+; GFX90A-NEXT: buffer_load_dword v4, off, s[8:11], 0 offset:8 ; 4-byte Folded Reload
+; GFX90A-NEXT: buffer_load_dword v5, off, s[8:11], 0 offset:12 ; 4-byte Folded Reload
; GFX90A-NEXT: s_waitcnt vmcnt(0)
; GFX90A-NEXT: global_store_dwordx4 v[0:1], v[2:5], off
; GFX90A-NEXT: s_waitcnt vmcnt(0)
@@ -479,6 +479,12 @@ declare <16 x float> @llvm.amdgcn.mfma.f32.16x16x1f32(float, float, <16 x float>
declare <4 x float> @llvm.amdgcn.mfma.f32.4x4x1f32(float, float, <4 x float>, i32, i32, i32)
declare <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float, float, <32 x float>, i32, i32, i32)
+
+attributes #1 = { nounwind "amdgpu-num-vgpr"="10" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #2 = { nounwind "amdgpu-num-vgpr"="12" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #3 = { nounwind "amdgpu-num-vgpr"="32" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #4 = { nounwind "amdgpu-num-vgpr"="6" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+
attributes #1 = { nounwind "amdgpu-num-vgpr"="10" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
attributes #2 = { nounwind "amdgpu-num-vgpr"="12" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
attributes #3 = { nounwind "amdgpu-num-vgpr"="32" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
diff --git a/llvm/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll b/llvm/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll
index 477297ba2e7d5..c84c49ee1a41d 100644
--- a/llvm/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll
+++ b/llvm/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll
@@ -157,4 +157,4 @@ shader_eval_surface.exit: ; preds = %entry
declare hidden i32 @svm_eval_nodes(ptr addrspace(5), ptr addrspace(5), ptr addrspace(5), i32, i32) local_unnamed_addr #0
-attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
diff --git a/llvm/test/CodeGen/AMDGPU/tid-kd-xnack-any.ll b/llvm/test/CodeGen/AMDGPU/tid-kd-xnack-any.ll
index 30accc846d2b6..c561e32d2db72 100644
--- a/llvm/test/CodeGen/AMDGPU/tid-kd-xnack-any.ll
+++ b/llvm/test/CodeGen/AMDGPU/tid-kd-xnack-any.ll
@@ -23,7 +23,7 @@ entry:
ret void
}
-attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
!llvm.module.flags = !{!0}
!0 = !{i32 1, !"amdhsa_code_object_version", i32 400}
diff --git a/llvm/test/CodeGen/AMDGPU/tid-kd-xnack-off.ll b/llvm/test/CodeGen/AMDGPU/tid-kd-xnack-off.ll
index 4f84b31f1877b..2922424704edc 100644
--- a/llvm/test/CodeGen/AMDGPU/tid-kd-xnack-off.ll
+++ b/llvm/test/CodeGen/AMDGPU/tid-kd-xnack-off.ll
@@ -23,7 +23,7 @@ entry:
ret void
}
-attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
!llvm.module.flags = !{!0}
!0 = !{i32 1, !"amdhsa_code_object_version", i32 400}
diff --git a/llvm/test/CodeGen/AMDGPU/tid-kd-xnack-on.ll b/llvm/test/CodeGen/AMDGPU/tid-kd-xnack-on.ll
index 644f434923368..aedb5f9106ec8 100644
--- a/llvm/test/CodeGen/AMDGPU/tid-kd-xnack-on.ll
+++ b/llvm/test/CodeGen/AMDGPU/tid-kd-xnack-on.ll
@@ -23,7 +23,7 @@ entry:
ret void
}
-attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
!llvm.module.flags = !{!0}
!0 = !{i32 1, !"amdhsa_code_object_version", i32 400}
diff --git a/llvm/test/CodeGen/MIR/AMDGPU/long-branch-reg-all-sgpr-used.ll b/llvm/test/CodeGen/MIR/AMDGPU/long-branch-reg-all-sgpr-used.ll
index 929db4c9be1c7..ed8bc9ca700a8 100644
--- a/llvm/test/CodeGen/MIR/AMDGPU/long-branch-reg-all-sgpr-used.ll
+++ b/llvm/test/CodeGen/MIR/AMDGPU/long-branch-reg-all-sgpr-used.ll
@@ -540,7 +540,7 @@
; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare align 4 ptr addrspace(4) @llvm.amdgcn.kernarg.segment.ptr() #2
-attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
attributes #1 = { nounwind }
attributes #2 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
!0 = !{}
diff --git a/llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-after-pei.ll b/llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-after-pei.ll
index f054bea1f2780..68c3d1b2f2972 100644
--- a/llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-after-pei.ll
+++ b/llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-after-pei.ll
@@ -63,4 +63,4 @@ define amdgpu_kernel void @scavenge_fi(ptr addrspace(1) %out, i32 %in) #0 {
ret void
}
-attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
diff --git a/llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg-debug.ll b/llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg-debug.ll
index 924216efcc461..55598ec70d953 100644
--- a/llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg-debug.ll
+++ b/llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg-debug.ll
@@ -99,7 +99,7 @@
; Function Attrs: convergent nocallback nofree nounwind willreturn
declare void @llvm.amdgcn.end.cf.i64(i64) #2
- attributes #0 = { "amdgpu-no-completion-action" "amdgpu-no-default-queue" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-flat-scratch-init" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" "uniform-work-group-size"="false" }
+ attributes #0 = { "amdgpu-no-completion-action" "amdgpu-no-default-queue" "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-flat-scratch-init" "amdgpu-no-heap-ptr" "amdgpu-no-hostcall-ptr" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-multigrid-sync-arg" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" "uniform-work-group-size"="false" }
attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
attributes #2 = { convergent nocallback nofree nounwind willreturn }
attributes #3 = { convergent nocallback nofree nounwind willreturn memory(none) }
diff --git a/llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg.ll b/llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg.ll
index 39f1ddd0609d8..2326b2dc09b58 100644
--- a/llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg.ll
+++ b/llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg.ll
@@ -73,5 +73,5 @@ bb4:
ret void
}
-attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-flat-scratch-init" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
+attributes #0 = { nounwind "amdgpu-no-dispatch-id" "amdgpu-no-dispatch-ptr" "amdgpu-no-flat-scratch-init" "amdgpu-no-implicitarg-ptr" "amdgpu-no-lds-kernel-id" "amdgpu-no-queue-ptr" "amdgpu-no-workgroup-id-x" "amdgpu-no-cluster-id-x" "amdgpu-no-workgroup-id-y" "amdgpu-no-cluster-id-y" "amdgpu-no-workgroup-id-z" "amdgpu-no-cluster-id-z" "amdgpu-no-workitem-id-x" "amdgpu-no-workitem-id-y" "amdgpu-no-workitem-id-z" }
attributes #1 = { nounwind readnone }