[clang] [AMDGPU] fix amdgpu_max_num_work_groups in templates (PR #141633)
Matt Arsenault via cfe-commits
cfe-commits at lists.llvm.org
Tue May 27 10:16:22 PDT 2025
================
@@ -78,6 +78,12 @@ __global__ void template_32_4_a_max_num_work_groups() {}
template __global__ void template_32_4_a_max_num_work_groups<2>();
// CHECK: define{{.*}} amdgpu_kernel void @_Z35template_32_4_a_max_num_work_groupsILj2EEvv() [[MAX_NUM_WORK_GROUPS_32_4_2:#[0-9]+]]
+template<unsigned a>
+__attribute__((amdgpu_max_num_work_groups(a)))
+__global__ void template_a_max_num_work_groups() {}
+template __global__ void template_a_max_num_work_groups<32>();
----------------
arsenm wrote:
Need additional test cases stressing the 2- and 3-entry forms of the attribute, with the entries taken from template parameters.
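
One possible reading of this request, as a sketch only: the added test instantiates the attribute with a single template-dependent entry, so analogous kernels could pass two and three template-dependent entries. The kernel names and the instantiation values below are hypothetical and are not taken from the PR. The snippet relies on the test file's existing includes and RUN lines for `__global__`, and each kernel would still need a matching CHECK line, which is left out here because the expected output has not been verified.

```cuda
// Hypothetical two-entry case: both entries depend on template parameters.
template<unsigned a, unsigned b>
__attribute__((amdgpu_max_num_work_groups(a, b)))
__global__ void template_a_b_max_num_work_groups() {}
template __global__ void template_a_b_max_num_work_groups<32, 4>();

// Hypothetical three-entry case: all three entries depend on template parameters.
template<unsigned a, unsigned b, unsigned c>
__attribute__((amdgpu_max_num_work_groups(a, b, c)))
__global__ void template_a_b_c_max_num_work_groups() {}
template __global__ void template_a_b_c_max_num_work_groups<32, 4, 2>();
```

Using explicit instantiations mirrors the pattern already in the hunk, so the attribute is only evaluated once the template arguments are known.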
https://github.com/llvm/llvm-project/pull/141633