[clang] [AMDGPU] fix amdgpu_max_num_work_groups in templates (PR #141633)

Matt Arsenault via cfe-commits cfe-commits at lists.llvm.org
Tue May 27 10:16:22 PDT 2025


================
@@ -78,6 +78,12 @@ __global__ void template_32_4_a_max_num_work_groups() {}
 template __global__ void template_32_4_a_max_num_work_groups<2>();
 // CHECK: define{{.*}} amdgpu_kernel void @_Z35template_32_4_a_max_num_work_groupsILj2EEvv() [[MAX_NUM_WORK_GROUPS_32_4_2:#[0-9]+]]
 
+template<unsigned a>
+__attribute__((amdgpu_max_num_work_groups(a)))
+__global__ void template_a_max_num_work_groups() {}
+template __global__ void template_a_max_num_work_groups<32>();
----------------
arsenm wrote:

Need additional test cases that exercise the attribute with 2 and 3 entries, not just the single-argument form.
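
As a rough sketch of what such cases might look like: the kernel names and instantiation values below are hypothetical and not taken from the PR. The sketch assumes `__global__` is already defined by the surrounding test file, and that the optional second and third arguments of `amdgpu_max_num_work_groups` accept template-dependent expressions the same way the first one does. Matching `CHECK` lines for the instantiated kernels and their attribute groups would still need to be written against the actual compiler output.

```cuda
// Hypothetical two-entry case: both arguments depend on template parameters.
template<unsigned a, unsigned b>
__attribute__((amdgpu_max_num_work_groups(a, b)))
__global__ void template_a_b_max_num_work_groups() {}
template __global__ void template_a_b_max_num_work_groups<32, 4>();

// Hypothetical three-entry case: all three arguments depend on template parameters.
template<unsigned a, unsigned b, unsigned c>
__attribute__((amdgpu_max_num_work_groups(a, b, c)))
__global__ void template_a_b_c_max_num_work_groups() {}
template __global__ void template_a_b_c_max_num_work_groups<32, 4, 2>();
```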

https://github.com/llvm/llvm-project/pull/141633