[clang] [clang][CodeGen][SPIRV] Translate `amdgpu_flat_work_group_size` into `reqd_work_group_size`. (PR #116820)

Thu Dec 19 01:55:50 PST 2024

================
@@ -245,6 +247,41 @@ SPIRVTargetCodeGenInfo::getGlobalVarAddressSpace(CodeGenModule &CGM,
   return DefaultGlobalAS;
 }
 
+void SPIRVTargetCodeGenInfo::setTargetAttributes(
+    const Decl *D, llvm::GlobalValue *GV, CodeGen::CodeGenModule &M) const {
+  if (!M.getLangOpts().HIP ||
+      M.getTarget().getTriple().getVendor() != llvm::Triple::AMD)
+    return;
+  if (GV->isDeclaration())
+    return;
+
+  auto F = dyn_cast<llvm::Function>(GV);
+  if (!F)
+    return;
+
+  auto FD = dyn_cast_or_null<FunctionDecl>(D);
+  if (!FD)
+    return;
+  if (!FD->hasAttr<CUDAGlobalAttr>())
+    return;
+
+  unsigned N = M.getLangOpts().GPUMaxThreadsPerBlock;
+  if (auto FlatWGS = FD->getAttr<AMDGPUFlatWorkGroupSizeAttr>())
+    N = FlatWGS->getMax()->EvaluateKnownConstInt(M.getContext()).getExtValue();
+
+  // We encode the maximum flat WG size in the first component of the 3D
+  // max_work_group_size attribute, which will get reverse translated into the
+  // original AMDGPU attribute when targeting AMDGPU.
----------------
AlexVlx wrote:

I think this is OK _in HIP_ because the language (only) defines `__launch_bounds__`, which is 1D, and we implement with the AMDGPU attribute. At the same time, the SPIR-V attribute cannot be produced via other defined means in HIP (there's no Clang `__attribute__` for it, for example, so the user couldn't have written some N-dimensional `max_work_group_size` themselves), so its presence in AMDGCN flavoured SPIR-V is fairly unambiguously originating from here.

In general we will eventually replace this with processing for all AMDGPU attributes, but that has some challenges in that it'd be more infectious in the translator (or the BE and any eventual SPIR-V consumer, if they were to manifest). Conversely we cannot just drop the original attribute on the floor as correctness depends on it. Hence the PR.

https://github.com/llvm/llvm-project/pull/116820