[Openmp-commits] [openmp] [OpenMP][AMDGPU] Adapt dynamic callstack sizes to HIP behavior (PR #74080)

Michael Halkenhäuser via Openmp-commits openmp-commits at lists.llvm.org
Fri Dec 1 07:22:03 PST 2023

@@ -1872,6 +1873,38 @@ struct AMDGPUDeviceTy : public GenericDeviceTy, AMDGenericDeviceTy {
       return Plugin::error("Unexpected AMDGPU wavefront %d", WavefrontSize);
+    // To determine the correct scratch memory size per thread, we need to check
+    // the device architecture generation. According to AOT_OFFLOADARCHS we may
+    // assume that AMDGPU offload archs are prefixed with "gfx" and suffixed
+    // with a two char arch specialization. In-between is the 1-2 char
+    // generation number we want to extract.
+    std::string CUKind{ComputeUnitKind};
mhalk wrote:

Yes, we only want the major version: 9 for `gfx90a` or 10 for `gfx1030`.
The code is pretty complicated for what it actually does. With the C++ tools I was aware of, I wanted to make sure I did not produce garbage.
Thank you very much!
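
For illustration, here is a minimal sketch of the extraction described in the quoted comment. It is not the code from the PR. It assumes the arch string has the shape `gfx` + 1-2 digit generation + two-character specialization, and the helper name `parseGfxMajorVersion` is hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper (not part of the plugin): returns the major generation
// of an AMDGPU arch string, or std::nullopt if the string does not match the
// assumed "gfx" + 1-2 digits + 2-char suffix layout.
std::optional<unsigned> parseGfxMajorVersion(std::string_view Arch) {
  constexpr std::string_view Prefix = "gfx";
  constexpr std::size_t SuffixLen = 2; // assumed two-char arch specialization

  if (Arch.substr(0, Prefix.size()) != Prefix)
    return std::nullopt;
  Arch.remove_prefix(Prefix.size());

  // At least one generation character must remain in front of the suffix.
  if (Arch.size() <= SuffixLen)
    return std::nullopt;
  Arch.remove_suffix(SuffixLen);

  // The generation is assumed to be one or two decimal digits.
  if (Arch.size() > 2)
    return std::nullopt;

  unsigned Major = 0;
  for (char C : Arch) {
    if (!std::isdigit(static_cast<unsigned char>(C)))
      return std::nullopt;
    Major = Major * 10 + static_cast<unsigned>(C - '0');
  }
  return Major;
}
```

Returning `std::optional` rather than a sentinel value keeps malformed strings (wrong prefix, non-digit generation) distinguishable from a valid result, which is one way to avoid silently producing garbage.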

