[Openmp-commits] [openmp] [OpenMP][AMDGPU] Adapt dynamic callstack sizes to HIP behavior (PR #74080)
Joseph Huber via Openmp-commits
openmp-commits at lists.llvm.org
Fri Dec 1 08:38:53 PST 2023
================
@@ -1872,6 +1873,25 @@ struct AMDGPUDeviceTy : public GenericDeviceTy, AMDGenericDeviceTy {
else
return Plugin::error("Unexpected AMDGPU wavefront %d", WavefrontSize);
+ // To determine the correct scratch memory size per thread, we need to check
+ // the device architecture generation. According to AOT_OFFLOADARCHS we may
+ // assume that AMDGPU offload archs are prefixed with "gfx" and suffixed
+ // with a two char arch specialization. In-between is the 1-2 char
+ // generation number we want to extract.
+ StringRef Arch(ComputeUnitKind);
+ unsigned GfxGen = 0u;
+ if (!llvm::to_integer(Arch.slice(sizeof("gfx") - 1, Arch.size() - 2),
+ GfxGen))
+ return Plugin::error("Invalid GFX architecture string");
+
+ // See: 'getMaxWaveScratchSize' in 'llvm/lib/Target/AMDGPU/GCNSubtarget.h'.
+ // But we need to divide by WavefrontSize.
+ // For generations pre-gfx11: use 13-bit field in units of 256-dword,
+ // otherwise: 15-bit field in units of 64-dword.
+ MaxThreadScratchSize = (GfxGen < 11)
----------------
jhuber6 wrote:
If you get `gfx1100`, `GfxGen` will be `1100` here.
Also, I'm not sure I like the name `GfxGen` here. Elsewhere it's called either "Arch" or `ISAVersion`, as far as I know.
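
A quick way to check what the slice in this hunk yields for a given arch string is a standalone sketch like the one below. It is not the plugin code: `parseGfxGeneration` is a hypothetical helper, and it assumes `std::string_view` and `std::from_chars` as stand-ins for `llvm::StringRef::slice` and `llvm::to_integer`, with the same bounds (`sizeof("gfx") - 1` up to `size() - 2`).

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not part of the patch): extracts the digits between
// the "gfx" prefix and the trailing two-character specialization, mirroring
// Arch.slice(sizeof("gfx") - 1, Arch.size() - 2) from the hunk.
std::optional<unsigned> parseGfxGeneration(std::string_view Arch) {
  constexpr std::string_view Prefix = "gfx";

  // Require the prefix, at least one generation digit, and the 2-char suffix.
  if (Arch.size() < Prefix.size() + 3 ||
      Arch.substr(0, Prefix.size()) != Prefix)
    return std::nullopt;

  // Characters in [Prefix.size(), Arch.size() - 2).
  std::string_view Gen =
      Arch.substr(Prefix.size(), Arch.size() - Prefix.size() - 2);

  unsigned Value = 0;
  auto [Ptr, Ec] = std::from_chars(Gen.data(), Gen.data() + Gen.size(), Value);

  // Reject anything that is not a plain decimal number over the whole slice.
  if (Ec != std::errc() || Ptr != Gen.data() + Gen.size())
    return std::nullopt;
  return Value;
}
```

Calling `parseGfxGeneration("gfx1100")` or `parseGfxGeneration("gfx90a")` from a small test driver shows which value ends up being compared against `11`.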
https://github.com/llvm/llvm-project/pull/74080