[llvm] [Offload] Use the kernel argument size directly in AMDGPU offloading (PR #94667)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 6 12:30:32 PDT 2024
llvmbot wrote:
@llvm/pr-subscribers-offload
Author: Joseph Huber (jhuber6)
<details>
<summary>Changes</summary>
Summary:
The old code object version 3 (COV3) implementation of HSA omitted the
implicit arguments from the kernel argument size. For COV4 and COV5 this is
no longer the case, so we can simply use the size reported in the symbol
information.
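
As a rough illustration of the sizing rule described above (a sketch only, not code from `rtl.cpp`): the helper `argsAllocationSize` and its parameter names are hypothetical, and the numeric values in the checks are arbitrary examples.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the AMDGPU plugin.
// ReportedSize:             kernel argument size read from the symbol information.
// ImplicitSize:             size of the implicit argument block.
// ReportedIncludesImplicit: false models the old COV3 behaviour described in
//                           the summary, true models COV4 and COV5.
constexpr uint32_t argsAllocationSize(uint32_t ReportedSize,
                                      uint32_t ImplicitSize,
                                      bool ReportedIncludesImplicit) {
  // If the reported size already covers the implicit arguments, use it as is;
  // otherwise the implicit block has to be added on top.
  return ReportedIncludesImplicit ? ReportedSize
                                  : ReportedSize + ImplicitSize;
}

// Arbitrary example values: 24 bytes of explicit arguments plus a 56-byte
// implicit block.
static_assert(argsAllocationSize(24, 56, false) == 80,
              "COV3-style: implicit block added to the reported size");
static_assert(argsAllocationSize(80, 56, true) == 80,
              "COV4/COV5-style: reported size used directly");
```

Under this model, adding the implicit size to a value that already includes it over-allocates by exactly `ImplicitSize`, which is the waste the removed comment in the diff refers to.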
---
Full diff: https://github.com/llvm/llvm-project/pull/94667.diff
1 Files Affected:
- (modified) offload/plugins-nextgen/amdgpu/src/rtl.cpp (+1-7)
``````````diff
diff --git a/offload/plugins-nextgen/amdgpu/src/rtl.cpp b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
index f088d5d1685df..663cfdc5fdf01 100644
--- a/offload/plugins-nextgen/amdgpu/src/rtl.cpp
+++ b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
@@ -3272,19 +3272,13 @@ Error AMDGPUKernelTy::launchImpl(GenericDeviceTy &GenericDevice,
if (ArgsSize < KernelArgsSize)
return Plugin::error("Mismatch of kernel arguments size");
- // The args size reported by HSA may or may not contain the implicit args.
- // For now, assume that HSA does not consider the implicit arguments when
- // reporting the arguments of a kernel. In the worst case, we can waste
- // 56 bytes per allocation.
- uint32_t AllArgsSize = KernelArgsSize + ImplicitArgsSize;
-
AMDGPUPluginTy &AMDGPUPlugin =
static_cast<AMDGPUPluginTy &>(GenericDevice.Plugin);
AMDHostDeviceTy &HostDevice = AMDGPUPlugin.getHostDevice();
AMDGPUMemoryManagerTy &ArgsMemoryManager = HostDevice.getArgsMemoryManager();
void *AllArgs = nullptr;
- if (auto Err = ArgsMemoryManager.allocate(AllArgsSize, &AllArgs))
+ if (auto Err = ArgsMemoryManager.allocate(ArgsSize, &AllArgs))
return Err;
// Account for user requested dynamic shared memory.
``````````
</details>
https://github.com/llvm/llvm-project/pull/94667