[libc-commits] [libc] 182e5ac - [libc] Check the RPC server once again after the kernel exits
Joseph Huber via libc-commits
libc-commits at lists.llvm.org
Fri May 12 10:49:26 PDT 2023
Author: Joseph Huber
Date: 2023-05-12T12:49:19-05:00
New Revision: 182e5acb1172fe4c3effe518d2dac3bc3972dd09
URL: https://github.com/llvm/llvm-project/commit/182e5acb1172fe4c3effe518d2dac3bc3972dd09
DIFF: https://github.com/llvm/llvm-project/commit/182e5acb1172fe4c3effe518d2dac3bc3972dd09.diff
LOG: [libc] Check the RPC server once again after the kernel exits
We support asynchronous sends, which means the kernel can issue a send
and then exit without waiting for it to be handled, as we do with the
`EXIT` syscall. Because the polling loop only checks the server while
the kernel is still running, the kernel can exit and end the loop
before we check the server again. This can cause us to ignore an `EXIT`
call from the GPU. Check the server one more time after the loop to
handle any send that is still in flight.
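
The race can be reproduced with a small single-threaded simulation. This is only a sketch under stated assumptions: `Mailbox`, `FakeKernel`, `run`, and the two-argument `handle_server` below are hypothetical stand-ins invented for illustration, not the loaders' real types or signatures. The kernel is modeled as a script of opcodes that it posts one at a time without waiting for the host.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the RPC buffer shared by the GPU and the host.
struct Mailbox {
  std::optional<std::string> pending; // a send the host has not handled yet
};

// Hypothetical stand-in for a running kernel. It posts each opcode as an
// asynchronous send and counts as finished once the last one is posted.
struct FakeKernel {
  std::vector<std::string> script;
  std::size_t next = 0;

  bool done() const { return next == script.size(); }

  void step(Mailbox &box) {
    if (!done() && !box.pending)
      box.pending = script[next++];
  }
};

// Handle at most one pending send and record which opcode was serviced.
void handle_server(Mailbox &box, std::vector<std::string> &handled) {
  if (box.pending) {
    handled.push_back(*box.pending);
    box.pending.reset();
  }
}

// Poll the server only while the kernel is running, then optionally check it
// one final time after the kernel has exited.
std::vector<std::string> run(bool final_check) {
  Mailbox box;
  FakeKernel kernel;
  kernel.script = {"PRINT", "EXIT"};
  std::vector<std::string> handled;

  while (!kernel.done()) {
    handle_server(box, handled);
    kernel.step(box); // the kernel makes progress between polls
  }

  if (final_check)
    handle_server(box, handled); // picks up a send left in flight at exit
  return handled;
}
```

In this model `run(false)` services only `PRINT`, because the kernel posts `EXIT` and finishes before the loop polls again, while `run(true)` services both opcodes.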
Reviewed By: JonChesterfield, lntue
Differential Revision: https://reviews.llvm.org/D150456
Added:
Modified:
libc/utils/gpu/loader/amdgpu/Loader.cpp
libc/utils/gpu/loader/nvptx/Loader.cpp
Removed:
################################################################################
diff --git a/libc/utils/gpu/loader/amdgpu/Loader.cpp b/libc/utils/gpu/loader/amdgpu/Loader.cpp
index eab3d6a000794..a98b557b877c4 100644
--- a/libc/utils/gpu/loader/amdgpu/Loader.cpp
+++ b/libc/utils/gpu/loader/amdgpu/Loader.cpp
@@ -221,6 +221,10 @@ hsa_status_t launch_kernel(hsa_agent_t dev_agent, hsa_executable_t executable,
              /*timeout_hint=*/1024, HSA_WAIT_STATE_ACTIVE) != 0)
     handle_server();
 
+  // Handle the server one more time in case the kernel exited with a pending
+  // send still in flight.
+  handle_server();
+
   // Destroy the resources acquired to launch the kernel and return.
   if (hsa_status_t err = hsa_amd_memory_pool_free(args))
     handle_error(err);
diff --git a/libc/utils/gpu/loader/nvptx/Loader.cpp b/libc/utils/gpu/loader/nvptx/Loader.cpp
index fc30274163dc3..7879deea65a0a 100644
--- a/libc/utils/gpu/loader/nvptx/Loader.cpp
+++ b/libc/utils/gpu/loader/nvptx/Loader.cpp
@@ -186,6 +186,10 @@ CUresult launch_kernel(CUmodule binary, CUstream stream,
   while (cuStreamQuery(stream) == CUDA_ERROR_NOT_READY)
     handle_server();
 
+  // Handle the server one more time in case the kernel exited with a pending
+  // send still in flight.
+  handle_server();
+
   return CUDA_SUCCESS;
 }