[llvm] [Offload] Stop the RPC server failing with more than one GPU (PR #125982)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 5 18:49:38 PST 2025
llvmbot wrote:
@llvm/pr-subscribers-offload
Author: Joseph Huber (jhuber6)
<details>
<summary>Changes</summary>
Summary:
Pretty dumb mistake on my part: I forgot that this is run per-device and
per-plugin. It fell through the cracks in my testing because I have
two GPUs that use different plugins.
---
Full diff: https://github.com/llvm/llvm-project/pull/125982.diff
1 Files Affected:
- (modified) offload/plugins-nextgen/common/src/PluginInterface.cpp (+4-3)
``````````diff
diff --git a/offload/plugins-nextgen/common/src/PluginInterface.cpp b/offload/plugins-nextgen/common/src/PluginInterface.cpp
index d2451d8a3422121..76ae0a2dd9c4523 100644
--- a/offload/plugins-nextgen/common/src/PluginInterface.cpp
+++ b/offload/plugins-nextgen/common/src/PluginInterface.cpp
@@ -1058,8 +1058,9 @@ Error GenericDeviceTy::setupRPCServer(GenericPluginTy &Plugin,
   if (auto Err = Server.initDevice(*this, Plugin.getGlobalHandler(), Image))
     return Err;
-  if (auto Err = Server.startThread())
-    return Err;
+  if (!Server.Thread->Running.load(std::memory_order_acquire))
+    if (auto Err = Server.startThread())
+      return Err;
   RPCServer = &Server;
   DP("Running an RPC server on device %d\n", getDeviceId());
@@ -1634,7 +1635,7 @@ Error GenericPluginTy::deinit() {
   if (GlobalHandler)
     delete GlobalHandler;
-  if (RPCServer && RPCServer->Thread->Running.load(std::memory_order_relaxed))
+  if (RPCServer && RPCServer->Thread->Running.load(std::memory_order_acquire))
     if (Error Err = RPCServer->shutDown())
       return Err;
``````````
</details>
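The guard in this change can be illustrated with a small standalone sketch. It is not the LLVM implementation: `SketchServer`, `setupDevice`, `Starts`, and `Devices` are hypothetical names invented here, and the sketch assumes that launching the worker a second time is an error. It only shows the pattern of checking an atomic `Running` flag with acquire ordering before starting a shared thread, so that setup for a second device reuses the thread that is already running.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for a server shared by several devices.
// This is an illustration only, not the class used in PluginInterface.cpp.
struct SketchServer {
  std::atomic<bool> Running{false};
  std::thread Worker;
  int Starts = 0;  // Number of times a worker thread was actually launched.
  int Devices = 0; // Number of devices that ran setup.

  // Launches the worker thread. Returns false if it is already running
  // (assumption: a second start is treated as an error).
  bool startThread() {
    if (Running.load(std::memory_order_acquire))
      return false;
    // Move-assigning to a joinable std::thread calls std::terminate,
    // so the previous worker must have been joined already.
    assert(!Worker.joinable());
    Running.store(true, std::memory_order_release);
    Worker = std::thread([this] {
      while (Running.load(std::memory_order_acquire))
        std::this_thread::yield();
    });
    ++Starts;
    return true;
  }

  // Per-device setup: start the shared thread only if nobody has yet.
  bool setupDevice() {
    ++Devices;
    if (!Running.load(std::memory_order_acquire))
      return startThread();
    return true;
  }

  // Stops and joins the worker if it is running; safe to call twice.
  void shutDown() {
    if (!Running.exchange(false, std::memory_order_acq_rel))
      return;
    Worker.join();
  }

  ~SketchServer() { shutDown(); }
};
```

Calling `setupDevice()` once per device on the same `SketchServer` launches the worker only on the first call. Without the check in `setupDevice()`, the second device would hit the already-running case in `startThread()`.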
https://github.com/llvm/llvm-project/pull/125982