[llvm] [Offload] Stop the RPC server failing with more than one GPU (PR #125982)

via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 5 18:49:38 PST 2025


llvmbot wrote:



@llvm/pr-subscribers-offload

Author: Joseph Huber (jhuber6)

<details>
<summary>Changes</summary>

Summary:
Pretty dumb mistake on my part: I forgot that this setup runs per-device
and per-plugin, so with more than one GPU on the same plugin the server
thread was started again. It slipped past my testing because my two GPUs
use different plugins.


---
Full diff: https://github.com/llvm/llvm-project/pull/125982.diff


1 File Affected:

- (modified) offload/plugins-nextgen/common/src/PluginInterface.cpp (+4-3) 


``````````diff
diff --git a/offload/plugins-nextgen/common/src/PluginInterface.cpp b/offload/plugins-nextgen/common/src/PluginInterface.cpp
index d2451d8a3422121..76ae0a2dd9c4523 100644
--- a/offload/plugins-nextgen/common/src/PluginInterface.cpp
+++ b/offload/plugins-nextgen/common/src/PluginInterface.cpp
@@ -1058,8 +1058,9 @@ Error GenericDeviceTy::setupRPCServer(GenericPluginTy &Plugin,
   if (auto Err = Server.initDevice(*this, Plugin.getGlobalHandler(), Image))
     return Err;
 
-  if (auto Err = Server.startThread())
-    return Err;
+  if (!Server.Thread->Running.load(std::memory_order_acquire))
+    if (auto Err = Server.startThread())
+      return Err;
 
   RPCServer = &Server;
   DP("Running an RPC server on device %d\n", getDeviceId());
@@ -1634,7 +1635,7 @@ Error GenericPluginTy::deinit() {
   if (GlobalHandler)
     delete GlobalHandler;
 
-  if (RPCServer && RPCServer->Thread->Running.load(std::memory_order_relaxed))
+  if (RPCServer && RPCServer->Thread->Running.load(std::memory_order_acquire))
     if (Error Err = RPCServer->shutDown())
       return Err;
 

``````````

</details>
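
For readers who want the guard in isolation, here is a minimal sketch of the "start the thread only if it is not already running" pattern the diff applies. `ServerThread`, `StartCount`, and `setupDevice` are hypothetical names invented for this illustration; they are not the actual types in `PluginInterface.cpp`, and the sketch assumes setup is called sequentially from one thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the RPC server's worker thread; this is NOT the
// real offload/plugins-nextgen type, just an illustration of the guard.
struct ServerThread {
  std::atomic<bool> Running{false};
  std::thread Worker;
  int StartCount = 0; // Counts calls to start(), for illustration only.

  void start() {
    ++StartCount;
    // Publish the flag before the worker begins polling it.
    Running.store(true, std::memory_order_release);
    Worker = std::thread([this] {
      // Idle loop; a real server would service RPC requests here.
      while (Running.load(std::memory_order_acquire))
        std::this_thread::yield();
    });
  }

  void shutDown() {
    Running.store(false, std::memory_order_release);
    if (Worker.joinable())
      Worker.join();
  }
};

// Assumed to run once per device on the same shared ServerThread.
// Only the first call actually starts the worker.
void setupDevice(ServerThread &Thread) {
  if (!Thread.Running.load(std::memory_order_acquire))
    Thread.start();
}
```

In this sketch, calling `setupDevice` twice on the same `ServerThread` starts the worker once. Without the check, the second `start()` would assign over a joinable `std::thread`, which calls `std::terminate`. The source does not say whether the real failure took that form.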


https://github.com/llvm/llvm-project/pull/125982

