[Openmp-commits] [openmp] [OpenMP][libomptarget] Enable parallel copies via multiple SDMA engines (PR #71801)

Jan Patrick Lehr via Openmp-commits openmp-commits at lists.llvm.org
Thu Nov 9 14:06:13 PST 2023


================
@@ -130,6 +130,38 @@ Error iterateAgentMemoryPools(hsa_agent_t Agent, CallbackTy Cb) {
                        "Error in hsa_amd_agent_iterate_memory_pools: %s");
 }
 
+/// Dispatches an asynchronous memory copy
+/// Enables different SDMA engines for the dispatch in a round-robin fashion.
+Error asyncMemCopy(bool UseMultipleSdmaEngines, void *Dst, hsa_agent_t DstAgent,
+                   const void *Src, hsa_agent_t SrcAgent, size_t Size,
+                   uint32_t NumDepSignals, const hsa_signal_t *DepSignals,
+                   hsa_signal_t CompletionSignal) {
+  if (UseMultipleSdmaEngines) {
+    hsa_status_t S =
+        hsa_amd_memory_async_copy(Dst, DstAgent, Src, SrcAgent, Size,
+                                  NumDepSignals, DepSignals, CompletionSignal);
+    return Plugin::check(S, "Error in hsa_amd_memory_async_copy");
+  }
+
+// This solution is probably not the best
+#if !(HSA_AMD_INTERFACE_VERSION_MAJOR >= 1 &&                                  \
+      HSA_AMD_INTERFACE_VERSION_MINOR >= 2)
+  return Plugin::error("Async copy on selected SDMA requires ROCm 5.7");
+#else
+  static int SdmaEngine = 1;
----------------
jplehr wrote:

That's a good point. In most cases this function is called from a region that is guarded by a `lock_guard`, so IIRC the static `SdmaEngine` is protected on those paths and two threads should not be in here at the same time.
But I still have to look into one case where this may race.
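
For illustration only, here is a minimal sketch of one way to make the round-robin selection safe without relying on the caller holding a lock. It is not the code in this PR: the helper name `nextSdmaEngineMask`, the ticket counter, and the assumption of two engines are all hypothetical. The idea is to replace the plain `static int` with a `std::atomic` counter, so that each caller gets a distinct ticket from a single atomic read-modify-write.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper, not part of the patch under review.
// Returns the next SDMA engine mask in round-robin order (0x1, 0x2, 0x1, ...).
// NumEngines = 2 is an assumption made for this sketch.
inline uint32_t nextSdmaEngineMask() {
  constexpr uint32_t NumEngines = 2;
  static std::atomic<uint32_t> Counter{0};
  // fetch_add is one atomic read-modify-write, so two concurrent callers
  // never receive the same ticket, even without an external lock.
  uint32_t Ticket = Counter.fetch_add(1, std::memory_order_relaxed);
  return 1u << (Ticket % NumEngines);
}
```

The returned mask would then be passed to whichever copy call selects the engine. Relaxed ordering should be enough here under the assumption that the counter only picks an engine and does not order any other memory accesses.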

https://github.com/llvm/llvm-project/pull/71801

