[Openmp-commits] [openmp] [OpenMP][libomptarget] Enable parallel copies via multiple SDMA engines (PR #71801)
Jan Patrick Lehr via Openmp-commits
openmp-commits at lists.llvm.org
Fri Nov 10 14:28:45 PST 2023
================
@@ -130,6 +130,38 @@ Error iterateAgentMemoryPools(hsa_agent_t Agent, CallbackTy Cb) {
"Error in hsa_amd_agent_iterate_memory_pools: %s");
}
+/// Dispatches an asynchronous memory copy
+/// Enables different SDMA engines for the dispatch in a round-robin fashion.
+Error asyncMemCopy(bool UseMultipleSdmaEngines, void *Dst, hsa_agent_t DstAgent,
+                   const void *Src, hsa_agent_t SrcAgent, size_t Size,
+                   uint32_t NumDepSignals, const hsa_signal_t *DepSignals,
+                   hsa_signal_t CompletionSignal) {
+  if (UseMultipleSdmaEngines) {
+    hsa_status_t S =
+        hsa_amd_memory_async_copy(Dst, DstAgent, Src, SrcAgent, Size,
+                                  NumDepSignals, DepSignals, CompletionSignal);
+    return Plugin::check(S, "Error in hsa_amd_memory_async_copy");
+  }
+
+// This solution is probably not the best
+#if !(HSA_AMD_INTERFACE_VERSION_MAJOR >= 1 && \
+      HSA_AMD_INTERFACE_VERSION_MINOR >= 2)
+  return Plugin::error("Async copy on selected SDMA requires ROCm 5.7");
+#else
+  static int SdmaEngine = 1;
----------------
jplehr wrote:
So, I looked into it, and we do have one code path that is not under a lock. I made this an atomic and do a read-modify-write (RMW).
With this solution we could still run into a scenario where two threads read the same value and dispatch to the same SDMA engine. While not desirable, it's not a correctness issue, and I think the probability is quite low.
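
To make the trade-off concrete, here is a minimal sketch of two ways such a round-robin counter could be advanced. It is an illustration under assumptions, not the code in the PR: the helper names (`nextEngineMask`, `pickEngineLoadStore`, `pickEngineCas`) are hypothetical, and it assumes exactly two engines identified by the one-bit masks 0x1 and 0x2. The first variant uses a separate atomic load and store, which allows the scenario described above where two threads read the same value. The second uses a compare-exchange loop, so every caller advances the counter exactly once.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: alternates between the assumed engine masks 0x1 and 0x2.
inline int nextEngineMask(int Current) {
  assert((Current == 1 || Current == 2) && "expected a one-bit engine mask");
  return Current == 1 ? 2 : 1;
}

// Variant A: separate atomic load and store. Each access is atomic, but two
// threads may load the same value before either stores, so both would pick
// the same engine. That costs overlap, not correctness.
inline int pickEngineLoadStore(std::atomic<int> &Engine) {
  int Local = Engine.load(std::memory_order_acquire);
  Engine.store(nextEngineMask(Local), std::memory_order_relaxed);
  return Local;
}

// Variant B: compare-exchange loop. The update only succeeds if no other
// thread changed the value since it was read, so each successful call
// advances the counter by exactly one step.
inline int pickEngineCas(std::atomic<int> &Engine) {
  int Local = Engine.load(std::memory_order_relaxed);
  while (!Engine.compare_exchange_weak(Local, nextEngineMask(Local),
                                       std::memory_order_relaxed)) {
    // On failure, Local now holds the current value; retry with it.
  }
  return Local;
}
```

Variant B removes the duplicate-read window at the cost of a retry loop under contention. Since picking the same engine twice is only a missed opportunity for overlap, the simpler variant may well be sufficient here.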
https://github.com/llvm/llvm-project/pull/71801