[Openmp-commits] [openmp] [amdgpu] Implement D2D memcpy as HSA call (PR #69955)

Jon Chesterfield via Openmp-commits openmp-commits at lists.llvm.org
Mon Oct 23 13:03:09 PDT 2023

@@ -3174,9 +3197,11 @@ void *AMDGPUDeviceTy::allocate(size_t Size, void *, TargetAllocTy Kind) {
     return nullptr;
-  if (Alloc && (Kind == TARGET_ALLOC_HOST || Kind == TARGET_ALLOC_SHARED)) {
+  if (Alloc) {
     auto &KernelAgents = Plugin::get<AMDGPUPluginTy>().getKernelAgents();
+    // Inherently necessary for host or shared allocations
+    // Also enabled for device memory to allow device to device memcpy
JonChesterfield wrote:

Unknown. I'd guess it's cheap if there's only one GPU in the system, and has non-zero cost if there are multiple GPUs that do not perform D2D copies. However, the alternative is to work backwards from the void* to the corresponding memory pool using maps, and that definitely has non-zero cost as well.
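
For illustration only, here is a rough sketch of what the "work backwards from the void*" alternative might look like. Everything in it is an assumption and not code from the plugin: `AllocationRegistry`, its methods, and the integer pool ids are hypothetical names. It shows where the cost comes from: every allocate and free has to update an ordered map, and every copy needs an O(log n) lookup. A real version would also need locking, which is omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical registry (not part of the plugin): remembers which pool each
// live allocation came from so an interior pointer can be mapped back to it.
class AllocationRegistry {
  struct Entry {
    std::size_t Size; // size of the allocation in bytes
    int Pool;         // hypothetical pool id standing in for a real pool handle
  };
  // Keyed by base address so the owning range can be found by ordering.
  std::map<std::uintptr_t, Entry> Allocations;

public:
  // Record an allocation; would be called on every allocate().
  void insert(const void *Base, std::size_t Size, int Pool) {
    assert(Size > 0 && "empty allocations are not tracked");
    Allocations[reinterpret_cast<std::uintptr_t>(Base)] = {Size, Pool};
  }

  // Forget an allocation; would be called on every free().
  void erase(const void *Base) {
    Allocations.erase(reinterpret_cast<std::uintptr_t>(Base));
  }

  // Return the pool id owning Ptr, or -1 if Ptr is not in a tracked range.
  int lookup(const void *Ptr) const {
    const auto Addr = reinterpret_cast<std::uintptr_t>(Ptr);
    // First entry whose base is strictly greater than Addr.
    auto It = Allocations.upper_bound(Addr);
    if (It == Allocations.begin())
      return -1; // every tracked base lies above Addr
    --It; // candidate: the last entry with base <= Addr
    if (Addr - It->first < It->second.Size)
      return It->second.Pool;
    return -1; // Addr falls in a gap after the candidate range
  }
};
```

With something like this, a D2D copy would call `lookup()` on the source and destination pointers to find their pools before granting access, instead of granting access up front at allocation time as the patch does.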

