[Openmp-commits] [PATCH] D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps

Jonas Hahnfeld via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Sat Jul 20 01:22:09 PDT 2019


Hahnfeld requested changes to this revision.
Hahnfeld added a comment.
This revision now requires changes to proceed.

Please add tests at least

- for the API, ie that `omp_target_alloc` returns distinct memory (I suggest to `omp_target_memcpy` from host to allocated memory to a different host memory and change the first host memory before copying back).
- that memory regions are indeed shared (in a `target data` region, the host should see updates from the host without an update and vice versa)



================
Comment at: libomptarget/plugins/cuda/src/rtl.cpp:590-599
+  // Any invocation of this method with a valid host pointer will
+  // result in the host version of this variable being used.
+  // If a host pointer is not present then the intent of this
+  // call was to allocate data on the device even if unified
+  // memory may have been activated.
+  if ((DeviceInfo.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY) &&
+      hst_ptr) {
----------------
Why do we need to handle this in the plugin? As I said in my previous comment, such things can be handled in the generic part of libomptarget.


================
Comment at: libomptarget/plugins/cuda/src/rtl.cpp:622-629
+  // No need to actually copy any data to device. The data is available
+  // on the host and can be accessed by the device via unified memory.
+  // When host and device pointers are equal it means that this is called
+  // internally by the runtime and not directly by the user via the API.
+  if (DeviceInfo.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY &&
+      tgt_ptr == hst_ptr)
+    return OFFLOAD_SUCCESS;
----------------
Likewise.


================
Comment at: libomptarget/plugins/cuda/src/rtl.cpp:651-656
+  // Act as if a copy to device was successful in the case of
+  // unified memory where only a host version of the data exists.
+  if (DeviceInfo.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY &&
+      hst_ptr == tgt_ptr)
+    return OFFLOAD_SUCCESS;
+
----------------
Likewise.


================
Comment at: libomptarget/src/api.cpp:118-123
+  // Under unified memory the host pointer can be returned by the
+  // getTgtPtrBegin() function which means that there is no device
+  // corresponding point for ptr. This function should return false
+  // in that situation.
+  if (Device.RTLRequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY)
+    rc = (TgtPtr != ptr);
----------------
The spec says `This routine returns non-zero if the specified pointer would be found present on device device_num by a map clause; otherwise, it returns zero.` Programs with `unified_shared_memory` could imply that all pointers are present on all devices, so I would have expected the routine to always return true. What's the reasoning for checking that the pointers differ?


================
Comment at: libomptarget/src/device.cpp:166
 
-  // Check if the pointer is contained.
-  if (lr.Flags.IsContained ||
-      ((lr.Flags.ExtendsBefore || lr.Flags.ExtendsAfter) && IsImplicit)) {
+  // If unified memory is active implicitly mapped variables that are not
+  // privatized, use host address. Any explicitely mapped variables also use
----------------
unified shared memory


================
Comment at: libomptarget/src/device.cpp:207-215
+      uintptr_t tp;
+      // If unified memory is active AND a use_device_ptr clause was used,
+      // we will force the allocation of a device variable. It means that the
+      // user really wants to have this variable replicate on the device.
+      if (RTLRequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY &&
+          IsInUseDevicePtrClause)
+        tp = (uintptr_t)RTL->data_alloc(RTLDeviceID, Size, NULL);
----------------
I don't understand this comment and behavior: `use_device_ptr` is a clause for `target data` and IIRC it means that the pointer is replaced by the device pointer in the data region. Why does this imply we have to allocate memory? I'd expect to just keep the host pointer because all memory of all devices is accessible from the host.

Can you maybe add a test for the expected behavior?


================
Comment at: libomptarget/src/device.cpp:208
+      uintptr_t tp;
+      // If unified memory is active AND a use_device_ptr clause was used,
+      // we will force the allocation of a device variable. It means that the
----------------
unified shared memory


================
Comment at: libomptarget/src/device.cpp:290-291
 int DeviceTy::deallocTgtPtr(void *HstPtrBegin, int64_t Size, bool ForceDelete) {
+  if (RTLRequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY)
+    return OFFLOAD_SUCCESS;
   // Check if the pointer is contained in any sub-nodes.
----------------
If we can still alloc memory in `getOrAllocTgtPtr`, then this is wrong.


Repository:
  rOMP OpenMP

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65001/new/

https://reviews.llvm.org/D65001





More information about the Openmp-commits mailing list