[Openmp-commits] [PATCH] D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps
Alexandre Eichenberger via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Mon Jul 29 14:36:38 PDT 2019
AlexEichenberger added inline comments.
Comment at: libomptarget/src/omptarget.cpp:247-249
+ // TODO: Check if this is correct
+ bool IsInUseDevicePtrClause = arg_types[i] & OMP_TGT_MAPTYPE_TARGET_PARAM &&
+ arg_types[i] & OMP_TGT_MAPTYPE_RETURN_PARAM;
> gtbercea wrote:
> > Hahnfeld wrote:
> > > gtbercea wrote:
> > > > Hahnfeld wrote:
> > > > > grokos wrote:
> > > > > > This is correct, with one little exception. Although the OpenMP standard does not mandate it, upstream clang supports `use_device_ptr` on pointers which are struct members. Because they are struct members, they are not marked with `TARGET_PARAM` (only the combined entry is considered a target parameter, not the individual members). On the other hand, they are marked with `PTR_AND_OBJ` and have some value in the `MEMBER_OF` bits.
> > > > > >
> > > > > > Once again, it's a non-standard extension so we are free to decide whether to support it or not in the unified shared memory scenario.
> > > > > Can we please first answer my question why we need to care about the existence of `use_device_ptr`? Why does it make a difference for unified shared memory?
> > > > Because use_device_ptr implies the use of a true device pointer and again that needs to be respected even unified memory is used.
> > > Where is that in the spec?
> > The pointers returned by omp_target_alloc and accessed through use_device_ptr are guaranteed to be pointer values that can support pointer arithmetic while still being native device pointers. (Section 2.4 page 61 bottom).
> With `unified_shared_memory` we get: (Section 2.4, page 62, lines 4:7)
> > Additionally, memory in the device data environment of any device visible to OpenMP, including but not limited to the host, is considered part of the device data environment of all devices accessible through OpenMP except as noted below. Every device address allocated through OpenMP device memory routines is a valid host pointer.
> From my understanding, this implies all shared memory can be referenced by a host pointer. This guarantees pointer arithmetic per the C / C++ standard.
Experience from our users show a significant performance degradation when every data is mapped to host memory under unified. Overwhelmingly, users want specific data to be allocated on device & mapped, while at the same time they want generic memory used by the device to be host memory (e.g. for linked list structures).
Recognizing this important use case, the specs has introduced the "close" modifier to the map for users to indicate that the data being mapped should be set "close" to the device, aka for us allocated on the device. Similarly, if the user went through the additional exercise of using target alloc / free, we want to respect this explicit request of the user by allocating the data on the device.
This is why, even under unified memory model, some data is still mapped with duplicated copies between host and device. Thus requiring the maps & use_device_ptr to be tested and enforced when data was actually mapped.
Fortunately for the overhead, we know that only a very small subset of all data accessed will be labeled with "close" or allocated with target_alloc, so the cost of performing this mapping will be very significantly reduced.
CHANGES SINCE LAST ACTION
More information about the Openmp-commits