[Openmp-commits] [PATCH] D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps
George Rokos via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Mon Jul 29 11:05:31 PDT 2019
grokos added inline comments.
================
Comment at: libomptarget/src/device.cpp:167
+ // If unified shared memory is active implicitly mapped variables that are not
+ // privatized, use host address. Any explicitely mapped variables also use
+ // host address where correctness is not impeded. In all other cases
----------------
explicitely --> explicitly
================
Comment at: libomptarget/src/device.cpp:174-176
+ (IsImplicit || !((lr.Flags.IsContained ||
+ lr.Flags.ExtendsBefore ||
+ lr.Flags.ExtendsAfter) && !Size))) {
----------------
I'm confused with this condition. What do we want to test?
The condition can be simplified to:
`IsImplicit || !lr.Flags.IsContained || Size`
because if Size==0, then `lookupMapping` cannot return `ExtendsBefore/After`, it may only return `IsContained`. So we want a mapping that's either
# implicit or
# explicit and it is either not contained or has Size>0.
The latter doesn't make sense to me... For instance, what if Size>0 and the mapping explicitly extends after? The condition will evaluate to true but this is invalid use no matter whether we use unified shared memory or not.
================
Comment at: libomptarget/src/device.cpp:160
void *DeviceTy::getOrAllocTgtPtr(void *HstPtrBegin, void *HstPtrBase,
- int64_t Size, bool &IsNew, bool IsImplicit, bool UpdateRefCount) {
+ int64_t Size, bool &IsNew, bool &IsHostPtr, bool IsImplicit,
+ bool UpdateRefCount, bool IsInUseDevicePtrClause) {
----------------
`IsHostPtr` is never initialized in this function, I think it's better to set it to false explicitly instead of relying on the assumption that the caller has set it to false. You can init it right after `rc`.
================
Comment at: libomptarget/src/device.cpp:166-167
- // Check if the pointer is contained.
- if (lr.Flags.IsContained ||
- ((lr.Flags.ExtendsBefore || lr.Flags.ExtendsAfter) && IsImplicit)) {
+ // If unified shared memory is active implicitly mapped variables that are not
+ // privatized, use host address. Any explicitely mapped variables also use
+ // host address where correctness is not impeded. In all other cases
----------------
This comma is confusing here, can you move it after "active"?
If unified shared memory is active, implicitly mapped variables that are not privatized use host address.
================
Comment at: libomptarget/src/device.cpp:178
auto &HT = *lr.Entry;
- IsNew = false;
-
- if (UpdateRefCount)
- ++HT.RefCount;
-
- uintptr_t tp = HT.TgtPtrBegin + ((uintptr_t)HstPtrBegin - HT.HstPtrBegin);
- DP("Mapping exists%s with HstPtrBegin=" DPxMOD ", TgtPtrBegin=" DPxMOD ", "
- "Size=%ld,%s RefCount=%s\n", (IsImplicit ? " (implicit)" : ""),
- DPxPTR(HstPtrBegin), DPxPTR(tp), Size,
- (UpdateRefCount ? " updated" : ""),
- (CONSIDERED_INF(HT.RefCount)) ? "INF" :
- std::to_string(HT.RefCount).c_str());
- rc = (void *)tp;
- } else if ((lr.Flags.ExtendsBefore || lr.Flags.ExtendsAfter) && !IsImplicit) {
- // Explicit extension of mapped data - not allowed.
- DP("Explicit extension of mapping is not allowed.\n");
- } else if (Size) {
- // If it is not contained and Size > 0 we should create a new entry for it.
- IsNew = true;
- uintptr_t tp = (uintptr_t)RTL->data_alloc(RTLDeviceID, Size, HstPtrBegin);
- DP("Creating new map entry: HstBase=" DPxMOD ", HstBegin=" DPxMOD ", "
- "HstEnd=" DPxMOD ", TgtBegin=" DPxMOD "\n", DPxPTR(HstPtrBase),
- DPxPTR(HstPtrBegin), DPxPTR((uintptr_t)HstPtrBegin + Size), DPxPTR(tp));
- HostDataToTargetMap.push_front(HostDataToTargetTy((uintptr_t)HstPtrBase,
- (uintptr_t)HstPtrBegin, (uintptr_t)HstPtrBegin + Size, tp));
+ uintptr_t tp = HT.HstPtrBegin + ((uintptr_t)HstPtrBegin - HT.HstPtrBegin);
+ DP("Return HstPtrBegin " DPxMOD " Size=%ld RefCount=%s\n", DPxPTR(tp),
----------------
This evaluates to just `HstPtrBegin`, the two `HT.HstPtrBegin` cancel out, so you can skip defining `tp` altogether.
================
Comment at: libomptarget/src/device.cpp:237
LookupResult lr = lookupMapping(HstPtrBegin, Size);
-
- if (lr.Flags.IsContained || lr.Flags.ExtendsBefore || lr.Flags.ExtendsAfter) {
+ IsHostPtr = false;
+
----------------
This init can be moved outside the locked-mutex region.
================
Comment at: libomptarget/src/device.cpp:248
+ IsLast = false;
+ uintptr_t tp = HT.HstPtrBegin + ((uintptr_t)HstPtrBegin - HT.HstPtrBegin);
+ DP("Get HstPtrBegin " DPxMOD " Size=%ld RefCount=%s\n", DPxPTR(tp),
----------------
Same here, just use `HstPtrBegin` instead of `tp`.
================
Comment at: libomptarget/src/omptarget.cpp:247-249
+ // TODO: Check if this is correct
+ bool IsInUseDevicePtrClause = arg_types[i] & OMP_TGT_MAPTYPE_TARGET_PARAM &&
+ arg_types[i] & OMP_TGT_MAPTYPE_RETURN_PARAM;
----------------
This is correct, with one little exception. Although the OpenMP standard does not mandate it, upstream clang supports `use_device_ptr` on pointers which are struct members. Because they are struct members, they are not marked with `TARGET_PARAM` (only the combined entry is considered a target parameter, not the individual members). On the other hand, they are marked with `PTR_AND_OBJ` and have some value in the `MEMBER_OF` bits.
Once again, it's a non-standard extension so we are free to decide whether to support it or not in the unified shared memory scenario.
================
Comment at: libomptarget/src/omptarget.cpp:259
// base is address of pointer.
+ // TODO: Check if IsInUseDevicePtrClause needs to ne passed.
Pointer_TgtPtrBegin = Device.getOrAllocTgtPtr(HstPtrBase, HstPtrBase,
----------------
ne --> be
================
Comment at: libomptarget/src/omptarget.cpp:379-381
+ // TODO: Check if this is correct
+ bool IsInUseDevicePtrClause = arg_types[i] & OMP_TGT_MAPTYPE_TARGET_PARAM &&
+ arg_types[i] & OMP_TGT_MAPTYPE_RETURN_PARAM;
----------------
Same here (if we decide to support the struct member case).
================
Comment at: libomptarget/src/omptarget.cpp:507
+ // Act as if a copy to device was successful in the case of
+ // unified memory where only a host version of the data exists.
+ if (Device.RTLRequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY &&
----------------
unified shared memory
================
Comment at: libomptarget/src/omptarget.cpp:541
+ // No need to actually copy any data to device. The data is available
+ // on the host and can be accessed by the device via unified memory.
+ // When host and device pointers are equal it means that this is called
----------------
unified shared memory
================
Comment at: libomptarget/src/omptarget.cpp:553-578
uintptr_t lb = (uintptr_t) HstPtrBegin;
uintptr_t ub = (uintptr_t) HstPtrBegin + MapSize;
Device.ShadowMtx.lock();
for (ShadowPtrListTy::iterator it = Device.ShadowPtrMap.begin();
it != Device.ShadowPtrMap.end(); ++it) {
void **ShadowHstPtrAddr = (void**) it->first;
if ((uintptr_t) ShadowHstPtrAddr < lb)
----------------
I think we can skip the whole shadow pointer loop when we use unified shared memory.
================
Comment at: libomptarget/test/offloading/requires_unified_shared_memory_local.c:25-26
+
+ host_data = (long long) &data[0];
+ host_alloc = (long long) &alloc[0];
+
----------------
Hahnfeld wrote:
> gtbercea wrote:
> > Hahnfeld wrote:
> > > Why are the pointers cast to `long long`? Can we just compare them as `void *`?
> > Is there a problem I'm missing if they are cast to long long?
> Yes, casting to `long long` might truncate, or at least I'm not aware that it guarantees to be of the same size as pointers.
You can cast them to `uintptr_t`, it is a datatype guaranteed to be long enough to represent a pointer.
Repository:
rOMP OpenMP
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D65001/new/
https://reviews.llvm.org/D65001
More information about the Openmp-commits
mailing list