[Parallel_libs-commits] [PATCH] D24353: [SE] RegisteredHostMemory for async device copies

Jason Henline via Parallel_libs-commits parallel_libs-commits at lists.llvm.org
Fri Sep 9 17:20:59 PDT 2016


jhen added inline comments.

================
Comment at: streamexecutor/include/streamexecutor/HostMemory.h:45
@@ +44,3 @@
+      : MutableArrayRef(const_cast<ElemT *>(Registered.getPointer()),
+                        Registered.getElementCount()) {}
+
----------------
jlebar wrote:
> jlebar wrote:
> > I'm not sure that a const_cast is the right thing to do.  Instead, could we have two constructors, one which takes a const RHM<ElemT> and one which takes a non-const RHM<ElemT>?
> > 
> > Also, how does this work when ElemT = "const int" and we get an RHM<ElemT>?  I would expect you need some const-decay here.
> If we go with this approach, we should make sure we can convert from RHMSlice<const T> to RHMSlice<T>, which I don't think works atm.
Switching back to mutable array.

================
Comment at: streamexecutor/include/streamexecutor/HostMemory.h:48
@@ +47,3 @@
+  ElemT *getPointer() { return MutableArrayRef.data(); }
+  const ElemT *getPointer() const { return MutableArrayRef.data(); }
+  size_t getElementCount() const { return MutableArrayRef.size(); }
----------------
jlebar wrote:
> Hmmm...
> 
> In a vacuum, I think we'd want just one overload,
> 
>   ElemT *getPointer() const;
> 
> This matches what C++17's std::slice (nee array_view) does.  And it sort of makes sense: We already have constness in the template type to determine whether or not we can modify the underlying ElemTs.
> 
> (By the way, if we stick with this approach, we should explain in a comment about putting const T in the template param.)
> 
> But I see that we have the two overloads on RHM, and there they make more sense.  So...I am not sure what we should do here.
> 
> One option is we could also make RHM unconditionally give out non-const pointers (and mutable slices).  That would make it behave like unique_ptr -- if I have a const unique_ptr<int[]>, the get() function still gives me a non-const pointer back.
> 
> I guess one argument against merging the slice classes is that it seems like we're having difficulty keeping them straight within Stream.h.  So I think I'm OK splitting them back out for now.  It shouldn't complicate the Stream interface any.
Switching back to mutable array.

================
Comment at: streamexecutor/include/streamexecutor/HostMemory.h:99
@@ +98,3 @@
+                                   "RegisteredHostMemoryBase with a null "
+                                   "platform device");
+  }
----------------
jlebar wrote:
> Nit, clang-format is being conservative and not re-merging strings, but this should probably be reflowed.
Actually, clang-format really seems to like this. It won't seem to break after the `&&` even if I manually combine the string literals into one long one. So I'll just let clang-format have its way here.

================
Comment at: streamexecutor/include/streamexecutor/Stream.h:172
@@ +171,3 @@
+  template <typename SrcTy, typename DstTy>
+  Stream &thenCopyD2H(SrcTy &&Src, DstTy &&Dst, size_t ElementCount) {
+    using SrcElemTy = typename std::remove_reference<SrcTy>::type::ElementTy;
----------------
jlebar wrote:
> Is it worth merging this with the overload that explicitly takes slices, or do you think that would be too confusing?
Yes, I did it now. I don't know why I didn't before.

================
Comment at: streamexecutor/include/streamexecutor/Stream.h:178
@@ +177,3 @@
+    GlobalDeviceMemorySlice<SrcElemTy> SrcSlice(Src);
+    RegisteredHostMemorySlice<DstElemTy> DstSlice(Dst);
+    return thenCopyD2H(SrcSlice, DstSlice, ElementCount);
----------------
jlebar wrote:
> There should be const in some of these?
Using mutable array now.


https://reviews.llvm.org/D24353





More information about the Parallel_libs-commits mailing list