[Parallel_libs-commits] [PATCH] D24353: [SE] RegisteredHostMemory for async device copies
Jason Henline via Parallel_libs-commits
parallel_libs-commits at lists.llvm.org
Fri Sep 9 17:20:59 PDT 2016
jhen added inline comments.
================
Comment at: streamexecutor/include/streamexecutor/HostMemory.h:45
@@ +44,3 @@
+ : MutableArrayRef(const_cast<ElemT *>(Registered.getPointer()),
+ Registered.getElementCount()) {}
+
----------------
jlebar wrote:
> jlebar wrote:
> > I'm not sure that a const_cast is the right thing to do. Instead, could we have two constructors, one which takes a const RHM<ElemT> and one which takes a non-const RHM<ElemT>?
> >
> > Also, how does this work when ElemT = "const int" and we get an RHM<ElemT>? I would expect you need some const-decay here.
> If we go with this approach, we should make sure we can convert from RHMSlice<const T> to RHMSlice<T>, which I don't think works atm.
Switching back to mutable array.
================
Comment at: streamexecutor/include/streamexecutor/HostMemory.h:48
@@ +47,3 @@
+ ElemT *getPointer() { return MutableArrayRef.data(); }
+ const ElemT *getPointer() const { return MutableArrayRef.data(); }
+ size_t getElementCount() const { return MutableArrayRef.size(); }
----------------
jlebar wrote:
> Hmmm...
>
> In a vacuum, I think we'd want just one overload,
>
> ElemT *getPointer() const;
>
> This matches what C++17's std::slice (nee array_view) does. And it sort of makes sense: We already have constness in the template type to determine whether or not we can modify the underlying ElemTs.
>
> (By the way, if we stick with this approach, we should explain in a comment about putting const T in the template param.)
>
> But I see that we have the two overloads on RHM, and there they make more sense. So...I am not sure what we should do here.
>
> One option is we could also make RHM unconditionally give out non-const pointers (and mutable slices). That would make it behave like unique_ptr -- if I have a const unique_ptr<int[]>, the get() function still gives me a non-const pointer back.
>
> I guess one argument against merging the slice classes is that it seems like we're having difficulty keeping them straight within Stream.h. So I think I'm OK splitting them back out for now. It shouldn't complicate the Stream interface any.
Switching back to mutable array.
================
Comment at: streamexecutor/include/streamexecutor/HostMemory.h:99
@@ +98,3 @@
+ "RegisteredHostMemoryBase with a null "
+ "platform device");
+ }
----------------
jlebar wrote:
> Nit, clang-format is being conservative and not re-merging strings, but this should probably be reflowed.
Actually, clang-format really seems to like this. It won't seem to break after the `&&` even if I manually combine the string literals into one long one. So I'll just let clang-format have its way here.
================
Comment at: streamexecutor/include/streamexecutor/Stream.h:172
@@ +171,3 @@
+ template <typename SrcTy, typename DstTy>
+ Stream &thenCopyD2H(SrcTy &&Src, DstTy &&Dst, size_t ElementCount) {
+ using SrcElemTy = typename std::remove_reference<SrcTy>::type::ElementTy;
----------------
jlebar wrote:
> Is it worth merging this with the overload that explicitly takes slices, or do you think that would be too confusing?
Yes, I did it now. I don't know why I didn't before.
================
Comment at: streamexecutor/include/streamexecutor/Stream.h:178
@@ +177,3 @@
+ GlobalDeviceMemorySlice<SrcElemTy> SrcSlice(Src);
+ RegisteredHostMemorySlice<DstElemTy> DstSlice(Dst);
+ return thenCopyD2H(SrcSlice, DstSlice, ElementCount);
----------------
jlebar wrote:
> There should be const in some of these?
Using mutable array now.
https://reviews.llvm.org/D24353
More information about the Parallel_libs-commits
mailing list