[Parallel_libs-commits] [PATCH] D23577: [StreamExecutor] Executor add synchronous methods
Justin Lebar via Parallel_libs-commits
parallel_libs-commits at lists.llvm.org
Fri Aug 19 13:26:25 PDT 2016
jlebar added a comment.
I think this is probably good, modulo a few relatively minor things.
================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:117
@@ +116,3 @@
+
+ /// Gets the base memory for this slice.
+ GlobalDeviceMemory<ElemT> getBaseMemory() const { return BaseMemory; }
----------------
Nit, maybe we can reword this so it doesn't just repeat the function's name.
================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:120
@@ +119,3 @@
+
+ /// Gets the element offset of this slice from the base memory.
+ size_t getOffset() const { return ElementOffset; }
----------------
This is kind of confusing to me, I think because "element offset from base memory" isn't so idiomatic. Maybe we can reword this?
================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:148
@@ +147,3 @@
+ BaseMemory, ElementOffset + DropCount,
+ std::min(ElementCount - ElementOffset - DropCount, TakeCount));
+ }
----------------
Why do we do std::min here instead of asserting?
================
Comment at: streamexecutor/include/streamexecutor/Executor.h:100
@@ +99,3 @@
+ Src.getBaseMemory(), Src.getOffset() * sizeof(T), Dst.data(), 0,
+ Src.getCount() * sizeof(T));
+ }
----------------
Ohhh. *This* is why they're called getElementOffset and getElementCount.
I guess it's not so bad if we don't have any data structure that exposes the byte offset / byte count. But even if not I find this a kind of compelling reason for those names.
What do you think?
================
Comment at: streamexecutor/include/streamexecutor/Executor.h:122
@@ +121,3 @@
+ Src.getOffset() * sizeof(T), Dst, 0,
+ Src.getCount() * sizeof(T));
+ }
----------------
Maybe we can just create an arrayref and call one of the other overloads?
================
Comment at: streamexecutor/include/streamexecutor/Executor.h:139
@@ +138,3 @@
+ return make_error(
+ "copyint too many elements, " + llvm::Twine(ElementCount) +
+ ", to a host array of element count " + llvm::Twine(Dst.size()));
----------------
typo
================
Comment at: streamexecutor/include/streamexecutor/Executor.h:141
@@ +140,3 @@
+ ", to a host array of element count " + llvm::Twine(Dst.size()));
+ return synchronousCopyD2H(Src, Dst.data(), ElementCount);
+ }
----------------
I might actually suggest making this function the base overload; it seems like the most strongly-typed and general?
================
Comment at: streamexecutor/include/streamexecutor/Executor.h:150
@@ -46,2 +149,2 @@
#endif // STREAMEXECUTOR_EXECUTOR_H
----------------
> Should I add back the overloads in order to get around this problem?
Argh. That's the common case, so I guess, yes. I guess you only need it for the two-arg overloads and not the three-arg overload. :-/
https://reviews.llvm.org/D23577
More information about the Parallel_libs-commits
mailing list