[Parallel_libs-commits] [PATCH] D23577: [StreamExecutor] Executor add synchronous methods

Jason Henline via Parallel_libs-commits parallel_libs-commits at lists.llvm.org
Tue Aug 16 14:56:27 PDT 2016


jhen added a comment.

In https://reviews.llvm.org/D23577#517138, @jprice wrote:

> General question: if we have variants of these `memcpy` methods that take `ElementCount` parameters to allow for partial copies to/from the device allocations, should we also have variants with an `Offset` parameter as well to allow for partial copies that don't start at the origin?


Yes, I think allowing an offset argument is a good idea. I made that change to the `Executor` and `Stream` memcpy methods


================
Comment at: streamexecutor/include/streamexecutor/Executor.h:48
@@ +47,3 @@
+  /// Frees memory previously allocated with allocateDeviceMemory.
+  template <typename T> Error freeDeviceMemory(GlobalDeviceMemory<T> *Memory) {
+    return PExecutor->freeDeviceMemory(Memory);
----------------
jlebar wrote:
> Should we take this by reference?
The internal SE code uses the convention that a `DeviceMemory<T>*` is passed wherever a `T*` would be passed for host data and a `const DeviceMemory<T>&` would be passed wherever a `const T*` would be used for host data.

Do you think it would be better to pass `DeviceMemory<T>&` for `T*` and `const DeviceMemory<T>&` for `const T*`, or something like that?

================
Comment at: streamexecutor/include/streamexecutor/Executor.h:71
@@ +70,3 @@
+  /// Stream::thenMemcpyH2D.
+  template <typename T> Error registerHostMemory(T *Memory) {
+    return PExecutor->registerHostMemory(Memory);
----------------
jprice wrote:
> If this is going to be backing onto something like `cuMemHostRegister`, don't we need the size of the allocation as well?
You're totally right. Thanks for catching that.

================
Comment at: streamexecutor/include/streamexecutor/Executor.h:110
@@ +109,3 @@
+                                DeviceSrc.getElementCount());
+  }
+
----------------
jlebar wrote:
> I wonder if we should assert that the two arrays' sizes are the same, in this case.  Same for the H2D function.
To me that feels too restrictive. Without that check do you think it would be better to rename the method? Maybe synchronousMemcpyWholeSrc or something. I like it how it is now, but I don't want the interface to be confusing.


https://reviews.llvm.org/D23577





More information about the Parallel_libs-commits mailing list