[Parallel_libs-commits] [PATCH] D23577: [StreamExecutor] Executor add synchronous methods

Justin Lebar via Parallel_libs-commits parallel_libs-commits at lists.llvm.org
Thu Aug 18 13:44:39 PDT 2016


jlebar added inline comments.

================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:101
@@ +100,3 @@
+/// element count for the size of the slice.
+template <typename ElemT> class DeviceArraySlice {
+public:
----------------
If this must wrap specifically a GlobalDeviceMemory object, we should probably have those words in the name.  Maybe "GlobalDeviceMemorySlice"?  Or is that too long?

It also seems kind of weird to me to focus on "arrays" in the name, function names, and comments, since it's no more or less as much of an array as GlobalDeviceMemory, but we don't name/describe it in the same way.

================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:106
@@ +105,3 @@
+      : BaseArray(BaseArray), ElementOffset(ElementOffset),
+        ElementCount(ElementCount) {}
+
----------------
We should assert that these are in range.

================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:110
@@ +109,3 @@
+  size_t getElementOffset() const { return ElementOffset; }
+  size_t getElementCount() const { return ElementCount; }
+
----------------
I think getOffset() and getCount() might be better names, but we should definitely be consistent with the names in GlobalDeviceMemory, as you've done here.

================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:116
@@ +115,3 @@
+        BaseArray, ElementOffset + DropCount,
+        (DropCount > ElementCount) ? 0 : ElementCount - DropCount);
+  }
----------------
We should think about what exactly we want to be an error in these functions.  I'm not sure that silently accepting out-of-range values makes sense.

================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:131
@@ +130,3 @@
+                                   std::min(ElementCount, TakeCount));
+  }
+
----------------
Hm...  Maybe ArrayRef's two-arg slice() function is more useful than take_first().  take_first(n) then just becomes slice(0, n), which is fairly clear I think?  And I think the common operation will likely be "starting at offset k, take n elements", which it's not trivial to do using the functions here.

I don't think we need both the one-arg slice() and drop_front, as they're equivalent.

Sorry if I mislead you on this one.  In that example, I meant just to illustrate how we'd chain calls together, not to suggest specifically the API.  But I don't think I was so clear about that.

================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:192
@@ +191,3 @@
+                                   std::min(getElementCount(), TakeCount));
+  }
+
----------------
We could avoid having these functions entirely by having an asSlice() function, but I'm not sure that's better than this API.  However, it would be nice not to reimplement these tricky functions.  Could we just do e.g.

  return DeviceArraySlice<ElemT>(*this).drop_front(DropCount);

?

================
Comment at: streamexecutor/include/streamexecutor/Executor.h:156
@@ +155,3 @@
+    return synchronousCopyD2H(Src.drop_front(0), Dst);
+  }
+
----------------
My thought is that we'd have an implicit one-arg constructor in DeviceArraySlice<T> so we wouldn't need this overload.

  DeviceArraySlice<T>::DeviceArraySlice<T>(const GlobalDeviceMemory<T>& Mem);

(In the same way, llvm::ArrayRef has an implicit one-arg constructor that accepts an std::vector<T>.)


https://reviews.llvm.org/D23577





More information about the Parallel_libs-commits mailing list