[Parallel_libs-commits] [PATCH] D23577: [StreamExecutor] Executor add synchronous methods

Justin Lebar via Parallel_libs-commits parallel_libs-commits at lists.llvm.org
Fri Aug 19 13:26:25 PDT 2016


jlebar added a comment.

I think this is probably good, modulo a few relatively minor things.


================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:117
@@ +116,3 @@
+
+  /// Gets the base memory for this slice.
+  GlobalDeviceMemory<ElemT> getBaseMemory() const { return BaseMemory; }
----------------
Nit, maybe we can reword this so it doesn't just repeat the function's name.

================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:120
@@ +119,3 @@
+
+  /// Gets the element offset of this slice from the base memory.
+  size_t getOffset() const { return ElementOffset; }
----------------
This is kind of confusing to me, I think because "element offset from base memory" isn't so idiomatic.  Maybe we can reword this?

================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:148
@@ +147,3 @@
+        BaseMemory, ElementOffset + DropCount,
+        std::min(ElementCount - ElementOffset - DropCount, TakeCount));
+  }
----------------
Why do we do std::min here instead of asserting?

================
Comment at: streamexecutor/include/streamexecutor/Executor.h:100
@@ +99,3 @@
+        Src.getBaseMemory(), Src.getOffset() * sizeof(T), Dst.data(), 0,
+        Src.getCount() * sizeof(T));
+  }
----------------
Ohhh.  *This* is why they're called getElementOffset and getElementCount.

I guess it's not so bad if we don't have any data structure that exposes the byte offset / byte count.  But even if not I find this a kind of compelling reason for those names.

What do you think?

================
Comment at: streamexecutor/include/streamexecutor/Executor.h:122
@@ +121,3 @@
+                                         Src.getOffset() * sizeof(T), Dst, 0,
+                                         Src.getCount() * sizeof(T));
+  }
----------------
Maybe we can just create an arrayref and call one of the other overloads?

================
Comment at: streamexecutor/include/streamexecutor/Executor.h:139
@@ +138,3 @@
+      return make_error(
+          "copyint too many elements, " + llvm::Twine(ElementCount) +
+          ", to a host array of element count " + llvm::Twine(Dst.size()));
----------------
typo

================
Comment at: streamexecutor/include/streamexecutor/Executor.h:141
@@ +140,3 @@
+          ", to a host array of element count " + llvm::Twine(Dst.size()));
+    return synchronousCopyD2H(Src, Dst.data(), ElementCount);
+  }
----------------
I might actually suggest making this function the base overload; it seems like the most strongly-typed and general?

================
Comment at: streamexecutor/include/streamexecutor/Executor.h:150
@@ -46,2 +149,2 @@
 
 #endif // STREAMEXECUTOR_EXECUTOR_H
----------------
> Should I add back the overloads in order to get around this problem?

Argh.  That's the common case, so I guess, yes.  I guess you only need it for the two-arg overloads and not the three-arg overload.  :-/


https://reviews.llvm.org/D23577





More information about the Parallel_libs-commits mailing list