[Parallel_libs-commits] [PATCH] D23577: [StreamExecutor] Executor add synchronous methods

Justin Lebar via Parallel_libs-commits parallel_libs-commits at lists.llvm.org
Wed Aug 17 15:28:26 PDT 2016


jlebar added a comment.

I'm not crazy about how verbose simple operations become with the current
revision.

Just trying to play with this API:

  se::GlobalDeviceMemory<int> DMem;
  int *HPtr;
  MutableArrayRef<int> HArrRef;
  
  // These two operations seem like they're going to be the most common:
  
  // Copy everything, assert sizes equal.
  E.syncCopyD2H(DMem, HArrRef);
  
  // Up to user to ensure that HPtr is big enough.  (We could leave off "length"
  // and copy everything, but since we're passing a raw pointer here, seems like
  // we should be explicit about the length.)
  E.syncCopyD2H(DMem, HPtr, length);
  
  // These two together suggest that we should have a third overload:
  
  // Copy length elements, assert sizes are less than or equal to length.
  E.syncCopyD2H(DMem, HArrRef, length);

These three are probably sufficient for everything other than copying starting
at a nonzero offset within DMem?  I guess that's where makeMutableArrayRef
comes in.  But since this is going to be an uncommon operation, I don't think
we want to uglify all of the other call sites to make this work.

What if we called our class GlobalDeviceMemorySlice<T>.  Not calling it
"ArrayRef" would leave us free to modify its API somewhat:

- We could have an implicit constructor from GlobalDeviceMemory<T>, so we don't have to explicitly call makeMutableArrayRef.

- We wouldn't need mutable and non-mutable versions of this; we could just use const to differentiate.

- If it makes sense, we could even make it so that the way to construct a GlobalDeviceMemorySlice from a GlobalDeviceMemory is by calling functions on GlobalDeviceMemory.  This would resolve the ambiguity around offset vs. length.  Here's a not well-thought-through API:

  GlobalDeviceMemory<int> DMem; E.syncCopyD2H(DMem.drop_front(5).drop_back(42).take_first(11));

I suspect that simply copying slice(), drop_front(), and drop_back() from
ArrayRef would be pretty good.


https://reviews.llvm.org/D23577





More information about the Parallel_libs-commits mailing list