[Parallel_libs-commits] [PATCH] D24528: [SE] Pack global dev handle addresses

Tue Sep 13 17:01:07 PDT 2016

jhen added inline comments.

================
Comment at: streamexecutor/include/streamexecutor/DeviceMemory.h:170
@@ -166,3 +169,3 @@
   Device *TheDevice;  // Pointer to the device on which this memory lives.
   const void *Handle; // Platform-dependent value representing allocated memory.
   size_t ByteCount;   // Size in bytes of this allocation.
----------------
jlebar wrote:
> Clearly in the multithreaded case you can't modify a GlobalDeviceMemoryBase concurrently with a call to thenLaunch().  That's true of any parameters to any function call, so I am not sure that's even worth warning about.
> 
> What I was worried about was a single-threaded case, something like this:
> 
>   GlobalDeviceMemory<int> other_mem;
>   {  
>     GlobalDeviceMemory<int> mem;
>     Stream->thenLaunch(foo_kernel, mem);
>     other_mem = std::move(mem);
>   }
>   Stream->block();
> 
> Is this safe?  That is, do we use &mem.Handle only within thenLaunch?  (We don't actually have to have a separate scope and so on for us to hit this same problem.)
> 
> (I have to admit, if any of this is 5% of some applications' runtime, it seems like we could do a lot better even than what we have here.  I'm not sure how, but hot code is hot...)
Oh, that's right. I think we spoke about this before. There should be no problem with your example, the platform should be responsible for copying the arguments (using memcpy) before returning from a launch call. This will match the guarantee currently provided by the CUDA driver library.

I removed the unnecessary warning.


https://reviews.llvm.org/D24528