[Parallel_libs-commits] [PATCH] D24353: [SE] RegisteredHostMemory for async device copies

Jason Henline via Parallel_libs-commits parallel_libs-commits at lists.llvm.org
Thu Sep 8 10:27:50 PDT 2016


jhen created this revision.
jhen added a reviewer: jlebar.
jhen added subscribers: parallel_libs-commits, jprice.
Herald added a subscriber: beanz.

Improve the error-prone interface that lets users pass unregistered
host pointers to asynchronous copy methods. In CUDA, this is an
extremely easy mistake to make: instead of failing at runtime, the call
succeeds and gives the right answers by silently turning the async copy
into a sync copy. Misusing the old interface therefore causes a huge
performance degradation with no error or warning. The new interface
should prevent that.
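
The idea can be illustrated with a minimal sketch, which is not the code
from this patch. It assumes a hypothetical handle type that can only be
obtained by going through a registration step, and an enqueue function
that accepts only that handle, so passing a raw host pointer fails at
compile time. All names below (RegisteredHostSpan, FakeDeviceBuffer,
enqueueCopyH2D, demoCopy) are invented for illustration, and the
"device" is simulated with host storage.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical handle for host memory that has been registered (pinned).
// The constructor is private, so the only way to get one is registerSpan().
class RegisteredHostSpan {
public:
  // Stand-in for a real registration call; a CUDA backend would pin the
  // pages here (assumption: e.g. via cudaHostRegister).
  static RegisteredHostSpan registerSpan(float *Ptr, std::size_t Count) {
    return RegisteredHostSpan(Ptr, Count);
  }

  const float *data() const { return Ptr; }
  std::size_t size() const { return Count; }

private:
  RegisteredHostSpan(float *P, std::size_t N) : Ptr(P), Count(N) {}
  float *Ptr;
  std::size_t Count;
};

// Hypothetical device buffer, simulated with ordinary host storage.
struct FakeDeviceBuffer {
  std::vector<float> Storage;
};

// The "async" copy takes only registered memory. A raw float* does not
// convert to RegisteredHostSpan, so misuse is a compile error rather than
// a silent fallback to a synchronous copy.
void enqueueCopyH2D(const RegisteredHostSpan &Src, FakeDeviceBuffer &Dst) {
  Dst.Storage.assign(Src.data(), Src.data() + Src.size());
}

// Small usage example: register, copy, and read back the last element.
float demoCopy() {
  std::vector<float> Host = {1.0f, 2.0f, 3.0f};
  RegisteredHostSpan Registered =
      RegisteredHostSpan::registerSpan(Host.data(), Host.size());
  FakeDeviceBuffer Device;
  enqueueCopyH2D(Registered, Device);
  // enqueueCopyH2D(Host.data(), Device);  // would not compile
  return Device.Storage[2];
}
```

The design choice in this sketch is to move the check from runtime to
the type system: the slow path cannot be reached by accident because
there is no overload that accepts an unregistered pointer.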

https://reviews.llvm.org/D24353

Files:
  streamexecutor/examples/CUDASaxpy.cpp
  streamexecutor/include/streamexecutor/Device.h
  streamexecutor/include/streamexecutor/HostMemory.h
  streamexecutor/include/streamexecutor/PlatformDevice.h
  streamexecutor/include/streamexecutor/Stream.h
  streamexecutor/lib/CMakeLists.txt
  streamexecutor/lib/HostMemory.cpp
  streamexecutor/lib/unittests/DeviceTest.cpp
  streamexecutor/lib/unittests/SimpleHostPlatformDevice.h
  streamexecutor/lib/unittests/StreamTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D24353.70720.patch
Type: text/x-patch
Size: 33577 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/parallel_libs-commits/attachments/20160908/11931018/attachment-0001.bin>

