[Parallel_libs-commits] [PATCH] D24353: [SE] RegisteredHostMemory for async device copies
Jason Henline via Parallel_libs-commits
parallel_libs-commits at lists.llvm.org
Thu Sep 8 12:05:11 PDT 2016
jhen added a comment.
After an offline chat with jlebar, we decided to get rid of the `allocateHostMemory` functions in the device interface. We thought it was confusing that the first patch in this review made a `RegisteredHostMemory` object that only sometimes owned its underlying pointer.
By removing `allocateHostMemory`, the object never has to own its underlying pointer. Possible concerns with removing this function:
- The interface may be slightly more annoying for users because they have to allocate their own memory.
- Performance may be worse if users have to make two calls instead of just one to get registered host memory.
- Memory that users allocate themselves may not have optimal alignment for the registration process.
If any of these problems are later shown to be a bigger issue, we can re-introduce the host allocation functions with a different interface than the proposal from the first patch.
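To make the non-owning model concrete, here is a minimal sketch of what user code might look like once `allocateHostMemory` is gone. It is not the code from this patch: the registration counter, the constructor signature, and `demoUserOwnedBuffer` are all hypothetical stand-ins, and the real platform registration call is replaced by a counter so the example runs anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the platform's registration bookkeeping.
// The real implementation would call into the device platform instead.
static int &liveRegistrations() {
  static int Count = 0;
  return Count;
}

// Sketch of a non-owning handle: it un-registers the memory on destruction
// but never frees it. The constructor signature is an assumption.
class RegisteredHostMemory {
public:
  RegisteredHostMemory(void *Pointer, std::size_t ByteCount)
      : Pointer(Pointer), ByteCount(ByteCount) {
    ++liveRegistrations(); // "register" the user's memory
  }
  RegisteredHostMemory(const RegisteredHostMemory &) = delete;
  RegisteredHostMemory &operator=(const RegisteredHostMemory &) = delete;
  ~RegisteredHostMemory() { --liveRegistrations(); } // "un-register" only

  void *data() const { return Pointer; }
  std::size_t byteCount() const { return ByteCount; }

private:
  void *Pointer;
  std::size_t ByteCount;
};

// Returns true if the buffer was registered inside the inner scope and
// un-registered afterwards, while the user's vector stayed alive throughout.
bool demoUserOwnedBuffer() {
  std::vector<float> Host(1024); // call 1: the user allocates the memory
  bool WasRegistered = false;
  {
    // call 2: the user registers the memory they already own
    RegisteredHostMemory Registered(Host.data(), Host.size() * sizeof(float));
    WasRegistered = liveRegistrations() == 1;
  }
  bool IsUnregistered = liveRegistrations() == 0;
  Host[0] = 1.0f; // the buffer is still valid; only the registration ended
  return WasRegistered && IsUnregistered;
}
```

The two separate calls in `demoUserOwnedBuffer` are the extra step that the first two concerns above refer to.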
There are a couple of options that might work for the new interface:
- Have `allocateHostMemory` return a `pair<RegisteredHostMemory, std::unique_ptr>` so that the `unique_ptr` is responsible for freeing the memory and the `RegisteredHostMemory` is responsible for un-registering it. (This may be a problem if the `unique_ptr` goes out of scope first, because the memory would be freed while it is still registered.)
- Create a new type `OwnedRegisteredHostMemory` that always owns the memory and the registration of the memory. Have methods to create a slice out of this new type, and support this type as an argument to all the `thenCopy` `Stream` methods (`std::enable_if` should make that pretty easy).
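As a rough illustration of the second option, the sketch below shows one way an owning type could be layered on top of the non-owning one. None of this is proposed code: the registration counter, the `IsRegisteredHostMemory` trait, and the free function `thenCopySketch` (standing in for the `Stream::thenCopy` methods) are hypothetical names, and slicing is left out. Declaring the storage member before the registration member means the registration is destroyed first, which avoids the ordering hazard noted for the `pair` option.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for the platform's registration bookkeeping.
static int &liveRegistrations() {
  static int Count = 0;
  return Count;
}

// Non-owning handle: un-registers on destruction, never frees.
class RegisteredHostMemory {
public:
  RegisteredHostMemory(void *Pointer, std::size_t ByteCount)
      : Pointer(Pointer), ByteCount(ByteCount) {
    ++liveRegistrations();
  }
  RegisteredHostMemory(const RegisteredHostMemory &) = delete;
  RegisteredHostMemory &operator=(const RegisteredHostMemory &) = delete;
  ~RegisteredHostMemory() { --liveRegistrations(); }

  void *data() const { return Pointer; }
  std::size_t byteCount() const { return ByteCount; }

private:
  void *Pointer;
  std::size_t ByteCount;
};

// Owning variant: always owns both the memory and its registration.
class OwnedRegisteredHostMemory {
public:
  explicit OwnedRegisteredHostMemory(std::size_t ByteCount)
      : Storage(new char[ByteCount]), Registration(Storage.get(), ByteCount) {}

  void *data() const { return Registration.data(); }
  std::size_t byteCount() const { return Registration.byteCount(); }

private:
  // Members are destroyed in reverse declaration order, so Registration
  // is un-registered before Storage is freed.
  std::unique_ptr<char[]> Storage;
  RegisteredHostMemory Registration;
};

// Trait that accepts either registered-memory type.
template <typename T>
struct IsRegisteredHostMemory
    : std::integral_constant<
          bool, std::is_same<T, RegisteredHostMemory>::value ||
                    std::is_same<T, OwnedRegisteredHostMemory>::value> {};

// Stand-in for a thenCopy overload; it only reports how many bytes it
// would copy. std::enable_if rejects unregistered argument types.
template <typename HostMemory,
          typename = typename std::enable_if<
              IsRegisteredHostMemory<HostMemory>::value>::type>
std::size_t thenCopySketch(const HostMemory &Source) {
  return Source.byteCount();
}

// Returns true if the owned memory is registered while alive, accepted by
// the copy stand-in, and un-registered once it goes out of scope.
bool demoOwnedMemory() {
  {
    OwnedRegisteredHostMemory Owned(256);
    if (liveRegistrations() != 1)
      return false;
    if (thenCopySketch(Owned) != 256)
      return false;
  }
  return liveRegistrations() == 0;
}
```

With this shape, passing a plain `char *` to `thenCopySketch` fails to compile, so only registered memory can reach the copy path.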
https://reviews.llvm.org/D24353