[Openmp-commits] [openmp] [Libomptarget] Fix RPC-based malloc on NVPTX (PR #72440)
Jan Patrick Lehr via Openmp-commits
openmp-commits at lists.llvm.org
Tue Nov 21 00:53:10 PST 2023
================
@@ -486,6 +494,16 @@ struct CUDADeviceTy : public GenericDeviceTy {
Res = cuMemAllocManaged(&DevicePtr, Size, CU_MEM_ATTACH_GLOBAL);
MemAlloc = (void *)DevicePtr;
break;
+ case TARGET_ALLOC_DEVICE_NON_BLOCKING: {
+ CUstream Stream;
+ if (Res = cuStreamCreate(&Stream, CU_STREAM_NON_BLOCKING))
----------------
jplehr wrote:
> So, there is no AsyncInfoTy object here, but I am also confused why we can't simply call getStream and later return it again. Stream creation and destruction are not free, hence our resource pools.

As far as I can tell, `getStream` requires an `AsyncInfoWrapperTy` reference (at least the one I found), doesn't it? Nothing would stop you from creating a local object and using it to synchronize once your `malloc` is done. Or am I missing something fundamental here?
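
To make the suggestion concrete, here is a minimal sketch of borrowing a stream from a pool through a local scoped object instead of creating and destroying one per allocation. This is not the plugin's actual API: `FakeStream`, `StreamPool`, `ScopedStream`, and `allocateNonBlocking` are hypothetical names, and `FakeStream` only stands in for `CUstream` so the example compiles without CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUstream so the sketch builds without CUDA.
struct FakeStream {
  int Id;
};

// Hypothetical resource pool: creates streams lazily and reuses returned ones.
class StreamPool {
  std::vector<FakeStream> Free;
  int NumCreated = 0;

public:
  FakeStream acquire() {
    if (Free.empty())
      return FakeStream{NumCreated++}; // Stands in for creating a new stream.
    FakeStream S = Free.back();
    Free.pop_back();
    return S;
  }

  void release(FakeStream S) { Free.push_back(S); }

  int numCreated() const { return NumCreated; }
};

// Local scoped object: borrows a stream on construction and returns it to the
// pool on destruction, so every exit path gives the stream back.
class ScopedStream {
  StreamPool &Pool;
  FakeStream Stream;

public:
  explicit ScopedStream(StreamPool &P) : Pool(P), Stream(P.acquire()) {}
  ~ScopedStream() { Pool.release(Stream); }
  ScopedStream(const ScopedStream &) = delete;
  ScopedStream &operator=(const ScopedStream &) = delete;

  FakeStream get() const { return Stream; }
};

// Hypothetical allocation path. A real implementation would enqueue the
// allocation on the borrowed stream and synchronize before returning the
// pointer; here we only report which stream was used.
int allocateNonBlocking(StreamPool &Pool, std::size_t Size) {
  assert(Size > 0 && "expected a non-empty allocation");
  ScopedStream Guard(Pool);
  return Guard.get().Id;
}
```

With this shape, two back-to-back allocations reuse the same pooled stream, and only one stream is ever created.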
https://github.com/llvm/llvm-project/pull/72440