[Openmp-commits] [PATCH] D74145: [OpenMP][Offloading] Added support for multiple streams so that multiple kernels can be executed concurrently

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Feb 7 00:19:40 PST 2020

jdoerfert added a subscriber: ye-luo.
jdoerfert added a comment.

Thanks! Two comments below.

@ye-luo, once the memory transfers are attached to a stream, you should be able to offload synchronously from multiple threads at the same time. Could you pull the patch and test it?

Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:95
   std::vector<std::list<FuncOrGblEntryTy>> FuncGblEntries;
+  std::vector<std::unique_ptr<std::atomic_int>> NextStreamId;
Make it `uint` please.
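
A minimal sketch of how an unsigned atomic counter could be used to pick a stream, assuming the counter hands out stream indices round-robin (the patch may use it differently). `StreamPicker` and `next` are hypothetical names, not part of the plugin. An unsigned counter wraps around with well-defined behavior, and the remainder can never be negative, so the result is always a valid index.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper, not taken from the patch: hands out stream indices
// round-robin from a counter shared between host threads.
class StreamPicker {
public:
  explicit StreamPicker(unsigned NumStreams) : NumStreams(NumStreams) {
    assert(NumStreams > 0 && "need at least one stream");
  }

  // fetch_add returns the value held before the increment. Because the
  // counter is unsigned, the remainder is always in [0, NumStreams).
  unsigned next() { return NextStreamId.fetch_add(1u) % NumStreams; }

private:
  std::atomic_uint NextStreamId{0};
  const unsigned NumStreams;
};
```

Each call to `next()` is a single atomic read-modify-write, so concurrent host threads never observe the same counter value.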

Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:525
         // need for device copies.
         cuMemcpyHtoD(cuptr, e->addr, sizeof(void *));
         DP("Copy linked variable host address (" DPxMOD ")"
We need the async versions of the copies on both the HtoD and the DtoH sides so that they use the streams. Directly after each async call we have to wait on the stream, which keeps the copy synchronous but confined to a specific stream.
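
The "enqueue, then wait on the same stream" pattern could look roughly like the sketch below. It is written against generic callables so that it compiles without the CUDA toolkit. `copyThenWait` is a hypothetical helper, and the assumption is that both callables return 0 on success, as `CUDA_SUCCESS` does. In the plugin the callables would presumably wrap `cuMemcpyHtoDAsync` or `cuMemcpyDtoHAsync` and `cuStreamSynchronize`.

```cpp
// Hypothetical helper, not taken from the patch. EnqueueCopy queues an
// asynchronous transfer on a stream; WaitForStream blocks until that same
// stream has drained. Both are assumed to return 0 on success.
template <typename CopyFn, typename SyncFn>
int copyThenWait(CopyFn &&EnqueueCopy, SyncFn &&WaitForStream) {
  int Err = EnqueueCopy();
  if (Err != 0)
    return Err; // nothing was queued, so there is nothing to wait for
  return WaitForStream();
}

// Intended use with the CUDA driver API (shown as a comment only, since it
// needs <cuda.h> and a device; Dst, Src, Size and Stream are placeholders):
//
//   copyThenWait(
//       [&] { return cuMemcpyHtoDAsync(Dst, Src, Size, Stream); },
//       [&] { return cuStreamSynchronize(Stream); });
```

From the caller's point of view the transfer is still blocking, but the wait covers only the chosen stream, so work queued on other streams by other host threads is not held up.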


