[Openmp-commits] [PATCH] D74145: [OpenMP][Offloading] Added support for multiple streams so that multiple kernels can be executed concurrently

Shilei Tian via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Feb 7 11:48:56 PST 2020

tianshilei1992 added inline comments.

Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:259
+    for (std::unique_ptr<std::atomic_int> &Ptr : NextStreamId) {
+      Ptr = std::unique_ptr<std::atomic_int>(new std::atomic_int(0));
+    }
JonChesterfield wrote:
> jdoerfert wrote:
> > tianshilei1992 wrote:
> > > JonChesterfield wrote:
> > > > If we do need the pointer wrapper, this should be make_unique
> > > `make_unique` is only available since C++14.
> > Do we have llvm::make_unique? It may not be worth using here anyway, though. @jon, OK to stick with this for now?
> llvm::make_unique was removed by D66259, as we're now assuming C++14. They're semantically identical in this context so it doesn't matter much.
Do you mean we can assume the code is always built with -std=c++14 or later?
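
For reference, a minimal standalone sketch of the two spellings discussed above. The helper names `makeCounters` and `checkCounters` and the alias `CounterPtr` are hypothetical and not part of the patch; only the loop body mirrors the quoted code. It assumes a C++14 compiler for the `std::make_unique` branch.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using CounterPtr = std::unique_ptr<std::atomic_int>;

// Hypothetical helper (not from the patch): builds N counters, each starting at 0.
std::vector<CounterPtr> makeCounters(std::size_t N, bool UseMakeUnique) {
  std::vector<CounterPtr> Counters(N); // N null pointers to be filled in
  for (CounterPtr &Ptr : Counters) {
    if (UseMakeUnique)
      Ptr = std::make_unique<std::atomic_int>(0); // requires C++14
    else
      Ptr = CounterPtr(new std::atomic_int(0)); // valid in C++11
  }
  return Counters;
}

// Both spellings yield independent, zero-initialized counters.
void checkCounters() {
  for (bool UseMakeUnique : {false, true}) {
    std::vector<CounterPtr> Counters = makeCounters(2, UseMakeUnique);
    assert(Counters[0]->fetch_add(1) == 0);
    assert(Counters[0]->load() == 1 && Counters[1]->load() == 0);
  }
}
```

Either branch leaves each slot owning a fresh `std::atomic_int` with value 0, so the choice here is about which language standard can be assumed rather than about behavior.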

Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:832
   CUresult sync_err = cuCtxSynchronize();
   if (sync_err != CUDA_SUCCESS) {
ye-luo wrote:
> tianshilei1992 wrote:
> > ye-luo wrote:
> > > This synchronization should be replaced with stream wait.
> > Are you referring to `cudaStreamWaitEvent`?
> I mean cuStreamSynchronize
Oh, I see. Good point: with cuCtxSynchronize, a thread can be blocked by other threads' work even though its own offloading has already finished.
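
To illustrate the difference without requiring a GPU, here is a toy model in plain C++. It is not the CUDA API: `ToyStream`, `streamSyncWouldBlock`, `contextSyncWouldBlock`, and `demo` are hypothetical names, and a `std::future` stands in for "all work queued on this stream has completed". It only sketches why a context-wide wait can stall a thread whose own stream is already idle.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <vector>

// Toy stand-in for a stream (hypothetical, not the CUDA API): the future
// becomes ready once all work queued on the stream has completed.
struct ToyStream {
  std::promise<void> Work;
  std::future<void> Finished;
  ToyStream() : Finished(Work.get_future()) {}
  void complete() { Work.set_value(); }
  bool isIdle() const {
    return Finished.wait_for(std::chrono::seconds(0)) ==
           std::future_status::ready;
  }
};

// Per-stream wait (the cuStreamSynchronize analogue) depends only on Mine.
bool streamSyncWouldBlock(const ToyStream &Mine) { return !Mine.isIdle(); }

// Context-wide wait (the cuCtxSynchronize analogue) depends on every stream.
bool contextSyncWouldBlock(const std::vector<ToyStream> &All) {
  for (const ToyStream &S : All)
    if (!S.isIdle())
      return true;
  return false;
}

// Stream 0 has finished its work; stream 1 is still busy.
void demo() {
  std::vector<ToyStream> Context(2);
  Context[0].complete();
  assert(!streamSyncWouldBlock(Context[0])); // own stream: returns at once
  assert(contextSyncWouldBlock(Context));    // whole context: still waiting
}
```

In the real plugin the per-stream wait would be a call to `cuStreamSynchronize` on the stream used for that offload; how the stream is chosen is up to the patch and is not modeled here.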


