[Openmp-commits] [PATCH] D74145: [OpenMP][Offloading] Added support for multiple streams so that multiple kernels can be executed concurrently

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Sat Feb 8 13:00:40 PST 2020

jdoerfert added a comment.

We will probably need "version 2" functions soon which take additional information, e.g., the stream to be used. I suggest testing this as is and merging it before we go there. It should already allow overlap between threads that offload. The "version 2" functions will only reduce the per-thread overhead. That said, we are working on the `nowait` support, so there will be other changes soon anyway.

@ye-luo Do you have a way to test this or do we need to fix the linker issue first?

Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:246
+    // By default let's create 32 streams per device
+    EnvNumStreams = 32;
+    envStr = getenv("LIBOMPTARGET_NUM_STREAMS");
The hardware will cap the number internally anyway so we should go higher here. Maybe 256?
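
To make the suggestion concrete, here is a minimal sketch of how the default and the environment override could fit together. The helper names `parseNumStreams` and `numStreamsFromEnv` are hypothetical and not part of the patch; only `LIBOMPTARGET_NUM_STREAMS` and the proposed default of 256 come from the discussion above. The fallback behavior for malformed values is an assumption as well.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper (not in the patch): parse the stream count from an
// environment string, falling back to Default when the value is unset,
// empty, non-numeric, non-positive, or out of range for int.
static int parseNumStreams(const char *EnvStr, int Default) {
  if (!EnvStr || *EnvStr == '\0')
    return Default;

  char *End = nullptr;
  errno = 0;
  long Value = std::strtol(EnvStr, &End, 10);

  // Reject strings with no digits, trailing garbage, or overflow.
  if (End == EnvStr || *End != '\0' || errno == ERANGE)
    return Default;
  if (Value <= 0 || Value > INT_MAX)
    return Default;

  return static_cast<int>(Value);
}

// Hypothetical wrapper: a default of 256 streams per device, as proposed
// in the comment above, overridable via LIBOMPTARGET_NUM_STREAMS.
static int numStreamsFromEnv() {
  return parseNumStreams(std::getenv("LIBOMPTARGET_NUM_STREAMS"), 256);
}
```

Keeping the parsing separate from the `getenv` call makes the fallback logic easy to check without touching the process environment.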


