[Openmp-commits] [PATCH] D74145: [OpenMP][Offloading] Added support for multiple streams so that multiple kernels can be executed concurrently
Johannes Doerfert via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Sat Feb 8 13:00:40 PST 2020
jdoerfert added a comment.
We will probably need "version 2" functions soon that take additional information, e.g., the stream to be used. I would suggest testing this as is and merging it before we go there. It should already allow overlap between threads that offload; "version 2" will only shrink the per-thread overhead. That said, we are working on `nowait` support, so there will be other changes soon anyway.
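To illustrate what "overlap between threads that offload" could look like from the user side, here is a minimal sketch, not taken from the patch: several host threads each launch their own target region. The function name `offloadFromThreads` and its parameters are hypothetical. Whether the kernels actually overlap depends on the plugin and the device; without an offload-capable toolchain the target regions simply fall back to the host.

```cpp
#include <vector>

// Hypothetical example: each host thread issues its own target region.
// With per-device streams in the plugin, these regions may run concurrently.
std::vector<int> offloadFromThreads(int NumThreads, int N) {
  std::vector<int> Sums(NumThreads, 0);
  int *Out = Sums.data();
  #pragma omp parallel for num_threads(NumThreads)
  for (int T = 0; T < NumThreads; ++T) {
    int Sum = 0;
    // One offloaded region per host thread; Sum is copied to and from the device.
    #pragma omp target map(tofrom: Sum)
    for (int I = 0; I < N; ++I)
      Sum += I;
    Out[T] = Sum;
  }
  return Sums;
}
```

Each entry of the returned vector holds the sum 0 + 1 + ... + (N - 1) computed by one thread's target region.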
@ye-luo Do you have a way to test this or do we need to fix the linker issue first?
================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:246
+ // By default let's create 32 streams per device
+ EnvNumStreams = 32;
+ envStr = getenv("LIBOMPTARGET_NUM_STREAMS");
----------------
The hardware will cap the number internally anyway, so we should go higher here. Maybe 256?
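For reference, a minimal sketch of how the default and the environment override could fit together, assuming the suggested default of 256. This is not the code under review: the helper names `parseNumStreams` and `getNumStreams` and the sanity bound `MaxStreams` are hypothetical; only the variable name `LIBOMPTARGET_NUM_STREAMS` comes from the patch.

```cpp
#include <cstdlib>

// Hypothetical sanity bound, chosen only for this sketch.
constexpr long MaxStreams = 65536;

// Hypothetical helper: parse a positive stream count from an environment
// string, falling back to Default when the value is missing or malformed.
int parseNumStreams(const char *EnvStr, int Default = 256) {
  if (!EnvStr || !*EnvStr)
    return Default;
  char *End = nullptr;
  long Value = std::strtol(EnvStr, &End, 10);
  // Reject trailing garbage, non-positive values, and out-of-range values.
  if (*End != '\0' || Value <= 0 || Value > MaxStreams)
    return Default;
  return static_cast<int>(Value);
}

// Reads the variable named in the patch and applies the fallback.
int getNumStreams() {
  return parseNumStreams(std::getenv("LIBOMPTARGET_NUM_STREAMS"));
}
```

Falling back to the default on malformed input keeps a typo in the environment from silently disabling concurrency.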
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D74145/new/
https://reviews.llvm.org/D74145