[Openmp-dev] OpenMP offload implicitly using streams
Ye Luo via Openmp-dev
openmp-dev at lists.llvm.org
Wed Mar 20 09:32:19 PDT 2019
After going through the source, I didn't find CUDA stream support.
Luckily, I only need to add #define CUDA_API_PER_THREAD_DEFAULT_STREAM to the CUDA RTL.
Then multiple target regions go to different streams and may execute concurrently:
#pragma omp parallel
#pragma omp target
This is exactly what I want.
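A minimal sketch of that pattern, for illustration only: each host thread issues its own blocking target region (no nowait), so a runtime that maps host threads to separate streams can overlap them. The names scale_block and run_chunks are hypothetical and not from any existing code. Whether the regions actually overlap depends on the runtime; without offload support the target regions simply fall back to the host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Multiply n doubles in place inside a target region.
// Hypothetical helper: every host thread that calls it launches its own
// blocking target region.
void scale_block(double *x, int n, double factor) {
  assert(x != nullptr && n > 0);
  #pragma omp target teams distribute parallel for map(tofrom: x[0:n])
  for (int i = 0; i < n; ++i)
    x[i] *= factor;
}

// Split the data into nchunks chunks of n elements and let the host
// threads of a parallel loop offload one chunk each.
std::vector<double> run_chunks(int nchunks, int n) {
  assert(nchunks > 0 && n > 0);
  std::vector<double> data(static_cast<std::size_t>(nchunks) * n, 1.0);
  double *p = data.data();
  #pragma omp parallel for
  for (int c = 0; c < nchunks; ++c)
    scale_block(p + static_cast<std::size_t>(c) * n, n, c + 1.0);
  return data;
}
```

Built with offloading enabled (for example clang++ -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda), a profiler such as nvprof should show whether the kernels land on the default stream or on separate ones.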
I know the XL compiler uses streams in a different way but achieves similar results.
Is anyone working on using streams with OpenMP target in LLVM?
Will clang-ykt get something similar to XL, and will it be upstreamed to mainline?
If we just add #define CUDA_API_PER_THREAD_DEFAULT_STREAM in the CUDA RTL,
would that cause any trouble?
As a compiler user, I'd like to have a better solution rather than having a
patch just for myself.
Ye Luo, Ph.D.
Computational Science Division & Leadership Computing Facility
Argonne National Laboratory
On Sun, Mar 17, 2019 at 2:26 PM, Ye Luo <xw111luoye at gmail.com> wrote:
> How to turn on streams when using OpenMP offload?
> When different host threads individually start target regions (even without
> using nowait), the offloaded computation goes to different CUDA streams and
> may execute concurrently. This is currently available in XL.
> With Clang, nvprof shows that the run uses only the default stream.
> Is there a way to do that with Clang?
> On the other hand,
> nvcc has the option --default-stream per-thread.
> I'm not familiar with Clang CUDA; is there a similar option?
> Ye Luo, Ph.D.
> Computational Science Division & Leadership Computing Facility
> Argonne National Laboratory