[cfe-dev] openmp 4.5 and cuda streams

Luo, Ye via cfe-dev cfe-dev at lists.llvm.org
Thu Oct 31 08:54:15 PDT 2019


Hi Hal,
My experience with llvm/clang so far shows:
1. All target offload is blocking and synchronous and uses the default stream. nowait is not supported.
2. All memory transfers call cudaMemcpy. There are no async calls.
3. In the past I experimented with turning on CUDA_API_PER_THREAD_DEFAULT_STREAM in libomptarget.
I then used multiple host threads, each doing its own blocking synchronous offload. I got it more or less running and saw multiple streams, but the code crashed with memory corruption, probably caused by a data race in libomptarget.
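
For illustration only, here is a minimal sketch of the two patterns discussed in this thread: the nowait usage from the original question, and the "multiple host threads, each doing a blocking offload" setup from point 3. It is not the actual test code. The arrays a and b, the size N, and all function names are hypothetical, and the sketch does not touch libomptarget or CUDA_API_PER_THREAD_DEFAULT_STREAM (that define was set when building the runtime, not in user code).

```c
#include <stdio.h>

#define N 1024

/* Hypothetical data: one array per independent kernel. */
static float a[N], b[N];

/* Variant A: two deferred target tasks (nowait) joined by a taskwait.
   An implementation may overlap them; per the observations in this
   thread, clang 9 runs them one after another on the default stream. */
static void scale_nowait(void)
{
    #pragma omp target nowait map(tofrom: a[0:N])
    #pragma omp teams distribute parallel for
    for (int i = 0; i < N; ++i)
        a[i] *= 2.0f;

    #pragma omp target nowait map(tofrom: b[0:N])
    #pragma omp teams distribute parallel for
    for (int i = 0; i < N; ++i)
        b[i] *= 3.0f;

    #pragma omp taskwait   /* wait for both deferred target tasks */
}

/* Variant B: two host threads, each issuing its own blocking target
   region. Whether the regions land on separate CUDA streams depends
   on how the offload runtime was built. */
static void scale_host_threads(void)
{
    #pragma omp parallel sections num_threads(2)
    {
        #pragma omp section
        {
            #pragma omp target map(tofrom: a[0:N])
            #pragma omp teams distribute parallel for
            for (int i = 0; i < N; ++i)
                a[i] *= 2.0f;
        }
        #pragma omp section
        {
            #pragma omp target map(tofrom: b[0:N])
            #pragma omp teams distribute parallel for
            for (int i = 0; i < N; ++i)
                b[i] *= 3.0f;
        }
    }
}

/* Runs both variants on fresh data and returns the number of wrong
   elements (0 on success). */
static int run_sketch(void)
{
    int bad = 0;
    for (int i = 0; i < N; ++i) {
        a[i] = 1.0f;
        b[i] = 1.0f;
    }
    scale_nowait();        /* a = 2, b = 3 */
    scale_host_threads();  /* a = 4, b = 9 */
    for (int i = 0; i < N; ++i)
        if (a[i] != 4.0f || b[i] != 9.0f)
            ++bad;
    printf("mismatches: %d\n", bad);
    return bad;
}
```

Calling run_sketch() from main and building with the offload flags quoted below should be enough to compare the two variants in the NVIDIA profiler. The sketch only checks that the results are correct, not that any overlap actually happens.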
Best,
Ye

________________________________
From: Finkel, Hal J. <hfinkel at anl.gov>
Sent: Wednesday, October 30, 2019 1:40 PM
To: Alessandro Gabbana <gbblsn at unife.it>; cfe-dev at lists.llvm.org <cfe-dev at lists.llvm.org>; Luo, Ye <yeluo at anl.gov>; Doerfert, Johannes <jdoerfert at anl.gov>
Subject: Re: [cfe-dev] openmp 4.5 and cuda streams

[+Ye, Johannes]

I recall that we've also observed this behavior. Ye, Johannes, we had a
work-around and a patch, correct?

  -Hal

On 10/30/19 12:28 PM, Alessandro Gabbana via cfe-dev wrote:
> Dear All,
>
> I'm using clang 9.0.0 to compile a code which offloads sections of a
> code on a GPU using the openmp target construct.
> I also use the nowait clause to overlap the execution of certain
> kernels and/or host<->device memory transfers.
> However, using the nvidia profiler I've noticed that when I compile
> the code with clang only one cuda stream is active,
> and therefore the execution gets serialized. On the other hand, when
> compiling with XLC I see that kernels are executed
> on different streams. I could not understand if this is the expected
> behavior (e.g. the nowait clause is currently not supported),
> or if I'm missing something. I'm using a NVIDIA Tesla P100 GPU and
> compiling with the following options:
>
> -target x86_64-pc-linux-gnu -fopenmp
> -fopenmp-targets=nvptx64-nvidia-cuda
> -Xopenmp-target=nvptx64-nvidia-cuda -march=sm_60
>
> best wishes
>
> Alessandro

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
