<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>Hal, it seems to me that not everything is protected. I assume
      some buffers are reused for different kernels. It would be better
      to ask Alex Eichenberger, who knows more about it; I did not
      investigate this problem.<br>

    </p>

    <p>As for clang, we try to reduce the size of the global-memory
      buffers used for reduction/lastprivate/etc. variables, which may
      escape their declaration context. These buffers cannot be combined
      across kernels in streams mode; a unique buffer has to be
      allocated for each kernel. This is not very hard to do; it is just
      not implemented yet.<br>

    </p>
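    <p>A minimal sketch of the kind of code affected, assuming two
      independent reductions offloaded with nowait. The function name
      sum_and_dot and its parameters are hypothetical and chosen only
      for illustration. If the two kernels were to run concurrently on
      different streams, each would need its own global buffer for its
      reduction variable.</p>
<pre>
```c
/* Hypothetical example: two independent offloaded reductions marked nowait.
 * Returns the sum of a[0..n-1] and stores the dot product of a and b
 * in *dot_out. */
double sum_and_dot(const double *a, const double *b, int n, double *dot_out)
{
    double sum = 0.0;
    double dot = 0.0;

    /* First deferred target task: reduce a[] into sum. */
    #pragma omp target teams distribute parallel for nowait \
            map(to: a[0:n]) map(tofrom: sum) reduction(+: sum)
    for (int i = 0; i < n; ++i)
        sum += a[i];

    /* Second deferred target task: reduce a[i] * b[i] into dot.
     * It does not depend on the first one, so the runtime is free
     * to overlap the two. */
    #pragma omp target teams distribute parallel for nowait \
            map(to: a[0:n], b[0:n]) map(tofrom: dot) reduction(+: dot)
    for (int i = 0; i < n; ++i)
        dot += a[i] * b[i];

    /* Wait for both deferred target tasks before reading the results. */
    #pragma omp taskwait

    *dot_out = dot;
    return sum;
}
```
</pre>
    <p>When built without -fopenmp the pragmas are ignored and both loops
      run sequentially on the host, so the function computes the same
      sums (up to floating-point rounding) either way.</p>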

    <pre class="moz-signature" cols="72">-------------

Best regards,

Alexey Bataev</pre>

    <div class="moz-cite-prefix">30.10.2019 3:22 PM, Finkel, Hal J.

      wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:ee8eb27e-db7d-52a4-9d7b-c58b4f49b5e1@anl.gov">

      <pre class="moz-quote-pre" wrap="">On 10/30/19 1:48 PM, GMail wrote:

</pre>

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">

I don't think it will be very easy. It requires some additional work 

in libomptarget + some fixes in clang itself. Otherwise there 

might be some race conditions.

</pre>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

Can you be more specific? I thought that the mapping table, etc. were 

already appropriately protected.

As a general thought, we should probably have a mode in which the 

runtime is compiled with ThreadSanitizer to check for these kinds of things.

Thanks again,

Hal

</pre>

      <blockquote type="cite">

        <pre class="moz-quote-pre" wrap="">-------------

Best regards,

Alexey Bataev

30.10.2019 2:40 PM, Finkel, Hal J. via cfe-dev wrote:

</pre>

        <blockquote type="cite">

          <pre class="moz-quote-pre" wrap="">[+Ye, Johannes]

I recall that we've also observed this behavior. Ye, Johannes, we had a

work-around and a patch, correct?

   -Hal

On 10/30/19 12:28 PM, Alessandro Gabbana via cfe-dev wrote:

</pre>

          <blockquote type="cite">

            <pre class="moz-quote-pre" wrap="">Dear All,

I'm using clang 9.0.0 to compile code that offloads sections of the

program to a GPU using the OpenMP target construct.

I also use the nowait clause to overlap the execution of certain

kernels and/or host&lt;-&gt;device memory transfers.

However, using the NVIDIA profiler I've noticed that when I compile

the code with clang only one CUDA stream is active,

and therefore the execution gets serialized. On the other hand, when

compiling with XLC I see that kernels are executed

on different streams. I could not understand if this is the expected

behavior (e.g. the nowait clause is currently not supported),

or if I'm missing something. I'm using a NVIDIA Tesla P100 GPU and

compiling with the following options:

-target x86_64-pc-linux-gnu -fopenmp

-fopenmp-targets=nvptx64-nvidia-cuda

-Xopenmp-target=nvptx64-nvidia-cuda -march=sm_60

best wishes

Alessandro

_______________________________________________

cfe-dev mailing list

<a class="moz-txt-link-abbreviated" href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>

<a class="moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a>

</pre>

          </blockquote>

        </blockquote>

      </blockquote>

      <pre class="moz-quote-pre" wrap="">

</pre>

    </blockquote>

  </body>

</html>