<div dir="ltr">It is working well now. I tested both the miniapp and the full application.<br clear="all"><div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr">===================<br>
Ye Luo, Ph.D.<br>Computational Science Division & Leadership Computing Facility<br>
Argonne National Laboratory</div></div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Jun 22, 2020 at 11:56 AM Alexey.Bataev <<a href="mailto:a.bataev@outlook.com">a.bataev@outlook.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  
  <div>
    <p>Hi Ye, could you try your code with the attached patch for the
      clang? <br>
    </p>
    <pre cols="72">-------------
Best regards,
Alexey Bataev</pre>
    <div>On 19.06.2020 6:45 PM, Ye Luo via
      Openmp-dev wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div>Yes. It makes the code work.</div>
        <div>Do you have any idea what it is about this code pattern
          that triggers the problem?<br>
        </div>
        <div>Until the compiler fix gets in, I can put the workaround
          in the real code, which has multiple places similar to the
          miniapp.</div>
        <div>Best,<br>
        </div>
        <div>Ye<br>
        </div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Fri, Jun 19, 2020 at 4:35
          PM Alexey.Bataev <<a href="mailto:a.bataev@outlook.com" target="_blank">a.bataev@outlook.com</a>> wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div>
            <p>Try to use `omp parallel firstprivate(first, last,
              offload_scratch_iVP_ptr, psiinv_ptr, i)` as the inner
              parallel directive.<br>
            </p>
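            <div>A minimal sketch of where such a clause would sit in a
              hierarchical offload loop. This is not the miniapp code:
              the function <code>scale_tiles</code> and the names
              <code>data_ptr</code>, <code>tile_size</code> and
              <code>factor</code> are hypothetical, chosen only to show
              an inner <code>omp parallel</code> that takes its scalars
              and pointers as <code>firstprivate</code> copies. Without
              OpenMP enabled the pragmas are ignored and the loop runs
              serially with the same result.</div>

```cpp
#include <vector>

// Hypothetical stand-in for an offloaded kernel: multiplies every element of
// `data` by `factor`, one tile per team. Assumes a non-empty vector and
// tile_size > 0. Only the firstprivate pattern mirrors the suggestion above.
std::vector<double> scale_tiles(std::vector<double> data, int tile_size, double factor)
{
  double* data_ptr = data.data();
  const int n = static_cast<int>(data.size());
  const int num_tiles = (n + tile_size - 1) / tile_size;

  // Outer level: tiles are distributed over teams on the device.
  #pragma omp target teams distribute map(tofrom : data_ptr[0:n])
  for (int i = 0; i < num_tiles; ++i)
  {
    int first = i * tile_size;
    int last = (first + tile_size < n) ? first + tile_size : n;

    // Inner level: each thread gets its own copy of the bounds, the pointer
    // and the loop index instead of sharing the team-level variables.
    #pragma omp parallel firstprivate(first, last, data_ptr, factor, i)
    {
      #pragma omp for
      for (int j = first; j < last; ++j)
        data_ptr[j] *= factor;
    }
  }
  return data;
}
```

            <div>Calling <code>scale_tiles({1, 2, 3, 4, 5}, 2, 2.0)</code>
              processes three tiles and doubles every element.</div>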
            <pre cols="72">-------------
Best regards,
Alexey Bataev</pre>
            <div>On 16.06.2020 9:26 PM, Ye Luo via Openmp-dev wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">
                <div>Hi all,</div>
                <div>I got a libomptarget failure when offloading the same
                  kernel function from multiple host threads. I reported
                  the bug as <a href="https://bugs.llvm.org/show_bug.cgi?id=46257" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=46257</a>.</div>
                <div><br>
                </div>
                <div>After some investigation, I was able to narrow down
                  the scope of the issue.</div>
                <div>The error from CUDA runtime is "an illegal memory
                  access was encountered".</div>
                <div><br>
                </div>
                <div>
                  <div>A) On the libomptarget side<br>
                  </div>
                </div>
                <div>In libomptarget/plugins/cuda/src/rtl.cpp, I added
                  synchronization calls before and after cuLaunchKernel
                  and a mutex to protect the kernel execution.<br>
                </div>
                <div><br>
                </div>
                <div>    Err = cuCtxSynchronize();<br>
                      if (!checkResult(Err, "Error before
                  cuLaunchKernel\n"))<br>
                        return OFFLOAD_FAIL;<br>
                  <br>
                      //kernelmtx.lock();<br>
                      Err = cuLaunchKernel(KernelInfo->Func,
                  CudaBlocksPerGrid, /* gridDimY */ 1,<br>
                                           /* gridDimZ */ 1,
                  CudaThreadsPerBlock,<br>
                                           /* blockDimY */ 1, /*
                  blockDimZ */ 1,<br>
                                           /* sharedMemBytes */ 0,
                  Stream, &Args[0], nullptr);<br>
                      if (!checkResult(Err, "Error returned from
                  cuLaunchKernel\n"))<br>
                        return OFFLOAD_FAIL;<br>
                  <br>
                      Err = cuCtxSynchronize();<br>
                      //kernelmtx.unlock();<br>
                      if (!checkResult(Err, "Error after
                  cuLaunchKernel\n"))<br>
                        return OFFLOAD_FAIL;</div>
                <div><br>
                </div>
                <div>Without the mutex, the first error from all the
                  threads is "Error after cuLaunchKernel". After
                  enabling the mutex, the code runs well. If I move the
                  mutex unlock() before the second cuCtxSynchronize, the
                  code still runs into the error. So I think the error
                  comes from the kernel execution itself, not from
                  something else.</div>
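                <div>The locking experiment can be sketched without CUDA
                  as follows. This is only an illustration under stated
                  assumptions: <code>FakeDevice</code>,
                  <code>fake_launch</code> and
                  <code>fake_synchronize</code> are hypothetical
                  stand-ins for the device, cuLaunchKernel and
                  cuCtxSynchronize, and the "kernel" counts as in flight
                  between the two calls. Holding the mutex until after
                  the synchronize means at most one kernel is in flight
                  at a time.</div>

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the device: tracks how many "kernels" are in
// flight and the largest overlap ever observed.
struct FakeDevice
{
  std::atomic<int> in_flight{0};
  std::atomic<int> max_in_flight{0};

  // Stand-in for cuLaunchKernel: the kernel starts running asynchronously.
  void fake_launch()
  {
    const int now = ++in_flight;
    int seen = max_in_flight.load();
    while (now > seen && !max_in_flight.compare_exchange_weak(seen, now)) {}
  }

  // Stand-in for cuCtxSynchronize: returns once the kernel has finished.
  void fake_synchronize()
  {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
    --in_flight;
  }
};

// Each host thread launches and synchronizes while holding one shared mutex.
// Returns the maximum number of kernels that were in flight at once.
int max_overlap_with_mutex(int num_threads)
{
  FakeDevice dev;
  std::mutex kernelmtx;
  std::vector<std::thread> threads;
  for (int t = 0; t < num_threads; ++t)
    threads.emplace_back([&dev, &kernelmtx] {
      std::lock_guard<std::mutex> guard(kernelmtx); // released after the sync
      dev.fake_launch();
      dev.fake_synchronize();
    });
  for (auto& th : threads)
    th.join();
  return dev.max_in_flight.load();
}
```

                <div>With the lock held across both calls,
                  <code>max_overlap_with_mutex(4)</code> returns 1.
                  Releasing the lock before the synchronize would let
                  kernels from different host threads overlap, which
                  matches the case that still failed.</div>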
                <div><br>
                </div>
                <div>B) On my application side, I tried to see what
                  triggers the error. Once I commented out "omp
                  parallel" in the hierarchical parallelism at <a href="https://github.com/QMCPACK/miniqmc/blob/5a11c4131dbf91bf9f9977057a03aee485368f0d/src/QMCWaveFunctions/einspline_spo_omp.cpp#L258" target="_blank">https://github.com/QMCPACK/miniqmc/blob/5a11c4131dbf91bf9f9977057a03aee485368f0d/src/QMCWaveFunctions/einspline_spo_omp.cpp#L258</a>,
                  offloading from multiple host threads started to run
                  without error.</div>
                <div><br>
                </div>
                <div>I'd like to ask the wider community about
                  potential causes of this bug.<br>
                </div>
                <div>Is there a thread-unsafe state machine in a
                  kernel generated from a target region? Or is there any
                  other potential reason for the error?</div>
                <div><br>
                </div>
                <div>Best,</div>
                <div>Ye<br>
                </div>
              </div>
              <br>
              <fieldset></fieldset>
              <pre>_______________________________________________
Openmp-dev mailing list
<a href="mailto:Openmp-dev@lists.llvm.org" target="_blank">Openmp-dev@lists.llvm.org</a>
<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev</a>
</pre>
            </blockquote>
          </div>
        </blockquote>
      </div>
      <br>
    </blockquote>
  </div>

</blockquote></div>