<div dir="ltr"><div>Yes, it makes the code work.</div><div>Do you have any idea which code pattern triggers the problem?<br></div><div>Until the compiler fix lands, I can apply the workaround in the real code, which has several places similar to the miniapp.</div><div>Best,<br></div><div>Ye<br></div><div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr">===================<br>
Ye Luo, Ph.D.<br>Computational Science Division & Leadership Computing Facility<br>
Argonne National Laboratory</div></div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jun 19, 2020 at 4:35 PM Alexey.Bataev <<a href="mailto:a.bataev@outlook.com">a.bataev@outlook.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>Try using `omp parallel firstprivate(first, last,
offload_scratch_iVP_ptr, psiinv_ptr, i)` as the inner parallel
directive.<br>
</p>
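<p>A minimal sketch of where the clause goes, assuming a kernel shaped roughly like the one in the miniapp. The variable names come from the clause above; the function `scale_rows`, the array layout, and the arithmetic are made up for illustration and are not the miniapp's code.</p>

```cpp
#include <vector>

// Hypothetical kernel: one team per row i, the threads of a team split the
// index range [first, last). Names mirror the suggested clause; the body is
// illustrative only.
std::vector<float> scale_rows(const std::vector<float>& psiinv, int n_rows, int row_len)
{
  std::vector<float> scratch(psiinv.size(), 0.0f);
  const float* psiinv_ptr = psiinv.data();
  float* offload_scratch_iVP_ptr = scratch.data();
  const int total = n_rows * row_len;

  #pragma omp target teams distribute \
      map(to: psiinv_ptr[0:total]) map(from: offload_scratch_iVP_ptr[0:total])
  for (int i = 0; i < n_rows; ++i)
  {
    int first = i * row_len;
    int last = first + row_len;
    // Every variable the inner region reads is listed in firstprivate, so each
    // thread works on its own copy instead of a variable shared across the team.
    #pragma omp parallel firstprivate(first, last, offload_scratch_iVP_ptr, psiinv_ptr, i)
    {
      #pragma omp for
      for (int j = first; j < last; ++j)
        offload_scratch_iVP_ptr[j] = 2.0f * psiinv_ptr[j] + i;
    }
  }
  return scratch;
}
```

<p>Built without OpenMP the pragmas are ignored and the loops run serially, which makes it easy to compare results against an offloaded build.</p>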
<pre cols="72">-------------
Best regards,
Alexey Bataev</pre>
<div>16.06.2020 9:26 PM, Ye Luo via
Openmp-dev wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Hi all,</div>
<div>I get a libomptarget failure when offloading the same kernel
function from multiple host threads. I reported the bug as <a href="https://bugs.llvm.org/show_bug.cgi?id=46257" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=46257</a>.</div>
<div><br>
</div>
<div>After some investigation, I was able to narrow down the scope
of the issue.</div>
<div>The error reported by the CUDA runtime is "an illegal memory access
was encountered".</div>
<div><br>
</div>
<div>
<div>A) On the libomptarget side<br>
</div>
</div>
<div>In libomptarget/plugins/cuda/src/rtl.cpp, I added
synchronization calls before and after cuLaunchKernel and a mutex to
protect the kernel execution.<br>
</div>
<div><br>
</div>
<div> Err = cuCtxSynchronize();<br>
if (!checkResult(Err, "Error before cuLaunchKernel\n"))<br>
return OFFLOAD_FAIL;<br>
<br>
//kernelmtx.lock();<br>
Err = cuLaunchKernel(KernelInfo->Func,
CudaBlocksPerGrid, /* gridDimY */ 1,<br>
/* gridDimZ */ 1,
CudaThreadsPerBlock,<br>
/* blockDimY */ 1, /* blockDimZ */ 1,<br>
/* sharedMemBytes */ 0, Stream,
&Args[0], nullptr);<br>
if (!checkResult(Err, "Error returned from
cuLaunchKernel\n"))<br>
return OFFLOAD_FAIL;<br>
<br>
Err = cuCtxSynchronize();<br>
//kernelmtx.unlock();<br>
if (!checkResult(Err, "Error after cuLaunchKernel\n"))<br>
return OFFLOAD_FAIL;</div>
<div><br>
</div>
<div>Without the mutex, the first error from all the threads is
"Error after cuLaunchKernel". After enabling the mutex, the
code runs well. If I move the mutex unlock() before the
cuCtxSynchronize, the code still fails. So I think
the error comes from the kernel execution itself rather than from something else.</div>
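<div>To make the lock placement concrete, here is a minimal sketch of the variant that runs well, with the mutex held across both the launch and the wait. The function `launch_serialized`, the status enum, and the two callables are hypothetical stand-ins for cuLaunchKernel and cuCtxSynchronize; this is not the rtl.cpp code.</div>

```cpp
#include <functional>
#include <mutex>

// Hypothetical status codes in the spirit of OFFLOAD_SUCCESS / OFFLOAD_FAIL.
enum LaunchStatus { kLaunchOk = 0, kLaunchFail = 1 };

// Serializes "launch + wait" across host threads. `launch` and `wait` stand in
// for cuLaunchKernel and cuCtxSynchronize and return true on success.
LaunchStatus launch_serialized(std::mutex& kernel_mtx,
                               const std::function<bool()>& launch,
                               const std::function<bool()>& wait)
{
  // The guard is released only when the function returns, that is, after the
  // wait has finished, so no other thread can start a kernel in between.
  std::lock_guard<std::mutex> guard(kernel_mtx);
  if (!launch())
    return kLaunchFail;
  if (!wait())
    return kLaunchFail;
  return kLaunchOk;
}
```

<div>Releasing the lock before the wait would correspond to the variant that still fails, because kernels from different host threads can then execute at the same time.</div>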
<div><br>
</div>
<div>B) On my application side, I tried to find out what triggers the
error. Once I commented out the "omp parallel" in the hierarchical
parallelism at <a href="https://github.com/QMCPACK/miniqmc/blob/5a11c4131dbf91bf9f9977057a03aee485368f0d/src/QMCWaveFunctions/einspline_spo_omp.cpp#L258" target="_blank">https://github.com/QMCPACK/miniqmc/blob/5a11c4131dbf91bf9f9977057a03aee485368f0d/src/QMCWaveFunctions/einspline_spo_omp.cpp#L258</a>,
offloading from multiple host threads started to run without error.</div>
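<div>For readers who do not want to open the miniapp, the pattern can be sketched roughly as follows. The functions `fill_rows` and `run_from_host_threads` and the data layout are hypothetical. The sketch only shows the shape of the pattern; I have not checked that this reduced form reproduces the failure.</div>

```cpp
#include <vector>

// Hypothetical kernel with hierarchical parallelism: teams over rows,
// threads of each team over the columns of that row.
void fill_rows(int* data, int n_rows, int row_len, int tag)
{
  const int total = n_rows * row_len;
  #pragma omp target teams distribute map(from: data[0:total])
  for (int i = 0; i < n_rows; ++i)
  {
    // This is the kind of inner directive that was commented out in the experiment.
    #pragma omp parallel
    {
      #pragma omp for
      for (int j = 0; j < row_len; ++j)
        data[i * row_len + j] = tag + i;
    }
  }
}

// Several host threads offload the same kernel concurrently, each into its own buffer.
std::vector<std::vector<int>> run_from_host_threads(int n_host_threads, int n_rows, int row_len)
{
  std::vector<std::vector<int>> results(n_host_threads,
                                        std::vector<int>(n_rows * row_len, -1));
  #pragma omp parallel for num_threads(n_host_threads)
  for (int t = 0; t < n_host_threads; ++t)
    fill_rows(results[t].data(), n_rows, row_len, 100 * t);
  return results;
}
```

<div>Each host thread writes only to its own buffer, so there is no sharing on the host side; the only thing the threads have in common is the offloaded kernel.</div>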
<div><br>
</div>
<div>I'd like to ask the wider community about potential causes
of this bug.<br>
</div>
<div>Is there a thread-unsafe state machine in a kernel
generated from a target region? Or is there any other potential reason
for the error?</div>
<div><br>
</div>
<div>Best,</div>
<div>Ye<br>
</div>
</div>
<br>
</blockquote>
</div>
</blockquote></div>