[Openmp-dev] Potential offload kernel data race

Ye Luo via Openmp-dev openmp-dev at lists.llvm.org
Tue Jun 16 18:26:21 PDT 2020

Hi all,
I got a libomptarget failure when offloading the same kernel function from
multiple host threads. I reported the bug as

After some investigation, I was able to narrow down the scope of the problem.
The error from the CUDA runtime is "an illegal memory access was encountered".

A) On the libomptarget side
In libomptarget/plugins/cuda/src/rtl.cpp, I added synchronization calls
before and after cuLaunchKernel, plus a mutex to protect the kernel execution.

    Err = cuCtxSynchronize();
    if (!checkResult(Err, "Error before cuLaunchKernel\n"))
      return OFFLOAD_FAIL;

    Err = cuLaunchKernel(KernelInfo->Func, CudaBlocksPerGrid,
                         /* gridDimY */ 1, /* gridDimZ */ 1,
                         CudaThreadsPerBlock, /* blockDimY */ 1,
                         /* blockDimZ */ 1, /* sharedMemBytes */ 0, Stream,
                         &Args[0], nullptr);
    if (!checkResult(Err, "Error returned from cuLaunchKernel\n"))
      return OFFLOAD_FAIL;

    Err = cuCtxSynchronize();
    if (!checkResult(Err, "Error after cuLaunchKernel\n"))
      return OFFLOAD_FAIL;

Without the mutex, the first error reported by every thread is "Error after
cuLaunchKernel". With the mutex enabled, the code runs well. If I move the
mutex unlock() to before the second cuCtxSynchronize, the code fails again.
So I think the error comes from the kernel execution itself, not from
something else.
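
A minimal sketch of the locking pattern described above, to make the mutex placement concrete. It is not the actual patch to rtl.cpp: KernelLaunchMutex and runKernelSerialized are hypothetical names, and the Launch and Sync callables are stand-ins for the cuLaunchKernel and cuCtxSynchronize steps so that the sketch builds without CUDA.

```cpp
#include <mutex>

// Hypothetical global lock; the real change to rtl.cpp is not shown here.
static std::mutex KernelLaunchMutex;

// Launch and Sync are stand-ins for the cuLaunchKernel and cuCtxSynchronize
// steps (each returns true on success).
template <typename LaunchFn, typename SyncFn>
bool runKernelSerialized(LaunchFn Launch, SyncFn Sync) {
  // The guard is released only when the function returns, i.e. after Sync,
  // so at most one host thread has a kernel in flight at a time.
  std::lock_guard<std::mutex> Guard(KernelLaunchMutex);
  if (!Launch())
    return false; // corresponds to "Error returned from cuLaunchKernel"
  return Sync();  // corresponds to "Error after cuLaunchKernel"
}
```

Releasing the lock before Sync returns would correspond to the variant that still fails, since kernels from different host threads could then overlap on the device.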

B) On my application side
I tried to see what triggers the error. Once I commented out the
"omp parallel" in the hierarchical parallelism inside the target region,
offloading from multiple host threads started to run without error.
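
For readers who want to picture the pattern, here is a rough sketch of its shape, not my application code. The function names, sizes, and loop bodies are made up for illustration; the only parts taken from the description above are that several host threads offload the same target region and that the region contains a nested "omp parallel".

```cpp
#include <vector>

// Hypothetical kernel: a target region with hierarchical parallelism
// (teams on the outer loop, "omp parallel" on the inner loop).
void scaleOnDevice(double *Data, int NumTeams, int TeamSize) {
  const int N = NumTeams * TeamSize;
#pragma omp target teams distribute map(tofrom : Data[0:N])
  for (int Team = 0; Team < NumTeams; ++Team) {
#pragma omp parallel for
    for (int I = 0; I < TeamSize; ++I)
      Data[Team * TeamSize + I] *= 2.0;
  }
}

// Each host thread offloads the same kernel on its own buffer.
// Returns true if every element was doubled as expected.
bool offloadFromHostThreads(int NumHostThreads) {
  const int NumTeams = 4, TeamSize = 32; // arbitrary illustrative sizes
  std::vector<std::vector<double>> Buffers(
      NumHostThreads, std::vector<double>(NumTeams * TeamSize, 1.0));
#pragma omp parallel for num_threads(NumHostThreads)
  for (int T = 0; T < NumHostThreads; ++T)
    scaleOnDevice(Buffers[T].data(), NumTeams, TeamSize);
  for (const auto &Buffer : Buffers)
    for (double Value : Buffer)
      if (Value != 2.0)
        return false;
  return true;
}
```

Calling offloadFromHostThreads(4) exercises the pattern with four host threads. Whether this particular sketch reproduces the failure is untested; removing the inner "#pragma omp parallel for" gives the variant that ran cleanly in my application.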

I'd like to ask the wider community about potential causes of this bug.
Is there a thread-unsafe state machine in the kernel generated from a
target region? Or is there any other potential reason for the error?

Ye Luo, Ph.D.
Computational Science Division & Leadership Computing Facility
Argonne National Laboratory