[Openmp-commits] [PATCH] D104418: [OpenMP][Offloading] Fixed data race in libomptarget caused by async data movement

Shilei Tian via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Jun 30 10:20:59 PDT 2021


tianshilei1992 added a comment.

In D104418#2850555 <https://reviews.llvm.org/D104418#2850555>, @gValarini wrote:

> Reading the event-based solution, one corner case came up to my mind (it is odd, but possible):
>
>   #pragma omp target map(to: A, B) nowait
>   { /* read A and B */ }
>   
>   #pragma omp target map(to: B, A) nowait
>   { /* read A and B */ }
>
> Note that the only difference between them (besides their internal code) is the ordering of the buffers in the mapping list.
>
> Since both regions just read A and B, there is no need for a dependency between them (from the programmers perspective) and, thus, they can be executed by two different threads as follows:
>
> - T1: alloc A -> issue A submission  -> create event for A -> alloc B (not new, MoveData = false)  -> wait for B event indefinitly ...
> - T2: alloc B -> issue B submission  -> create event for B -> alloc A (not new, MoveData = false)  -> wait for A event indefinitly ...

The case you mentioned is not a problem. Say we have two queues for the two target regions. So the contents of each queue is:
Q1: data movement of A, notification of event E1, wait on E2, kernel issue.
Q2: data movement of B, notification of event E2, wait on E1, kernel issue.
You can see the notification of the two events always before the wait. No matter what execution orders they will be, it will not dead lock.

> 1. Can a plugin implementation defer any command execution until the agnostic layer calls the device synchronization function (`__tgt_rtl_synchronize`)? **If yes, then the above problem can occur since no synchronization call is done.**

I'm not sure I follow. Plugin will never defer any operation. If the target supports async operations, the plugin just issues the operation, and that's it. As for whether the device will defer or not, it's plugin transparent.

> 2. Should a plugin implementation indirectly synchronize the queue where an event was created when a call to `__tgt_rtl_wait_event` is made? **If yes, then the problem above won't happen, but this should be clarified in the documentation of the API.**

Creating an event just needs to "take a snapshot" of current queue. It doesn't need to synchronize. We just need to make sure that:

- The event will be notified only if all operations in the "snapshot" are finished, or say all operations before it are finished.
- If any operation depends on the event, it will not be executed until the event is fulfilled/notified.

> 3. Are the variables in the mapping list ordered somehow? **If that is the case, then the (possible) problem described above will never happen as well (like resolving it through lock ordering).**

That's a good question. I don't have a clear answer for that. I remember the front end will sort somehow, but don't know in what order.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104418/new/

https://reviews.llvm.org/D104418



More information about the Openmp-commits mailing list