[all-commits] [llvm/llvm-project] 18ce3d: [OpenMP][Offloading] Fix data race in data mapping...

Shilei Tian via All-commits all-commits at lists.llvm.org
Fri Jul 23 13:11:05 PDT 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 18ce3d3f2c362b7fda33ebd7b4d544e9cae23ad4
      https://github.com/llvm/llvm-project/commit/18ce3d3f2c362b7fda33ebd7b4d544e9cae23ad4
  Author: Shilei Tian <tianshilei1992 at gmail.com>
  Date:   2021-07-23 (Fri, 23 Jul 2021)

  Changed paths:
    M openmp/libomptarget/src/device.cpp
    M openmp/libomptarget/src/device.h
    M openmp/libomptarget/src/omptarget.cpp

  Log Message:
  -----------
  [OpenMP][Offloading] Fix data race in data mapping by using two locks

This patch tries to partially fix one of the two data race issues reported in
[1] by introducing a per-entry mutex. Additional discussion can also be found in
D104418, which will also be refined to fix another data race problem.

Here is how it works. Like before, `DataMapMtx` is still being used for mapping
table lookup and update. In any case, we will get a table entry. If we need to
make a data transfer (update the data on the device), we need to lock the entry
right before releasing `DataMapMtx`, and the issue of data transfer should be
after releasing `DataMapMtx`, and the entry is unlocked afterwards. This can
guarantee that: 1) issue of data movement is not in critical region, which will
not affect performance too much, and also will not affect other threads that don't
touch the same entry; 2) if another thread accesses the same entry, the state of
data movement is consistent (which requires that a thread must first get the
update lock before getting data movement information).

For a target that doesn't support async data transfer, issue of data movement is
data transfer. This two-lock design can potentially improve concurrency compared
with the design that guards data movement with `DataMapMtx` as well. For a target
that supports async data movement, we could simply attach the event between the
issue of data movement and unlock the entry. For a thread that wants to get the
event, it must first get the lock. This can also get rid of the busy wait until
the event pointer is valid.

Reference:
[1] https://bugs.llvm.org/show_bug.cgi?id=49940

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D104555




More information about the All-commits mailing list