[PATCH] D113602: [OpenMP] Fix master thread barrier for Pascal and amdgpu

Joel E. Denny via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 11 16:51:52 PST 2021


jdenny updated this revision to Diff 386680.
jdenny retitled this revision from "[OpenMP] Fix master thread barrier for Pascal" to "[OpenMP] Fix master thread barrier for Pascal and amdgpu".
jdenny added a comment.
Herald added subscribers: llvm-commits, ormris, hiraditya, t-tye, tpr, dstuttard, wdng, kzhuravl.
Herald added a project: LLVM.

- Applied the same fix to the custom state machine, as suggested privately by @jdoerfert, and extended the new test to cover it.  On the NVIDIA Pascals I tried, that test didn't appear to need the custom state machine fix.  Perhaps in that version, the master thread manages to be selected for execution before the other threads in its warp.  However, fixing the custom state machine did prove important for that test on an AMD GPU I tried.  Another test might show that it's important for Pascals too, but I haven't looked for one.
- Moved fix to callers of generic state machine functions, as suggested by @tianshilei1992.
- Pointed out that this fix is also relevant to AMD GPUs, as suggested by @JonChesterfield.
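
For readers without the patch at hand, here is a rough sketch of what a caller-side guard of the kind described above might look like. It is not taken from the patch. It assumes that a generic-mode launch reserves the last warp of the block for the master thread, and that the other threads in that warp must stay out of the state machine so that they cannot reach its barrier ahead of the master. All function names are hypothetical, and the arithmetic assumes the hardware thread count is a multiple of a power-of-two warp size and larger than one warp.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; these are NOT the DeviceRTL API.
// Assumption: the last warp of the block is reserved for the master thread,
// and every thread before that warp is a potential worker.

// Number of threads that can ever act as workers (excludes the master's warp).
constexpr std::uint32_t workerThreadCount(std::uint32_t hwThreads,
                                          std::uint32_t warpSize) {
  return hwThreads - warpSize;
}

// Assumption: the master thread is the first lane of the last warp.
constexpr std::uint32_t masterThreadId(std::uint32_t hwThreads,
                                       std::uint32_t warpSize) {
  return (hwThreads - 1) & ~(warpSize - 1);
}

// Guard evaluated by the caller before entering the worker state machine.
// Only potential workers pass; the master and the idle lanes that share its
// warp are rejected, so none of them can arrive at the state machine's
// barrier before the master arrives at its own.
constexpr bool mayEnterStateMachine(std::uint32_t threadId,
                                    std::uint32_t hwThreads,
                                    std::uint32_t warpSize) {
  return threadId < workerThreadCount(hwThreads, warpSize);
}
```

With 160 hardware threads and a warp size of 32, threads 0 through 127 pass the guard, while thread 128 (the assumed master) and threads 129 through 159 do not. Placing the check in the caller rather than inside the state machine lets the same condition protect both a generic and a custom state machine.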


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113602/new/

https://reviews.llvm.org/D113602

Files:
  llvm/lib/Transforms/IPO/OpenMPOpt.cpp
  openmp/libomptarget/DeviceRTL/src/Kernel.cpp
  openmp/libomptarget/deviceRTLs/common/src/omptarget.cu
  openmp/libomptarget/deviceRTLs/common/src/support.cu
  openmp/libomptarget/deviceRTLs/target_interface.h
  openmp/libomptarget/test/offloading/bug51781.c

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D113602.386680.patch
Type: text/x-patch
Size: 9178 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211112/89416296/attachment-0001.bin>

