[all-commits] [llvm/llvm-project] 67fed1: [OpenMP] Ensure memory fences are created with bar...

Johannes Doerfert via All-commits all-commits at lists.llvm.org
Mon Apr 17 15:28:29 PDT 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 67fed132f39c81e8006c4463ab1f173fea5e4e4b
      https://github.com/llvm/llvm-project/commit/67fed132f39c81e8006c4463ab1f173fea5e4e4b
  Author: Johannes Doerfert <johannes at jdoerfert.de>
  Date:   2023-04-17 (Mon, 17 Apr 2023)

  Changed paths:
    M openmp/libomptarget/DeviceRTL/include/Synchronization.h
    M openmp/libomptarget/DeviceRTL/src/Kernel.cpp
    M openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
    M openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
    A openmp/libomptarget/test/offloading/barrier_fence.c

  Log Message:
  -----------
  [OpenMP] Ensure memory fences are created with barriers for AMDGPUs

It turns out that __builtin_amdgcn_s_barrier() alone does not emit a
fence. We got away with this so far and assumed it would work because
the NVIDIA path, where we just emit __syncthreads(), is (hopefully)
correct. After talking to @arsenm we now (mostly) align with the
OpenCL barrier implementation [1] and emit explicit fences for AMDGPUs.
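
For illustration only, a minimal sketch of the "fence, barrier, fence"
pattern described above. The helper names (fenced_workgroup_barrier,
publish_then_read, shared_slot) are hypothetical and are not taken from the
DeviceRTL sources. The release/acquire split and the "workgroup" scope are
assumptions modeled on the OpenCL barrier referenced as [1]; the actual
change may use different orderings or scopes. The host branch exists only so
the sketch compiles off-device and performs no cross-thread rendezvous.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: a workgroup barrier with explicit memory fences.
inline void fenced_workgroup_barrier() {
#if defined(__AMDGCN__)
  // Make this thread's prior writes visible to the workgroup ...
  __builtin_amdgcn_fence(__ATOMIC_RELEASE, "workgroup");
  // ... wait for all threads in the workgroup (execution barrier only) ...
  __builtin_amdgcn_s_barrier();
  // ... then make the other threads' writes visible to this thread.
  __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "workgroup");
#else
  // Host fallback (assumption, for compilation only): fences, no rendezvous.
  std::atomic_thread_fence(std::memory_order_release);
  std::atomic_thread_fence(std::memory_order_acquire);
#endif
}

// Hypothetical shared location that one thread writes and others read.
static int shared_slot = 0;

// Write a value, synchronize, then read it back.
int publish_then_read(int value) {
  shared_slot = value;        // write before the barrier
  fenced_workgroup_barrier(); // fences order the write and the later read
  int seen = shared_slot;     // read after the barrier
  assert(seen == value && "single-threaded host sanity check");
  return seen;
}
```

Without the two fences, the barrier would still synchronize execution, but
nothing would order the write before it against the read after it.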

It seems this was the underlying cause for #59759, but I am not 100%
certain. There is a chance this simply hides the problem.

Fixes: https://github.com/llvm/llvm-project/issues/59759

[1] https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/07b347366eb2c6ebc3414af323c623cbbbafc854/opencl/src/workgroup/wgbarrier.cl#L21



