[all-commits] [llvm/llvm-project] f9a89e: [OpenMP][FIX] Allocate per launch memory for GPU t...

Wed Nov 1 11:12:01 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: f9a89e6b9c4345df978bf7cbfedfd2b250029278
      https://github.com/llvm/llvm-project/commit/f9a89e6b9c4345df978bf7cbfedfd2b250029278
  Author: Johannes Doerfert <johannes at jdoerfert.de>
  Date:   2023-11-01 (Wed, 01 Nov 2023)

  Changed paths:
    M clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
    M clang/lib/CodeGen/CGOpenMPRuntimeGPU.h
    M clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
    M clang/test/OpenMP/target_teams_generic_loop_codegen.cpp
    M openmp/libomptarget/DeviceRTL/include/Interface.h
    M openmp/libomptarget/DeviceRTL/src/Reduction.cpp
    M openmp/libomptarget/include/Environment.h
    M openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.cpp
    A openmp/libomptarget/test/offloading/parallel_target_teams_reduction.cpp

  Log Message:
  -----------
  [OpenMP][FIX] Allocate per launch memory for GPU team reductions (#70752)

We used to perform team reductions in global memory allocated by the
runtime and by clang. This was racy because multiple instances of a kernel,
or different kernels with team reductions, would use the same locations.
Since we now have the kernel launch environment, we can allocate dynamic
memory per launch, allowing us to move all the state into a non-racy
place.

Fixes: https://github.com/llvm/llvm-project/issues/70249
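
The race described in the log message can be illustrated with a small program in which several host threads launch the same team-reduction kernel at the same time. This is only a minimal sketch and is not the test added by the commit; the function names `sum_on_device` and `count_wrong_results` are hypothetical and chosen for illustration. It assumes a compiler with OpenMP offloading enabled; without OpenMP the pragmas are ignored and the code runs serially on the host.

```cpp
#include <cassert>

// Hypothetical helper: sums 0..n-1 using a team reduction. With offloading
// enabled this launches one GPU kernel per call; otherwise it runs on the host.
long long sum_on_device(int n) {
  assert(n >= 0);
  long long sum = 0;
#pragma omp target teams distribute parallel for reduction(+ : sum) map(tofrom : sum)
  for (int i = 0; i < n; ++i)
    sum += i;
  return sum;
}

// Hypothetical helper: launches `launches` reduction kernels from concurrent
// host threads and counts how many of them returned a wrong sum. If the
// kernels shared one reduction scratch area, concurrent launches could
// corrupt each other's partial results and this count could be nonzero.
int count_wrong_results(int launches, int n) {
  const long long expected = static_cast<long long>(n) * (n - 1) / 2;
  int wrong = 0;
#pragma omp parallel for reduction(+ : wrong)
  for (int t = 0; t < launches; ++t)
    if (sum_on_device(n) != expected)
      ++wrong;
  return wrong;
}
```

Calling, for example, `count_wrong_results(8, 1 << 16)` from `main` and checking that it returns 0 is one way to exercise concurrent launches. With per-launch reduction memory, each launch has its own scratch state, so every call is expected to produce the same sum regardless of how many launches overlap.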