[all-commits] [llvm/llvm-project] ae9996: [OpenMP] Enable automatic unified shared memory on...

carlobertolli via All-commits all-commits at lists.llvm.org
Mon Jan 22 08:30:33 PST 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: ae99966a279601022d2b4d61dfbec349f7d65c12
      https://github.com/llvm/llvm-project/commit/ae99966a279601022d2b4d61dfbec349f7d65c12
  Author: carlobertolli <carlo.bertolli at amd.com>
  Date:   2024-01-22 (Mon, 22 Jan 2024)

  Changed paths:
    M openmp/libomptarget/include/Shared/PluginAPI.h
    M openmp/libomptarget/include/Shared/PluginAPI.inc
    M openmp/libomptarget/include/Shared/Requirements.h
    M openmp/libomptarget/include/device.h
    M openmp/libomptarget/plugins-nextgen/amdgpu/dynamic_hsa/hsa_ext_amd.h
    M openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
    M openmp/libomptarget/plugins-nextgen/common/include/PluginInterface.h
    M openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp
    M openmp/libomptarget/src/OpenMP/Mapping.cpp
    M openmp/libomptarget/src/PluginManager.cpp
    M openmp/libomptarget/src/device.cpp
    A openmp/libomptarget/test/mapping/auto_zero_copy.cpp

  Log Message:
  -----------
  [OpenMP] Enable automatic unified shared memory on MI300A. (#77512)

This patch enables applications that did not request OpenMP
unified_shared_memory to run with the same zero-copy behavior, where
mapped memory does not result in extra memory allocations and memory
copies, but CPU-allocated memory is accessed from the device. The name
for this behavior is "automatic zero-copy" and it relies on detecting:
that the runtime is running on a MI300A, that the user did not select
unified_shared_memory in their program, and that XNACK (unified memory
support) is enabled in the current GPU configuration. If all these
conditions are met, then automatic zero-copy is triggered.

This patch also introduces an environment variable OMPX_APU_MAPS that,
if set, triggers automatic zero-copy also on non APU GPUs (e.g., on
discrete GPUs).
This patch is still missing support for global variables, which will be
provided in a subsequent patch.

Co-authored-by: Thorsten Blass <thorsten.blass at amd.com>




More information about the All-commits mailing list