[all-commits] [llvm/llvm-project] ae9996: [OpenMP] Enable automatic unified shared memory on...
carlobertolli via All-commits
all-commits at lists.llvm.org
Mon Jan 22 08:30:33 PST 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: ae99966a279601022d2b4d61dfbec349f7d65c12
https://github.com/llvm/llvm-project/commit/ae99966a279601022d2b4d61dfbec349f7d65c12
Author: carlobertolli <carlo.bertolli at amd.com>
Date: 2024-01-22 (Mon, 22 Jan 2024)
Changed paths:
M openmp/libomptarget/include/Shared/PluginAPI.h
M openmp/libomptarget/include/Shared/PluginAPI.inc
M openmp/libomptarget/include/Shared/Requirements.h
M openmp/libomptarget/include/device.h
M openmp/libomptarget/plugins-nextgen/amdgpu/dynamic_hsa/hsa_ext_amd.h
M openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
M openmp/libomptarget/plugins-nextgen/common/include/PluginInterface.h
M openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp
M openmp/libomptarget/src/OpenMP/Mapping.cpp
M openmp/libomptarget/src/PluginManager.cpp
M openmp/libomptarget/src/device.cpp
A openmp/libomptarget/test/mapping/auto_zero_copy.cpp
Log Message:
-----------
[OpenMP] Enable automatic unified shared memory on MI300A. (#77512)
This patch lets applications that did not request OpenMP
unified_shared_memory run with the same zero-copy behavior: mapped
memory does not result in extra memory allocations or memory copies;
instead, CPU-allocated memory is accessed from the device. This behavior
is called "automatic zero-copy". It is triggered when all of the
following conditions are detected: the runtime is running on an MI300A,
the user did not select unified_shared_memory in their program, and
XNACK (unified memory support) is enabled in the current GPU
configuration.
This patch also introduces an environment variable, OMPX_APU_MAPS, that,
if set, also triggers automatic zero-copy on non-APU GPUs (e.g., on
discrete GPUs).
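
The conditions above can be read as a single predicate. The following is a minimal sketch only, not the code in this patch: the names DeviceConfig and useAutoZeroCopy and all field names are hypothetical, and it assumes that OMPX_APU_MAPS stands in only for the APU check, so XNACK is still required on discrete GPUs. The log message does not state that last point.

```cpp
// Illustrative sketch; every name here is hypothetical and is not the
// libomptarget plugin API.
struct DeviceConfig {
  bool IsAPU;            // e.g. the runtime detected an MI300A
  bool XnackEnabled;     // XNACK (unified memory support) is enabled
  bool ApuMapsRequested; // OMPX_APU_MAPS is set in the environment
};

// Returns true when mapped memory would be accessed in place by the device
// instead of being allocated and copied.
inline bool useAutoZeroCopy(const DeviceConfig &Dev, bool UserRequestedUSM) {
  // A program that selected unified_shared_memory already runs zero-copy;
  // the automatic mode only applies when it did not.
  if (UserRequestedUSM)
    return false;
  // Assumption: XNACK is required in every case, including OMPX_APU_MAPS.
  if (!Dev.XnackEnabled)
    return false;
  // Either the device is an APU, or the user opted in on a non-APU GPU.
  return Dev.IsAPU || Dev.ApuMapsRequested;
}
```

Under these assumptions, an MI300A with XNACK enabled and no unified_shared_memory requirement yields true, while a discrete GPU yields true only when OMPX_APU_MAPS is set.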
This patch does not yet support global variables; that support will be
provided in a subsequent patch.
Co-authored-by: Thorsten Blass <thorsten.blass at amd.com>