[Openmp-commits] [openmp] 17b5445 - [Libomptarget] Add a wavefront sync builtin for the AMDGPU implementation (#70228)
via Openmp-commits
openmp-commits at lists.llvm.org
Wed Oct 25 12:27:18 PDT 2023
Author: Joseph Huber
Date: 2023-10-25T14:27:14-05:00
New Revision: 17b5445996c993057824b7142905b48ed67292b3
URL: https://github.com/llvm/llvm-project/commit/17b5445996c993057824b7142905b48ed67292b3
DIFF: https://github.com/llvm/llvm-project/commit/17b5445996c993057824b7142905b48ed67292b3.diff
LOG: [Libomptarget] Add a wavefront sync builtin for the AMDGPU implementation (#70228)
Summary:
While this is technically a no-op for AMDGPU hardware, in cases where
the user would see fit to add an explicit wavefront sync on Nvidia
hardware, we should also inform the LLVM optimizer that this control
flow is convergent so we do not reorder blocks.
Added:
Modified:
openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
Removed:
################################################################################
diff --git a/openmp/libomptarget/DeviceRTL/src/Synchronization.cpp b/openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
index b9a192f0d84df9a..ad3b1cad4194364 100644
--- a/openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
+++ b/openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
@@ -272,7 +272,9 @@ void fenceSystem(atomic::OrderingTy Ordering) {
 }
 
 void syncWarp(__kmpc_impl_lanemask_t) {
-  // AMDGCN doesn't need to sync threads in a warp
+  // This is a no-op on current AMDGPU hardware but it is used by the optimizer
+  // to enforce convergent behaviour between control flow graphs.
+  __builtin_amdgcn_wave_barrier();
 }
 
 void syncThreads(atomic::OrderingTy Ordering) {
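
For readers who want to try the pattern outside the DeviceRTL, the following is a minimal sketch and not the runtime's actual code. The names `LaneMask`, `waveSync`, and `publishThenRead` are hypothetical and exist only for illustration. The sketch assumes that `__AMDGCN__` is defined when compiling for an AMDGPU target, and it swaps in a compiler-only fence on any other target so the file still builds and runs on a host.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the runtime's lane-mask type (an assumption).
using LaneMask = std::uint64_t;

// Hypothetical helper, not part of the DeviceRTL.
// On AMDGPU targets it emits the wave barrier builtin, which the commit
// describes as a hardware no-op that still constrains the optimizer.
// Elsewhere it falls back to a compiler-only fence (an assumption made
// purely so this sketch can be compiled and run on a host).
inline void waveSync(LaneMask) {
#if defined(__AMDGCN__)
  __builtin_amdgcn_wave_barrier();
#else
  std::atomic_signal_fence(std::memory_order_seq_cst);
#endif
}

// Toy caller: store a value, synchronize, then read the value back.
inline int publishThenRead(int *Slot, int Value) {
  assert(Slot != nullptr);
  *Slot = Value;
  waveSync(~LaneMask{0}); // full mask; ignored, as in the patched syncWarp
  return *Slot;
}
```

On a host build the helper only prevents the compiler from reordering memory accesses across the call. It does not reproduce wavefront semantics, so it is useful for checking that calling code compiles and nothing more.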