[Openmp-commits] [openmp] 17b5445 - [Libomptarget] Add a wavefront sync builtin for the AMDGPU implementation (#70228)

via Openmp-commits openmp-commits at lists.llvm.org
Wed Oct 25 12:27:18 PDT 2023


Author: Joseph Huber
Date: 2023-10-25T14:27:14-05:00
New Revision: 17b5445996c993057824b7142905b48ed67292b3

URL: https://github.com/llvm/llvm-project/commit/17b5445996c993057824b7142905b48ed67292b3
DIFF: https://github.com/llvm/llvm-project/commit/17b5445996c993057824b7142905b48ed67292b3.diff

LOG: [Libomptarget] Add a wavefront sync builtin for the AMDGPU implementation (#70228)

Summary:
While this is technically a no-op on current AMDGPU hardware, in cases
where the user would see fit to add an explicit warp sync on Nvidia
hardware, we should also inform the LLVM optimizer that the control flow
is convergent on AMDGPU so that it does not reorder blocks.
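
A minimal sketch of the idea, not the DeviceRTL code itself: the wrapper
below emits the wave barrier only when compiling for AMDGPU and otherwise
falls back to a host stub. The names LaneMask, HostSyncCalls and
syncWarpSketch are hypothetical and exist only for illustration; the real
function is syncWarp(__kmpc_impl_lanemask_t) in Synchronization.cpp.

```cpp
#include <cstdint>

// Hypothetical stand-in for the runtime's lane mask type.
using LaneMask = uint64_t;

// Host-only counter so the fallback path can be observed (illustrative).
static int HostSyncCalls = 0;

inline void syncWarpSketch(LaneMask Mask) {
  (void)Mask; // The mask is unused on AMDGPU, as in the patched syncWarp.
#if defined(__AMDGCN__)
  // Expected to generate no instructions, but the builtin is convergent,
  // so the optimizer must keep the surrounding control flow intact.
  __builtin_amdgcn_wave_barrier();
#else
  // Assumption: on a non-AMDGPU build there is nothing to synchronize,
  // so the sketch only records that the call happened.
  ++HostSyncCalls;
#endif
}
```

On a host build this only increments the counter; on an AMDGPU device
build it reduces to the single builtin call added by this commit.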

Added: 
    

Modified: 
    openmp/libomptarget/DeviceRTL/src/Synchronization.cpp

Removed: 
    


################################################################################
diff  --git a/openmp/libomptarget/DeviceRTL/src/Synchronization.cpp b/openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
index b9a192f0d84df9a..ad3b1cad4194364 100644
--- a/openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
+++ b/openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
@@ -272,7 +272,9 @@ void fenceSystem(atomic::OrderingTy Ordering) {
 }
 
 void syncWarp(__kmpc_impl_lanemask_t) {
-  // AMDGCN doesn't need to sync threads in a warp
+  // This is a no-op on current AMDGPU hardware but it is used by the optimizer
+  // to enforce convergent behaviour between control flow graphs.
+  __builtin_amdgcn_wave_barrier();
 }
 
 void syncThreads(atomic::OrderingTy Ordering) {


        

