[llvm] [OpenMP] Remove 'omp assumes' scopes now that we have no inline ASM (PR #123611)

Joseph Huber via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 20 06:04:46 PST 2025


https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/123611

From 8cc8157ad45c9cff2de0aa06e254d90b65e75db4 Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Mon, 20 Jan 2025 07:45:46 -0600
Subject: [PATCH 1/2] [OpenMP] Remove 'omp assumes' scopes now that we have no
 inline ASM

Summary:
We used the globally scoped `ext_no_call_asm` assumption as a workaround
that allowed the attributor to optimize out inline assembly calls to PTX
instructions. Quite some time ago I replaced every inline assembly call
with a builtin, so this can just be deleted.

Furthermore, I now use the `[[omp::assume]]` attribute directly for the
aligned barrier. This prints an unknown assumption warning (even though
the assumption is known), so I'm silencing it with
`-Wno-unknown-assumption` for now until I fix it later.
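
For illustration only, a minimal sketch of the attribute form outside the
DeviceRTL. This is not the runtime code: `ALIGNED_BARRIER_ASSUME`,
`threadsAlignedSketch`, and `BarrierCalls` are hypothetical names, and the
`__has_cpp_attribute` guard is an assumption added so the snippet also
builds with compilers that do not know `omp::assume`. Depending on the
compiler, the unknown assumption warning described above may still appear.

```cpp
#include <cassert>

// Previously the assumption was applied to a whole region:
//   #pragma omp begin assumes ext_aligned_barrier
//   ... declarations ...
//   #pragma omp end assumes
// The attribute form attaches it to a single declaration instead.

// Hypothetical helper macro: expands to the attribute only when the
// compiler reports support for it, otherwise to nothing.
#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(omp::assume)
#define ALIGNED_BARRIER_ASSUME [[omp::assume("ext_aligned_barrier")]]
#endif
#endif
#ifndef ALIGNED_BARRIER_ASSUME
#define ALIGNED_BARRIER_ASSUME
#endif

// Hypothetical counter so the stand-in barrier has an observable effect.
static int BarrierCalls = 0;

// Stand-in for the real barrier declaration; it only records the call.
ALIGNED_BARRIER_ASSUME void threadsAlignedSketch(int Ordering);

void threadsAlignedSketch(int Ordering) {
  (void)Ordering; // The ordering argument is unused in this sketch.
  ++BarrierCalls;
}
```

The attribute only carries information for the optimizer, so calling
`threadsAlignedSketch(0)` behaves the same with or without it.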
---
 offload/DeviceRTL/CMakeLists.txt            | 1 +
 offload/DeviceRTL/include/DeviceTypes.h     | 9 ---------
 offload/DeviceRTL/include/Synchronization.h | 5 ++---
 3 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/offload/DeviceRTL/CMakeLists.txt b/offload/DeviceRTL/CMakeLists.txt
index 099634e211e7a7..3f647304b06f85 100644
--- a/offload/DeviceRTL/CMakeLists.txt
+++ b/offload/DeviceRTL/CMakeLists.txt
@@ -100,6 +100,7 @@ set(bc_flags -c -foffload-lto -std=c++17 -fvisibility=hidden
              -nocudalib -nogpulib -nogpuinc -nostdlibinc
              -fopenmp -fopenmp-cuda-mode
              -Wno-unknown-cuda-version -Wno-openmp-target
+             -Wno-unknown-assumption
              -DOMPTARGET_DEVICE_RUNTIME
              -I${include_directory}
              -I${devicertl_base_directory}/../include
diff --git a/offload/DeviceRTL/include/DeviceTypes.h b/offload/DeviceRTL/include/DeviceTypes.h
index 259bc008f91d13..1cd044f432e569 100644
--- a/offload/DeviceRTL/include/DeviceTypes.h
+++ b/offload/DeviceRTL/include/DeviceTypes.h
@@ -15,15 +15,6 @@
 #include <stddef.h>
 #include <stdint.h>
 
-// Tell the compiler that we do not have any "call-like" inline assembly in the
-// device rutime. That means we cannot have inline assembly which will call
-// another function but only inline assembly that performs some operation or
-// side-effect and then continues execution with something on the existing call
-// stack.
-//
-// TODO: Find a good place for this
-#pragma omp assumes ext_no_call_asm
-
 enum omp_proc_bind_t {
   omp_proc_bind_false = 0,
   omp_proc_bind_true = 1,
diff --git a/offload/DeviceRTL/include/Synchronization.h b/offload/DeviceRTL/include/Synchronization.h
index a4d4fc08837b29..96a2f8654e92ab 100644
--- a/offload/DeviceRTL/include/Synchronization.h
+++ b/offload/DeviceRTL/include/Synchronization.h
@@ -192,15 +192,14 @@ void threads(atomic::OrderingTy Ordering);
 /// noinline is removed by the openmp-opt pass and helps to preserve the
 /// information till then.
 ///{
-#pragma omp begin assumes ext_aligned_barrier
 
 /// Synchronize all threads in a block, they are reaching the same instruction
 /// (hence all threads in the block are "aligned"). Also perform a fence before
 /// and after the barrier according to \p Ordering. Note that the
 /// fence might be part of the barrier if the target offers this.
-[[gnu::noinline]] void threadsAligned(atomic::OrderingTy Ordering);
+[[gnu::noinline, omp::assume("ext_aligned_barrier")]] void
+threadsAligned(atomic::OrderingTy Ordering);
 
-#pragma omp end assumes
 ///}
 
 } // namespace synchronize

From aae9eb78f18caf21173efcf7d59229d4a29a3f8f Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Mon, 20 Jan 2025 08:04:38 -0600
Subject: [PATCH 2/2] Update offload/DeviceRTL/CMakeLists.txt

Co-authored-by: Michael Kruse <github at meinersbur.de>
---
 offload/DeviceRTL/CMakeLists.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/offload/DeviceRTL/CMakeLists.txt b/offload/DeviceRTL/CMakeLists.txt
index 3f647304b06f85..98b063a62530c0 100644
--- a/offload/DeviceRTL/CMakeLists.txt
+++ b/offload/DeviceRTL/CMakeLists.txt
@@ -100,7 +100,7 @@ set(bc_flags -c -foffload-lto -std=c++17 -fvisibility=hidden
              -nocudalib -nogpulib -nogpuinc -nostdlibinc
              -fopenmp -fopenmp-cuda-mode
              -Wno-unknown-cuda-version -Wno-openmp-target
-             -Wno-unknown-assumption
+             -Wno-unknown-assumption  # TODO: Fix false-positive warning for ext_aligned_barrier
              -DOMPTARGET_DEVICE_RUNTIME
              -I${include_directory}
              -I${devicertl_base_directory}/../include


