[llvm] [OpenMP] Remove 'omp assumes' scopes now that we have no inline ASM (PR #123611)
Joseph Huber via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 20 06:04:46 PST 2025
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/123611
From 8cc8157ad45c9cff2de0aa06e254d90b65e75db4 Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Mon, 20 Jan 2025 07:45:46 -0600
Subject: [PATCH 1/2] [OpenMP] Remove 'omp assumes' scopes now that we have no
inline ASM
Summary:
We used the globally scoped `ext_no_call_asm` assumption as a sort of hack
around the compiler that allowed the attributor to optimize out inline
assembly calls to PTX instructions. Quite some time ago I got rid of every
inline assembly call and replaced it with a builtin, so this can just be
deleted.
Furthermore, I use the `[[omp::assume]]` attribute directly for the
aligned barrier usage instead of a `begin/end assumes` scope. This prints
an unknown assumption warning (even though the assumption is a known one),
so I'm just silencing that warning for now until I fix it later.
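
For illustration only, here is a minimal host-side sketch of the scope-to-attribute change described above. It is not part of the patch: the `ALIGNED_BARRIER_ASSUMPTION` macro, the `OrderingTy` stand-in, the `BarrierCalls` counter, and the stub body are all hypothetical. The macro expands to nothing on compilers that do not report support for the attribute, so the assumption only has an effect where the compiler recognizes it.

```cpp
// Hypothetical portability shim (not in the patch): spell the attribute only
// when the compiler reports that it knows `omp::assume`.
#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(omp::assume)
#define ALIGNED_BARRIER_ASSUMPTION [[omp::assume("ext_aligned_barrier")]]
#if defined(__clang__)
// Mirrors the -Wno-unknown-assumption flag that the patch adds to bc_flags.
#pragma clang diagnostic ignored "-Wunknown-assumption"
#endif
#endif
#endif
#ifndef ALIGNED_BARRIER_ASSUMPTION
#define ALIGNED_BARRIER_ASSUMPTION
#endif

// Hypothetical stand-in for atomic::OrderingTy from the device runtime.
enum class OrderingTy { relaxed, seq_cst };

// Before the patch, the assumption came from an enclosing scope:
//   #pragma omp begin assumes ext_aligned_barrier
//   [[gnu::noinline]] void threadsAligned(atomic::OrderingTy Ordering);
//   #pragma omp end assumes
//
// After the patch, the assumption is attached to the declaration itself.
ALIGNED_BARRIER_ASSUMPTION void threadsAligned(OrderingTy Ordering);

// Host-only stub so the sketch links; the real function is a device barrier.
int BarrierCalls = 0;
void threadsAligned(OrderingTy) { ++BarrierCalls; }
```

Attaching the assumption to the declaration keeps it next to the one function it describes, so no `begin`/`end` pair has to be kept balanced around it.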
---
offload/DeviceRTL/CMakeLists.txt | 1 +
offload/DeviceRTL/include/DeviceTypes.h | 9 ---------
offload/DeviceRTL/include/Synchronization.h | 5 ++---
3 files changed, 3 insertions(+), 12 deletions(-)
diff --git a/offload/DeviceRTL/CMakeLists.txt b/offload/DeviceRTL/CMakeLists.txt
index 099634e211e7a7..3f647304b06f85 100644
--- a/offload/DeviceRTL/CMakeLists.txt
+++ b/offload/DeviceRTL/CMakeLists.txt
@@ -100,6 +100,7 @@ set(bc_flags -c -foffload-lto -std=c++17 -fvisibility=hidden
-nocudalib -nogpulib -nogpuinc -nostdlibinc
-fopenmp -fopenmp-cuda-mode
-Wno-unknown-cuda-version -Wno-openmp-target
+ -Wno-unknown-assumption
-DOMPTARGET_DEVICE_RUNTIME
-I${include_directory}
-I${devicertl_base_directory}/../include
diff --git a/offload/DeviceRTL/include/DeviceTypes.h b/offload/DeviceRTL/include/DeviceTypes.h
index 259bc008f91d13..1cd044f432e569 100644
--- a/offload/DeviceRTL/include/DeviceTypes.h
+++ b/offload/DeviceRTL/include/DeviceTypes.h
@@ -15,15 +15,6 @@
#include <stddef.h>
#include <stdint.h>
-// Tell the compiler that we do not have any "call-like" inline assembly in the
-// device rutime. That means we cannot have inline assembly which will call
-// another function but only inline assembly that performs some operation or
-// side-effect and then continues execution with something on the existing call
-// stack.
-//
-// TODO: Find a good place for this
-#pragma omp assumes ext_no_call_asm
-
enum omp_proc_bind_t {
omp_proc_bind_false = 0,
omp_proc_bind_true = 1,
diff --git a/offload/DeviceRTL/include/Synchronization.h b/offload/DeviceRTL/include/Synchronization.h
index a4d4fc08837b29..96a2f8654e92ab 100644
--- a/offload/DeviceRTL/include/Synchronization.h
+++ b/offload/DeviceRTL/include/Synchronization.h
@@ -192,15 +192,14 @@ void threads(atomic::OrderingTy Ordering);
/// noinline is removed by the openmp-opt pass and helps to preserve the
/// information till then.
///{
-#pragma omp begin assumes ext_aligned_barrier
/// Synchronize all threads in a block, they are reaching the same instruction
/// (hence all threads in the block are "aligned"). Also perform a fence before
/// and after the barrier according to \p Ordering. Note that the
/// fence might be part of the barrier if the target offers this.
-[[gnu::noinline]] void threadsAligned(atomic::OrderingTy Ordering);
+[[gnu::noinline, omp::assume("ext_aligned_barrier")]] void
+threadsAligned(atomic::OrderingTy Ordering);
-#pragma omp end assumes
///}
} // namespace synchronize
From aae9eb78f18caf21173efcf7d59229d4a29a3f8f Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Mon, 20 Jan 2025 08:04:38 -0600
Subject: [PATCH 2/2] Update offload/DeviceRTL/CMakeLists.txt
Co-authored-by: Michael Kruse <github at meinersbur.de>
---
offload/DeviceRTL/CMakeLists.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/offload/DeviceRTL/CMakeLists.txt b/offload/DeviceRTL/CMakeLists.txt
index 3f647304b06f85..98b063a62530c0 100644
--- a/offload/DeviceRTL/CMakeLists.txt
+++ b/offload/DeviceRTL/CMakeLists.txt
@@ -100,7 +100,7 @@ set(bc_flags -c -foffload-lto -std=c++17 -fvisibility=hidden
-nocudalib -nogpulib -nogpuinc -nostdlibinc
-fopenmp -fopenmp-cuda-mode
-Wno-unknown-cuda-version -Wno-openmp-target
- -Wno-unknown-assumption
+ -Wno-unknown-assumption # TODO: Fix false-positive warning for ext_aligned_barrier
-DOMPTARGET_DEVICE_RUNTIME
-I${include_directory}
-I${devicertl_base_directory}/../include