[clang] [OpenMP] Introduce -fopenmp-force-usm flag (PR #75468)
via cfe-commits
cfe-commits at lists.llvm.org
Thu Dec 14 04:25:54 PST 2023
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-clang-driver
Author: Jan Patrick Lehr (jplehr)
<details>
<summary>Changes</summary>
The new flag implements logic to include `#pragma omp requires unified_shared_memory` in every translation unit.
This enables a straightforward way to enable USM for an application without the need to modify sources.
This is the flag mentioned in https://github.com/llvm/llvm-project/pull/75467
Once the test landed, I'll rebase and enable the test with this patch.
---
Full diff: https://github.com/llvm/llvm-project/pull/75468.diff
4 Files Affected:
- (modified) clang/include/clang/Driver/Options.td (+2)
- (modified) clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp (+14)
- (modified) clang/lib/Headers/CMakeLists.txt (+1)
- (added) clang/lib/Headers/openmp_wrappers/usm/force_usm.h (+6)
``````````diff
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 1b02087425b751..b9cd3043a13a9a 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3381,6 +3381,8 @@ def fopenmp_cuda_blocks_per_sm_EQ : Joined<["-"], "fopenmp-cuda-blocks-per-sm=">
Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[ClangOption, CC1Option]>;
def fopenmp_cuda_teams_reduction_recs_num_EQ : Joined<["-"], "fopenmp-cuda-teams-reduction-recs-num=">, Group<f_Group>,
Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[ClangOption, CC1Option]>;
+def fopenmp_force_usm : Flag<["-"], "fopenmp-force-usm">, Group<f_Group>,
+ Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[CC1Option]>;
//===----------------------------------------------------------------------===//
// Shared cc1 + fc1 OpenMP Target Options
diff --git a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
index b012b7cb729378..2484a59085c276 100644
--- a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
@@ -129,6 +129,20 @@ AMDGPUOpenMPToolChain::GetCXXStdlibType(const ArgList &Args) const {
void AMDGPUOpenMPToolChain::AddClangSystemIncludeArgs(
const ArgList &DriverArgs, ArgStringList &CC1Args) const {
HostTC.AddClangSystemIncludeArgs(DriverArgs, CC1Args);
+
+ CC1Args.push_back("-internal-isystem");
+ SmallString<128> P(HostTC.getDriver().ResourceDir);
+ llvm::sys::path::append(P, "include/cuda_wrappers");
+ CC1Args.push_back(DriverArgs.MakeArgString(P));
+
+ // Force APU mode will focefully include #pragma omp requires
+ // unified_shared_memory via the force_usm header
+ if (DriverArgs.hasArg(options::OPT_fopenmp_force_usm)) {
+ CC1Args.push_back("-include");
+ CC1Args.push_back(
+ DriverArgs.MakeArgString(HostTC.getDriver().ResourceDir +
+ "/include/openmp_wrappers/force_usm.h"));
+ }
}
void AMDGPUOpenMPToolChain::AddIAMCUIncludeArgs(const ArgList &Args,
diff --git a/clang/lib/Headers/CMakeLists.txt b/clang/lib/Headers/CMakeLists.txt
index f8fdd402777e48..aac232fa8b4405 100644
--- a/clang/lib/Headers/CMakeLists.txt
+++ b/clang/lib/Headers/CMakeLists.txt
@@ -319,6 +319,7 @@ set(openmp_wrapper_files
openmp_wrappers/__clang_openmp_device_functions.h
openmp_wrappers/complex_cmath.h
openmp_wrappers/new
+ openmp_wrappers/usm/force_usm.h
)
set(llvm_libc_wrapper_files
diff --git a/clang/lib/Headers/openmp_wrappers/usm/force_usm.h b/clang/lib/Headers/openmp_wrappers/usm/force_usm.h
new file mode 100644
index 00000000000000..15c394e27ce9c2
--- /dev/null
+++ b/clang/lib/Headers/openmp_wrappers/usm/force_usm.h
@@ -0,0 +1,6 @@
+#ifndef __CLANG_FORCE_OPENMP_USM
+#define __CLANG_FORCE_OPENMP_USM
+
+#pragma omp requires unified_shared_memory
+
+#endif
``````````
</details>
https://github.com/llvm/llvm-project/pull/75468
More information about the cfe-commits
mailing list