[PATCH] D82324: [OPENMP]Dynamic globalization for parallel target regions.
Alexey Bataev via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Jun 24 13:01:47 PDT 2020
ABataev marked an inline comment as done.
ABataev added a comment.
In D82324#2112388 <https://reviews.llvm.org/D82324#2112388>, @jdoerfert wrote:
> Let me rephrase. Does the user need to request the fast path, or does the user need to request the slow but correct path? Only the former is acceptable IMHO.
By default, the universal but slower option is enabled. If the user is sure that there are no parallel target regions in their code, they can compile with `-fno-openmp-cuda-parallel-target-regions` to get better performance. I.e., `-fopenmp-cuda-parallel-target-regions` is enabled by default (slow, but reliable).
================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:5250
+ options::OPT_fno_openmp_cuda_parallel_target_regions,
+ /*Default=*/true))
+ CmdArgs.push_back("-fopenmp-cuda-parallel-target-regions");
----------------
The slow but reliable option is enabled by default here.
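To make the default explicit, here is a minimal standalone sketch of how a positive/negative flag pair with `/*Default=*/true` resolves. It is not the patch's code: `hasFlag` below is a hypothetical stand-in that assumes the usual driver convention that the last occurrence of either spelling wins, and `parallelTargetRegionsEnabled` and `checkExamples` are illustrative names only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the driver's flag lookup (an assumption, not the
// real API): the last occurrence of either spelling wins, and the default
// applies when neither spelling is present.
bool hasFlag(const std::vector<std::string> &args, const std::string &pos,
             const std::string &neg, bool defaultValue) {
  bool value = defaultValue;
  for (const std::string &arg : args) {
    if (arg == pos)
      value = true;
    else if (arg == neg)
      value = false;
  }
  return value;
}

// Illustrative helper: true means the slow but reliable mode is selected.
bool parallelTargetRegionsEnabled(const std::vector<std::string> &args) {
  return hasFlag(args, "-fopenmp-cuda-parallel-target-regions",
                 "-fno-openmp-cuda-parallel-target-regions",
                 /*defaultValue=*/true);
}

// Usage: the three cases discussed in this thread.
void checkExamples() {
  const std::vector<std::string> none;
  const std::vector<std::string> optOut = {
      "-fno-openmp-cuda-parallel-target-regions"};
  const std::vector<std::string> optOutThenIn = {
      "-fno-openmp-cuda-parallel-target-regions",
      "-fopenmp-cuda-parallel-target-regions"};

  assert(parallelTargetRegionsEnabled(none));         // default: reliable path
  assert(!parallelTargetRegionsEnabled(optOut));      // user requests fast path
  assert(parallelTargetRegionsEnabled(optOutThenIn)); // last flag wins
}
```

Under these assumptions, a user who passes nothing gets the reliable path, and the fast path must be requested explicitly.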
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D82324/new/
https://reviews.llvm.org/D82324