[PATCH] D82324: [OPENMP]Dynamic globalization for parallel target regions.

Alexey Bataev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jun 24 13:01:47 PDT 2020


ABataev marked an inline comment as done.
ABataev added a comment.

In D82324#2112388 <https://reviews.llvm.org/D82324#2112388>, @jdoerfert wrote:

> Let me rephrase. Does the user need to request the fast path, or does the user need to request the slow but correct path? Only the former is acceptable IMHO.


By default, the universal but slower option is enabled. If users are sure that there are no parallel target regions in their code, they can compile with `-fno-openmp-cuda-parallel-target-regions` to get better performance. I.e. `-fopenmp-cuda-parallel-target-regions` is enabled by default (slow, but reliable).



================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:5250
+                       options::OPT_fno_openmp_cuda_parallel_target_regions,
+                       /*Default=*/true))
+        CmdArgs.push_back("-fopenmp-cuda-parallel-target-regions");
----------------
The slow but reliable option is enabled by default here.
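
To make the default concrete, here is a minimal standalone sketch of how a positive/negative flag pair with `/*Default=*/true` resolves. It only models the usual "last occurrence wins, otherwise use the default" behavior; `resolveFlag` and `parallelTargetRegionsEnabled` are hypothetical helper names for illustration, not functions from the patch or from the clang driver.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a positive/negative flag pair: the last occurrence
// on the command line wins, and the default applies when neither is given.
bool resolveFlag(const std::vector<std::string> &args,
                 const std::string &positive,
                 const std::string &negative,
                 bool defaultValue) {
  bool value = defaultValue;
  for (const std::string &arg : args) {
    if (arg == positive)
      value = true;
    else if (arg == negative)
      value = false;
  }
  return value;
}

// Assumed mapping for this review: the default is true, i.e. the slow but
// reliable mode stays on unless the user explicitly opts out.
bool parallelTargetRegionsEnabled(const std::vector<std::string> &args) {
  return resolveFlag(args,
                     "-fopenmp-cuda-parallel-target-regions",
                     "-fno-openmp-cuda-parallel-target-regions",
                     /*defaultValue=*/true);
}
```

Under this model, an empty command line keeps the reliable mode on, passing only the `-fno-` form switches to the fast path, and when both forms appear the later one takes effect.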


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82324/new/

https://reviews.llvm.org/D82324
