[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls
Gheorghe-Teodor Bercea via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Aug 23 08:16:43 PDT 2018
gtbercea added inline comments.
================
Comment at: lib/Basic/Targets/NVPTX.cpp:232
+ // getting inlined on the device.
+ Builder.defineMacro("__NO_MATH_INLINES");
}
----------------
tra wrote:
> This relies on implementation detail of particular variant of the header file you're assuming all compilations will include. This is a workaround of the real problem (attempting to use headers from machine X while targeting Y) at best.
>
> D50845 is dealing with the issue of headers for target code. Hopefully, they'll find a way to provide device-specific headers, so you don't rely on host headers being parseable during device-side compilation.
I agree. The proper fix would be what the other patch is attempting to do.
================
Comment at: lib/Driver/ToolChains/Clang.cpp:4758
+ // toolchain.
+ CmdArgs.push_back("-fno-math-builtin");
}
----------------
tra wrote:
> Could you elaborate on why you don't want the builtins?
> Builtins are enabled and are useful for CUDA. What makes their use different for OpenMP?
> Are you doing it to guarantee that math functions remain unresolved in IR so you could link them in from external bitcode?
>
That's right. I don't particularly like this approach as this leads to OpenMP-NVPTX toolchain missing out on optimizations such as replacing math function call with basic operations ( pow(a,2) -> a*a for example).
I am trying to fix this in a future patch by allowing intrinsics/builtins to propagate.
Repository:
rC Clang
https://reviews.llvm.org/D47849
More information about the cfe-commits
mailing list