[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Aug 22 15:16:44 PDT 2018


tra added a comment.

`__clang_cuda_device_functions.h` is not intended to be a device-side math.h, despite having a lot of overlap/similarities. It may change at any time, whenever we pick up a new CUDA version.
I would suggest writing an OpenMP-specific replacement for math.h which would map to whatever device-specific functions OpenMP needs. For NVPTX that may be libdevice, for which you have declarations in `__clang_cuda_libdevice_declares.h`. Reusing part of `__clang_cuda_device_functions.h` may be a decent starting point for NVPTX, but OpenMP will likely need to provide an equivalent for other back ends, too.
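To make that concrete, here's a rough sketch of what an NVPTX-only wrapper could look like. The header name, the macro trick, and the set of functions are made up for illustration; a real header would need extern "C" handling for C++, float/long double variants, and it assumes the wrapper directory is placed before the host headers so `#include_next` works:

  #ifndef __OPENMP_NVPTX_MATH_WRAPPER_H__
  #define __OPENMP_NVPTX_MATH_WRAPPER_H__

  #include_next <math.h>          /* pick up the host declarations first */

  #if defined(__NVPTX__) && defined(_OPENMP)
  /* Normally these declarations would come from
     __clang_cuda_libdevice_declares.h. */
  double __nv_sin(double);
  double __nv_cos(double);
  double __nv_sqrt(double);

  /* Crude redirection for the device compilation pass only; a real
     implementation would use inline functions/overloads, not macros. */
  #define sin(__x)  __nv_sin(__x)
  #define cos(__x)  __nv_cos(__x)
  #define sqrt(__x) __nv_sqrt(__x)
  #endif

  #endif /* __OPENMP_NVPTX_MATH_WRAPPER_H__ */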



================
Comment at: lib/Basic/Targets/NVPTX.cpp:232
+  // getting inlined on the device.
+  Builder.defineMacro("__NO_MATH_INLINES");
 }
----------------
This relies on an implementation detail of a particular variant of the header file that you're assuming all compilations will include. At best it's a workaround for the real problem (attempting to use headers from machine X while targeting machine Y).

D50845 is dealing with the issue of headers for target code. Hopefully they'll find a way to provide device-specific headers, so you don't have to rely on host headers being parseable during device-side compilation.
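For anyone else following along, the implementation detail in question is roughly this glibc-style pattern (simplified and paraphrased from memory, not an exact quote of any header):

  /* bits/mathinline.h, roughly: inline definitions are only emitted
     when __NO_MATH_INLINES is NOT defined. */
  #if defined __GNUC__ && !defined __NO_MATH_INLINES
  extern __inline double fabs (double __x)
  {
    return __builtin_fabs (__x);  /* real headers may use host-specific
                                     inline asm here, which an NVPTX
                                     device compilation cannot digest */
  }
  #endif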


================
Comment at: lib/Driver/ToolChains/Clang.cpp:4758
+    // toolchain.
+    CmdArgs.push_back("-fno-math-builtin");
   }
----------------
Could you elaborate on why you don't want the builtins?
Builtins are enabled for CUDA and are useful there. What makes their use different for OpenMP?
Are you doing it to guarantee that math functions remain unresolved in IR, so that you can link them in from external bitcode?
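If so, for the record, my understanding of what the flag changes (hypothetical example, not from this patch):

  #include <math.h>

  double f(double x) { return sqrt(x); }

  /* Default (sqrt treated as a builtin): clang/LLVM may fold constant
   * arguments, rewrite the call to @llvm.sqrt under suitable fast-math
   * flags, etc., so a plain call to "sqrt" is not guaranteed to survive
   * into the IR.
   *
   * With -fno-math-builtin: the IR keeps an ordinary external call,
   *     %call = call double @sqrt(double %x)
   * which can then be resolved later, e.g. by linking in a device-side
   * bitcode math library before PTX code generation.
   */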



Repository:
  rC Clang

https://reviews.llvm.org/D47849




