[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

Tue Aug 14 07:18:51 PDT 2018

gtbercea added a comment.

Thanks @Hahnfeld for your suggestions.

Unfortunately, doing the lowering in the backend would require replacing the math function calls with calls to libdevice functions. I have not been able to do that in an elegant way. Encoding the interface to libdevice is just not a clean process, not to mention that any changes to libdevice would have to be tracked manually with every new CUDA version. It does not make the code more maintainable; on the contrary, I think it makes it harder to track libdevice changes.
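To illustrate the maintenance concern, here is a minimal sketch of the kind of hand-written name table a backend lowering would need. The helper `toLibdeviceName` and the table itself are hypothetical and not part of this patch or of LLVM; only the `__nv_*` names are real libdevice entry points, and the list is deliberately incomplete.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper (not an LLVM API): maps a libm function name to the
// libdevice function that would replace it. Every entry is maintained by
// hand and would have to be re-checked against each new CUDA release.
inline std::string toLibdeviceName(const std::string &libmName) {
  static const std::unordered_map<std::string, std::string> table = {
      {"pow", "__nv_pow"},   {"powf", "__nv_powf"},
      {"sin", "__nv_sin"},   {"sinf", "__nv_sinf"},
      {"sqrt", "__nv_sqrt"}, {"sqrtf", "__nv_sqrtf"},
  };
  auto it = table.find(libmName);
  // Names with no known libdevice counterpart are left unchanged.
  return it == table.end() ? libmName : it->second;
}
```

Each entry in such a table is a point where the backend and libdevice can silently drift apart.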

On a related note, I checked and Clang-CUDA doesn't do the pow(a,2) -> a*a optimization. It is something that needs to be fixed for Clang-CUDA first before OpenMP can make use of it. The OpenMP-NVPTX toolchain is designed to sit on top of the CUDA toolchain. It therefore inherits all the Clang-CUDA benefits and, in this particular case, its limitations.
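For reference, a small host-side sketch of the pow(a,2) -> a*a rewrite under discussion. The function names are made up for illustration; this only shows the two source forms, not how either toolchain compiles them.

```cpp
#include <cmath>

// Form as written in user code: a call to the general-purpose pow routine.
double square_via_pow(double a) { return std::pow(a, 2.0); }

// Form after the strength reduction: a single multiplication.
double square_via_mul(double a) { return a * a; }
```

The two forms compute the same value; the second simply avoids the cost of a general pow call.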

As for the Sema check error you report (the one related to the x restriction), I think the fix you proposed is good and should be pushed in a separate patch.

Repository:
  rC Clang

https://reviews.llvm.org/D47849