[PATCH] D83591: [OpenMP][CUDA] Fix std::complex in GPU regions

Fri Jul 10 16:53:08 PDT 2020

jdoerfert added a comment.

In D83591#2145411 <https://reviews.llvm.org/D83591#2145411>, @tra wrote:

> In D83591#2145378 <https://reviews.llvm.org/D83591#2145378>, @JonChesterfield wrote:
>
> > Fine by me. Let's get nvptx working properly in tree now and work out how to wire up amdgcn subsequently. I'm sure a reasonable abstraction will present itself.
>
>
> I'm missing something -- what was wrong with the changes in D80897 <https://reviews.llvm.org/D80897> ?

It doesn't work for OpenMP. The problem is that we overload some of the math functions fine, e.g., `sin(float)` but not the template ones. So when the code below calls `copysign(int, double)` (or something similar), the OpenMP target variant overload is missing. I have template overload support locally but it needs tests and there is one issue I've seen. This was supposed to be a stopgap as it unblocks the OpenMP mode.

> AMD's HIP compilation already piggy-backs on using clang's C++ wrappers, so this change will likely break them now and I'll be the first in line to revert the change.

I did not know they are using __clang_cuda headers. (Site note, we should rename them then.)

> @yaxunl -- Sam, does this change affect HIP compilation? If it does, perhaps we should keep C++-based macro definitions around.

Sure, I can do this only in OpenMP mode and keep the proper C++ std functions in C++ mode. Does that sound good?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83591/new/

https://reviews.llvm.org/D83591