[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

Thu Aug 16 13:58:01 PDT 2018

tra added a comment.

In https://reviews.llvm.org/D50845#1203031, @gtbercea wrote:

> In https://reviews.llvm.org/D50845#1202991, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D50845#1202965, @Hahnfeld wrote:
> >
> > > In https://reviews.llvm.org/D50845#1202963, @hfinkel wrote:
> > >
> > > > As a result, we should really have a separate header that has those actually-available functions. When targeting NVPTX, why don't we have the included math.h be CUDA's math.h? In the end, those are the functions we need to call when we generate code. Right?
> > >
> > >
> > > That's what https://reviews.llvm.org/D47849 deals with.
> >
> >
> > Yes, but it doesn't get CUDA's math.h. Maybe I misunderstand how this works (and I very well might, because it's not clear that CUDA has a math.h by that name), but that patch tries to avoid problems with the host's math.h and then also injects __clang_cuda_device_functions.h into the device compilation. How does this compare to when you include math.h in Clang's CUDA mode? It seems to be that we want to somehow map standard includes, where applicable, to include files in CUDA's include/crt directory (e.g., crt/math_functions.h and crt/common_functions.h for stdio.h for printf), and nothing else ends up being available (because it is, in fact, not available).
>
>
> There's no CUDA specific math.h unless you want to regard clang_cuda_device_functions.h as a math header.

True. We rely on the CUDA SDK, which defines a subset of the standard libc/libm functions with the `__device__` attribute.

__clang_cuda_device_functions.h just provides substitutes for the functions that became nvcc builtins and are no longer implemented in the CUDA headers.
It's not supposed to replace math.h, and it may change with the next version of CUDA, when it may need to cope with some other quirk of CUDA's headers.
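
To illustrate the kind of substitute meant above, here is a minimal sketch, not the actual contents of any CUDA or clang header. Every name in it (`SKETCH_DEVICE`, `hypothetical_device_sinf`, `sketch::sinf`, `check_wrapper`) is made up for illustration, and the "intrinsic" is just a stand-in that calls `std::sin`. The attribute is widened to `__host__ __device__` only so the host-side check can call the wrapper; outside a CUDA compiler it is defined away so the file builds as plain C++.

```cpp
#include <cassert>
#include <cmath>

// Assumption: under a CUDA compiler we mark functions __host__ __device__;
// in plain C++ the attributes do not exist, so the macro expands to nothing.
#ifdef __CUDACC__
#define SKETCH_DEVICE __host__ __device__
#else
#define SKETCH_DEVICE
#endif

// Hypothetical stand-in for a device-library intrinsic. A real wrapper header
// would forward to whatever the device library actually provides.
SKETCH_DEVICE inline float hypothetical_device_sinf(float x) {
  return std::sin(x);
}

namespace sketch {
// A standard-looking name that simply redirects to the device implementation.
// Kept in a namespace so it cannot collide with the real ::sinf.
SKETCH_DEVICE inline float sinf(float x) {
  return hypothetical_device_sinf(x);
}
} // namespace sketch

// Host-side sanity check: the wrapper must agree with the function it forwards to.
inline void check_wrapper() {
  assert(sketch::sinf(0.0f) == 0.0f);
  assert(sketch::sinf(1.0f) == hypothetical_device_sinf(1.0f));
}
```

The point is only the shape: a thin forwarding layer that depends on what the SDK headers happen to provide, which is why such a header tracks CUDA versions rather than acting as a general math.h replacement.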

> The patch is using the same approach as CUDA and redirecting the function calls to device specific function calls. The parts of that patch which deal with host header compatibility would more naturally belong in a patch like this one so ultimately they won't be part of that patch. I'm currently working on improving the patch though by eliminating the clang_cuda_device_functions.h injection and eliminating the need to disable the built-ins.

This sounds great. Once you have a device-side implementation of the math library, it would probably be worth considering making CUDA use it instead of the current hacks that adapt to the CUDA headers. That would simplify things a bit and give us much better control over the implementation.

Repository:
  rC Clang

https://reviews.llvm.org/D50845