[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

Thu Feb 14 11:28:29 PST 2019

yaxunl added a comment.

In D56411#1398291 <https://reviews.llvm.org/D56411#1398291>, @tra wrote:

> >> That said, does CUDA have a general rule resolving `__host__` vs. `__device__` overloads based on context?  And does it allow overloading based solely on `__host__` vs. `__device__`?
>
> NVCC does not. Clang does. See https://goo.gl/EXnymm for the details.
>
> AFAICT, NVIDIA is starting to consider adopting Clang's approach:
>  http://lists.llvm.org/pipermail/cfe-dev/2018-November/060070.html (original message from Bryce apparently didn't make it to the cfe-dev archive)

So my concern about checking host/device compatibility in template instantiation is still valid.

I verified that the following code is valid with clang:

  #define __device__ __attribute__((device))

  __device__ void f();

  void f();

  __device__ void g() {
    f();
  }

  template<void (*F)()> __device__ void t() {
    F();
  }

  __device__ void h() {
    t<f>();
  }

To resolve a function-type template argument based on its host/device attribute, we need to do the check before template instantiation.
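
As a rough analogy in plain C++ (not CUDA, and not what the patch itself does), the sketch below shows why the choice has to be made early: when an overloaded name is passed as a non-type template argument, the overload is selected at the point where the specialization is named, not inside the instantiated body. The names `pick`, `call_through`, and `check_selection` are hypothetical, and the overloads differ by parameter type only because standard C++ cannot overload on an attribute alone.

```cpp
#include <cassert>

// Hypothetical overload set standing in for the host/device pair f() above.
// The candidates differ by parameter type because plain C++ has no
// attribute-based overloading.
constexpr int pick(int)    { return 1; }
constexpr int pick(double) { return 2; }

// Non-type template parameter of function-pointer type, like t<F> above.
template <int (*F)(int)>
constexpr int call_through() {
  return F(0);
}

// Naming call_through<pick> resolves the overload set against the parameter
// type int (*)(int) before the body of call_through is instantiated.
static_assert(call_through<pick>() == 1, "pick(int) was selected");

// Runtime version of the same check.
inline void check_selection() {
  assert(call_through<pick>() == 1);
}
```

By the time `F(0)` is compiled, `F` already denotes a single function, so any host/device preference would have to be applied when the template argument is formed.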

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56411/new/

https://reviews.llvm.org/D56411