[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

Tue Feb 19 11:03:52 PST 2019

tra added a subscriber: rsmith.
tra added a comment.

In D56411#1400300 <https://reviews.llvm.org/D56411#1400300>, @rjmccall wrote:

> Okay, but it's not great design to have a kind of overloading that can't be resolved to an exact intended declaration even by an explicit cast.  That's why I think making *optional* host/device typing is a good idea.  And I strongly want to caution you against doing language design by just incrementally hacking at the compiler to progressively make more test-cases work, which is what it feels like you're doing.

+1. IMO, for templates to work sensibly in this situation, `__host__` / `__device__` must be part of the type.

I.e. extending the example above,

  __host__ int f() { return 1; }
  __device__ int f() { return 2; }
  template<int (*F)()> __global__ void t() { F(); }
  __host__ void g() { t<f><<<1,1>>>(); }
  __device__ void g() { t<f><<<1,1>>>(); } // technically legal in CUDA, though clang does not support it yet.

IMO, `t<f>` in `__host__` g() should be different from `t<f>` in `__device__` g(). That implies that 'device-ness' must be part of F's type, so that we get two different instantiations, which is what we want to see in the AST.
The calling context is somewhat irrelevant for template instantiations. E.g. one could have explicitly instantiated the template at global scope.
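
To make the "two different instantiations" point concrete, here is a rough host-only C++ model of it. It is only a sketch: `HostTag`, `DeviceTag`, `g_host` and `g_device` are made-up names standing in for the `__host__` / `__device__` attributes and the two `g()` overloads, and none of this is actual CUDA or clang behavior. It just shows what falls out once the target is encoded in F's type.

```cpp
#include <cassert>

// Hypothetical tag types standing in for __host__ / __device__.
// They are not part of CUDA or clang; they only model
// "target is part of the function type".
struct HostTag {};
struct DeviceTag {};

// Two overloads of f whose types differ only in the target tag.
int f(HostTag) { return 1; }    // models: __host__ int f()
int f(DeviceTag) { return 2; }  // models: __device__ int f()

// Models the template kernel t<F>. Because the tag is part of F's type,
// the name f resolves to exactly one overload per instantiation.
template <typename Target, int (*F)(Target)>
int t() { return F(Target{}); }

// Model the two callers; each one names a distinct instantiation of t.
int g_host() { return t<HostTag, f>(); }
int g_device() { return t<DeviceTag, f>(); }
```

In this model the choice of overload is driven entirely by F's type, not by who calls `t`, so an explicit instantiation at global scope would be just as unambiguous. An explicit cast such as `static_cast<int (*)(HostTag)>(f)` also names exactly one declaration, which is the property @rjmccall was asking for.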

@rsmith Any suggestions on how we could deal with this situation in a principled way?

https://reviews.llvm.org/D56411