[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

Yaxun Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Feb 15 16:39:47 PST 2019


yaxunl added a comment.

In D56411#1398586 <https://reviews.llvm.org/D56411#1398586>, @rjmccall wrote:

> But what we've just been talking about is not a validity rule, it's an overload-resolution rule.  It's not *invalid* to use a device function as a template argument to a host function template (or to a class template, which of course is neither host nor device).  All you need to do is to resolve otherwise-intractable overload ambiguities by matching with the host-ness of the current context, which there's probably already code to do for when an overload set is used as e.g. a function argument.


OK, I found the code that resolves a function type template argument. CheckTemplateArgument calls ResolveAddressOfOverloadedFunction, which creates an AddressOfFunctionResolver. The AddressOfFunctionResolver constructor calls AddMatchingNonTemplateFunctions to populate the candidate set, and that is where the host-ness of a CUDA function is checked to decide whether the function is added as a candidate:

https://github.com/llvm-mirror/clang/blob/master/lib/Sema/SemaOverload.cpp#L11174

However, as shown in the above link, there is one issue on that line, which is better demonstrated by the following test case:

  __host__ int f() { return 1; }
  __device__ int f() { return 2; }
  template<int (*F)()> __global__ void t() { F(); }
  __host__ void g() { t<f><<<1,1>>>(); }

In t<f>, f should resolve to `__device__ f`, because the true user of f is not g but the template t, or whatever is in t. Since t is a kernel, and a kernel can only call device functions, f should resolve to `__device__ f` rather than `__host__ f`.

However, clang currently resolves f to `__host__ f`, because it treats S.CurContext as the caller, and S.CurContext is g.

The problem is that although f is referenced in g, it is not called by g. In this case, f is passed to a kernel template, and a kernel template can call device functions, so f can be a device function.

The issue is that S.CurContext is not conveying the real caller or user of f in AddressOfFunctionResolver. To convey that information, a new member TemplateUser may need to be added to AddressOfFunctionResolver so that it knows that it is resolving a template argument and which template is using that argument.
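
To make the idea concrete, here is a rough standalone model of the selection rule. It is not clang code: the Target enum, the Candidate struct, and the canUse and resolve functions are all hypothetical names made up for illustration, and the host/device compatibility rule is deliberately simplified. The only point is that the host-ness check should be done against the template that uses the argument when there is one, and against the current context otherwise.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of CUDA function targets; not clang's own enum.
enum class Target { Host, Device, Global };

// Hypothetical stand-in for one overload candidate.
struct Candidate {
    std::string name;
    Target target;
};

// Simplified rule: host code uses host functions, while kernels and
// device functions use device functions.
inline bool canUse(Target user, Target callee) {
    if (user == Target::Host) return callee == Target::Host;
    return callee == Target::Device;
}

// currentContext models S.CurContext. templateUser models the proposed
// TemplateUser member and is empty when the name being resolved is not
// a template argument.
inline std::optional<Candidate> resolve(const std::vector<Candidate>& overloads,
                                        Target currentContext,
                                        std::optional<Target> templateUser) {
    // Prefer the real user of the function over the enclosing context.
    const Target user = templateUser.value_or(currentContext);
    for (const Candidate& c : overloads) {
        if (canUse(user, c.target)) return c;
    }
    return std::nullopt;
}
```

With the two overloads of f from the test case, resolving against g alone (host context, no template user) picks the host overload in this model, while passing the kernel template t as the user picks the device overload.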


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56411/new/

https://reviews.llvm.org/D56411




