[PATCH] D77954: [CUDA][HIP] Fix overload resolution issue for device host functions

Yaxun Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sat Apr 11 14:24:19 PDT 2020


yaxunl added a comment.

In D77954#1976316 <https://reviews.llvm.org/D77954#1976316>, @yaxunl wrote:

> In D77954#1976294 <https://reviews.llvm.org/D77954#1976294>, @rjmccall wrote:
>
> > If `nvcc` ignores host/device-ness when selecting overloads, that's probably the specified behavior, right?  I agree that it would be better to not ignore it, but Clang shouldn't just make up better rules for languages with external specifications.
>
>
> cuda-clang does not always follow nvcc's behavior. For example, cuda-clang only allows incomplete array types for extern shared variables, whereas nvcc allows other types. If cuda-clang were supposed to follow nvcc's behavior in every aspect, we would have to approve https://reviews.llvm.org/D73979 , but that is not the case.
>
> Therefore, I think we should discuss whether this is really a bug, and whether the fix can cause any unwanted side effect.


BTW, cuda-clang already differs considerably from nvcc in host/device-based overload resolution. For example, the following code is valid in cuda-clang before my change but invalid in nvcc: https://cuda.godbolt.org/z/qwpKZe . So if we want to follow nvcc's resolution rules, we need a total revamp of device/host-related overload resolution in cuda-clang.

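  // Host-only, non-template overload.
  __host__ int foo(int x) {
      return 1;
  }
  
  // Device-only template overload.
  template<class T>
  __device__ int foo(T x) {
      return 2;
  }
  
  // Device caller: cuda-clang accepts this call, nvcc rejects it.
  __device__ int bar() {
      return foo(1);
  }
  
  __global__ void test(int *a) {
      *a = bar();
  }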
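
For comparison, here is a minimal plain C++ sketch of what the same call resolves to when host/device attributes play no role at all. The macros HOST_ONLY and DEVICE_ONLY are hypothetical stand-ins that expand to nothing, and constexpr is added only so the result can be checked at compile time. If nvcc does resolve the call without regard to host/device-ness, as suggested earlier in this thread, this would be consistent with it picking the host-only foo(int) and then rejecting the call from device code.

```cpp
// Hypothetical stand-ins for __host__ / __device__: they expand to nothing,
// so overload resolution here sees no host/device distinction.
#define HOST_ONLY
#define DEVICE_ONLY

// Non-template overload: exact match for an int argument.
HOST_ONLY constexpr int foo(int) { return 1; }

// Template overload: also an exact match once T is deduced as int.
template <class T>
DEVICE_ONLY constexpr int foo(T) { return 2; }

// Both candidates match equally well, so standard C++ prefers the
// non-template function over the template specialization.
DEVICE_ONLY constexpr int bar() { return foo(1); }

static_assert(bar() == 1, "the non-template foo(int) is selected");
```

With the attributes taken into account, cuda-clang instead ends up calling the device template, which is why the snippet above compiles there.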


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77954/new/

https://reviews.llvm.org/D77954




