[PATCH] D77954: [CUDA][HIP] Fix overload resolution issue for device host functions
Yaxun Liu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Sat Apr 11 14:24:19 PDT 2020
yaxunl added a comment.
In D77954#1976316 <https://reviews.llvm.org/D77954#1976316>, @yaxunl wrote:
> In D77954#1976294 <https://reviews.llvm.org/D77954#1976294>, @rjmccall wrote:
>
> > If `nvcc` ignores host/device-ness when selecting overloads, that's probably the specified behavior, right? I agree that it would be better to not ignore it, but Clang shouldn't just make up better rules for languages with external specifications.
>
>
> cuda-clang does not always follow nvcc's behavior. For example, cuda-clang only allows incomplete array types for extern shared variables, whereas nvcc allows other types. If cuda-clang were supposed to follow nvcc's behavior in every aspect, we should approve https://reviews.llvm.org/D73979 , but that is not the case.
>
> Therefore, I think we should discuss whether this is really a bug, and whether the fix can cause any unwanted side effect.
BTW, cuda-clang already differs considerably from nvcc in host/device-based overload resolution. For example, the following code is valid in cuda-clang before my change but invalid in nvcc: https://cuda.godbolt.org/z/qwpKZe . So if we want to follow nvcc's resolution rules, we would need a total revamp of host/device-related overload resolution in cuda-clang.
  __host__ int foo(int x) {
    return 1;
  }

  template<class T>
  __device__ int foo(T x) {
    return 2;
  }

  __device__ int bar() {
    return foo(1);
  }

  __global__ void test(int *a) {
    *a = bar();
  }
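
For comparison, here is a minimal sketch of what ordinary C++ overload resolution does with the same call once the host/device attributes are dropped. It is only an analogue, under the assumption that the target attributes play no part in ranking candidates; it does not claim to reproduce nvcc's actual implementation. Under standard C++ rules both candidates are exact matches for `foo(1)`, and the non-template function wins the tie-break, so the call binds to the overload that corresponds to the `__host__` function above. That would be consistent with nvcc rejecting the call from device code.

```cpp
// Plain C++ analogue of the CUDA example, with __host__/__device__ removed.
// Assumption: this models a compiler that ignores host/device-ness
// when ranking overload candidates.

// Corresponds to the __host__ overload.
constexpr int foo(int) { return 1; }

// Corresponds to the __device__ template overload.
template <class T>
constexpr int foo(T) { return 2; }

// Corresponds to the __device__ caller.
constexpr int bar() { return foo(1); }

// Both candidates are exact matches for foo(1); standard C++ prefers
// the non-template function over the template specialization.
static_assert(bar() == 1, "non-template foo(int) is selected");
```

cuda-clang, by contrast, takes the caller's target into account, which is why it accepts the original code and picks the `__device__` template.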
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D77954/new/
https://reviews.llvm.org/D77954