[PATCH] D80450: [CUDA][HIP] Fix HD function resolution

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Nov 30 12:01:08 PST 2020


tra added a comment.

In D80450#2418775 <https://reviews.llvm.org/D80450#2418775>, @yaxunl wrote:

> For the call `ag(e)`. There are two candidates:
>
> 1. `ag` in namespace `b`. The function arguments can match. However it is a host function, therefore is a wrong-sided candidate and not viable.
>
> 2. `ag` in default name space. It is a host device function. However the function arguments requires `a<ae>`, therefore cannot match.
>
> Before my patch, wrong-sided candidate is allowed. clang resolves to candidate 1 and this results in a diagnostic about host function referenced in device host function, which can be deferred. Since `f()` is not emitted on device side, the deferred diags is not emitted.

This used to be a fairly common pattern in existing CUDA code. A lot of templated code had `__host__ __device__` slapped on it because NVCC had no target overloading and it's hard to control who/where/how will instantiate particular template. Some of those templates could only be instantiated on one side of the compilation. Clang's only choice was to allow the wrong-side candidates and/or defer the diagnostics. I vaguely recall that it was one of the trickier bits of the overload resolution rules to handle.

> After my patch, wrong-sided candidate is not allowed. clang resolves to candidate 2, which results in a diagnostic that no matching function, which is not a deferred diagnostics by default and is emitted even if `f()` is not emitted on device side.
> ...
> Basically it all boils down to the issue that overloading resolution diagnostics are not deferred by default.

Looks that way.

> I think a conservative solution is that we keep the old overloading resolution behavior by default (i.e. do not exclude wrong-sided candidates), whereas enable the correct hostness based overloading resolution when -fgpu-defer-diags is on. If developers would like correct hostness based overloading resolution, they can use -fgpu-defer-diags. Then as -fgpu-defer-diags become stable, we turn it on by default.

SGTM. I'll check how the patch fares on our CUDA code.



================
Comment at: clang/test/SemaCUDA/function-overload.cu:614
+
+// Two HD candidates competes with H candidate.
+// HDs have type mismatch whereas H has type match.
----------------
competes->compete


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80450/new/

https://reviews.llvm.org/D80450



More information about the cfe-commits mailing list