[PATCH] D80450: [CUDA][HIP] Fix implicit HD function resolution
Yaxun Liu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Nov 26 08:42:32 PST 2020
yaxunl added a comment.
In D80450#2088129 <https://reviews.llvm.org/D80450#2088129>, @tra wrote:
> In D80450#2087938 <https://reviews.llvm.org/D80450#2087938>, @tra wrote:
>
>> Reproducer for the regression. https://gist.github.com/Artem-B/183e9cfc28c6b04c1c862c853b5d9575
>> It's not particularly small, but that's as far as I could get it reduced.
>>
>> With the patch, an attempt to instantiate `ag` on line 36 (in the reproducer sources I linked to above) results in ambiguity between two templates on lines 33 and 24 that are in different namespaces.
>> Previously it picked the template on line 28.
>
> Managed to simplify the reproducer down to this which now reports that a host candidate has been ignored. This may explain why we ended up with the ambiguity when other overloads were present.
>
> template <typename> struct a {};
> namespace b {
> struct c : a<int> {};
> template <typename d> void ag(d);
> } // namespace b
> template <typename ae>
> __attribute__((host)) __attribute__((device)) int ag(a<ae>) {
> ae e;
> ag(e);
> }
> void f() { ag<b::c>; }
The error only happens in device compilation.
For the call `ag(e)`. There are two candidates:
1. `ag` in namespace `b`. The function arguments can match. However it is a host function, therefore is a wrong-sided candidate and not viable.
2. `ag` in default name space. It is a host device function. However the function arguments requires `a<ae>`, therefore cannot match.
Before my patch, wrong-sided candidate is allowed. clang resolves to candidate 1 and this results in a diagnostic about host function referenced in device host function, which can be deferred. Since `f()` is not emitted on device side, the deferred diags is not emitted.
After my patch, wrong-sided candidate is not allowed. clang resolves to candidate 2, which results in a diagnostic that no matching function, which is not a deferred diagnostics by default and is emitted even if `f()` is not emitted on device side.
In a sense, the diagnostic is correct, since `ag(a<ae>)` cannot be emitted on device side. This can be fixed by either make `ag(a<ae>)` a host function or make `ag(d)` a host device function.
In the original testcase (https://gist.github.com/Artem-B/183e9cfc28c6b04c1c862c853b5d9575)
Before my change, call at line 36 resolves to wrong-sided candidate at line 29 since that is the best match for argument types. This results in a deferred diag which allows device compilation to pass.
After my change, call at line 36 resolves to two host device candidates. This results in diagnostics about ambiguity which is not deferred by default. Therefore the compilation fails.
Basically it all boils down to the issue that overloading resolution diagnostics are not deferred by default.
I think first of all we need to exclude wrong-sided candidates as this patch does, otherwise we cannot have correct hostness based overloading resolution and fix bugs like https://bugs.llvm.org/show_bug.cgi?id=46922 .
However by doing this we changes the existing overloading resolution incur some overloading resolution diags. Unless we defer these diags, we may break some existing CUDA/HIP code.
Fortunately https://reviews.llvm.org/D84364 is already landed, which allows deferring overloading resolution diags under option -fgpu-defer-diags.
I think a conservative solution is that we keep the old overloading resolution behavior by default (i.e. do not exclude wrong-sided candidates), whereas enable the correct hostness based overloading resolution when -fgpu-defer-diags is on. If developers would like correct hostness based overloading resolution, they can use -fgpu-defer-diags. Then as -fgpu-defer-diags become stable, we turn it on by default.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D80450/new/
https://reviews.llvm.org/D80450
More information about the cfe-commits
mailing list