[clang] [CUDA][HIP] Fix deduction guide (PR #69366)

Yaxun Liu via cfe-commits cfe-commits at lists.llvm.org
Wed Nov 15 12:20:51 PST 2023


yxsamliu wrote:

> > @ldionne - Can you take a look if that would have unintended consequences for libc++?
> 
> Honestly, I don't know. I don't know CUDA nearly well enough to understand all the implications here. All I know is that this seems to be a pretty significant "fork" of C++ in terms of its semantics, and the likelihood that everything will just happen to work as designed is kinda small (but hopefully it does). In my (uneducated) opinion, host vs device should probably be handled closer to a link-time failure. That way you'd steer clear of any complicated front-end concepts like SFINAE, overload resolution and all the stuff that is incredibly complicated in C++. If you modify any of the rules there, the likelihood of introducing issues is really large IMO.

I do agree that the further we can defer host/device-based overload resolution, the better. However, I doubt we can avoid host/device-based overload resolution without breaking existing CUDA/HIP code.

The reason is that we need correct overload resolution to create the correct AST, especially when template instantiation is involved. When we resolve overloaded functions, the host candidate and the device candidate can have different signatures. If we do not consider host/device attributes, we could end up calling a host function on the device side because it is a better match for the argument types. All subsequent AST creation is then wrong.
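
To make this concrete, here is a minimal sketch. The names `pick` and `from_device` are hypothetical, and the macro fallback is only an assumption so that the same file also builds as plain C++, which stands in for "resolution that ignores the attributes". In a real CUDA/HIP compilation I would expect the device-side call to resolve to the `__device__` overload instead, which changes the deduced return type and everything built on top of it.

```cpp
#include <type_traits>

// Assumption: outside a CUDA/HIP compilation the attributes are defined away,
// so this file also builds as plain C++ with the attributes ignored.
#if !defined(__CUDACC__) && !defined(__HIP__)
#define __host__
#define __device__
#define ATTRIBUTES_IGNORED 1
#else
#define ATTRIBUTES_IGNORED 0
#endif

// Hypothetical overload set: the two candidates differ in parameter type
// and in return type.
__host__ inline int pick(int) { return 1; }
__device__ inline double pick(double) { return 2.0; }

// A device-side caller passing an int. The deduced return type records
// which candidate overload resolution selected.
__device__ inline auto from_device() { return pick(1); }

#if ATTRIBUTES_IGNORED
// With the attributes ignored, pick(int) is the exact match, so the call
// resolves to the host function and the expression has type int.
static_assert(std::is_same<decltype(from_device()), int>::value,
              "attribute-blind resolution selects the host overload");
#endif
```

The return type of `from_device` depends on which candidate wins, so any template instantiated on that type would differ as well.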

To avoid host/device-based overload resolution, we would have to restrict overloading so that ignoring host/device attributes does not affect the created AST. For example, we could allow only host device functions, or require that each host function have a corresponding device function with the same signature. We could add a CUDA/HIP extension that requires host/device overloading to satisfy this restriction. I can see many things being simplified with such an extension.
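
A rough sketch of code that would satisfy such a restriction, with hypothetical names (`square`, `scale`, `twice_plus_one`) and the same assumed macro fallback for plain C++ builds. It relies on Clang accepting `__host__` and `__device__` overloads with identical signatures; this is an illustration of the idea, not a proposed design.

```cpp
// Assumption: outside a CUDA/HIP compilation the attributes are defined away.
#if !defined(__CUDACC__) && !defined(__HIP__)
#define __host__
#define __device__
#define PLAIN_CXX 1
#else
#define PLAIN_CXX 0
#endif

// Option 1: a single __host__ __device__ function, so there is only one
// candidate regardless of which side the caller is on.
__host__ __device__ inline int square(int x) { return x * x; }

// Option 2: a host function paired with a device function that has the
// same signature. The device copy is skipped in plain C++ builds, where it
// would be a redefinition.
__host__ inline int scale(int x) { return 2 * x; }
#if !PLAIN_CXX
__device__ inline int scale(int x) { return 2 * x; }
#endif

// Whichever scale() is selected, the call has the same parameter and return
// types, so the caller's AST has the same shape on both sides.
__host__ __device__ inline int twice_plus_one(int x) { return scale(x) + 1; }
```

In both options the argument conversions and result types do not depend on the attributes, which is what would make attribute-blind resolution safe.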

However, for normal CUDA/HIP code, I don't think we can avoid host/device-based overload resolution.

https://github.com/llvm/llvm-project/pull/69366

