[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

Yaxun Liu via cfe-commits cfe-commits at lists.llvm.org
Mon Jan 29 09:51:54 PST 2024


yxsamliu wrote:

> Reverted. I don't think there's a "proper" solution here since this seems to have leaked into the headers due to whoever set this up initially not properly setting these on the host. That seems to be endemic now, so the best we can do is just set it to some dummy values I think.

Ideally, we should not emit them for host compilation, and should ask users to guard device code that depends on these macros with checks that the macros are defined and, if necessary, to replace device function definitions with declarations. However, this is a breaking change, and we have to communicate it to users before we make it.
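A minimal sketch of what such guarding might look like, assuming `__AMDGCN_WAVEFRONT_SIZE` is one of the macros in question. `DEVICE_FN`, `lanes_per_wave`, `kWaveMacroVisible`, and `blocks_needed` are hypothetical names used only for illustration, not part of HIP or clang:

```cpp
// Hypothetical sketch: __AMDGCN_WAVEFRONT_SIZE stands in for "these macros".
#if defined(__HIP__)
#define DEVICE_FN __device__   // HIP compilation
#else
#define DEVICE_FN              // plain C++ build, so the sketch compiles anywhere
#endif

#if defined(__AMDGCN_WAVEFRONT_SIZE)
// The macro is visible in this pass: the full definition can be used.
DEVICE_FN inline int lanes_per_wave() { return __AMDGCN_WAVEFRONT_SIZE; }
constexpr bool kWaveMacroVisible = true;
#else
// The macro is not emitted in this pass: keep only a declaration so that
// code referring to the function still parses.
DEVICE_FN int lanes_per_wave();
constexpr bool kWaveMacroVisible = false;
#endif

// Host-side logic takes the wave size as a runtime value instead of
// reading the macro, so it does not depend on a compile-time default.
inline int blocks_needed(int n, int wave_size) {
    return (n + wave_size - 1) / wave_size;  // ceiling division
}
```

In a real HIP program the runtime value could come from the `warpSize` field of `hipDeviceProp_t` rather than from the macro.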

On the other hand, most HIP apps do not care about the values of these macros in host compilation, as long as the macros exist so that the device code compiles in host compilation. In most cases, the value chosen does not affect the generated host code. In rare cases, it may cause an ODR violation, e.g. when the layout of a struct-type kernel argument is defined using these macros.
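To illustrate the ODR risk, here is a rough sketch in plain C++ that simulates the two compilation passes with a template parameter instead of the real macro. `KernelArgs`, `HostView`, `DeviceView`, and `layouts_agree` are hypothetical names, and the values 64 (host default) and 32 (a wave32 device target) are assumptions chosen for the example:

```cpp
#include <cstddef>

// Stand-in for a struct whose layout would be written in user code as
//   int per_lane[__AMDGCN_WAVEFRONT_SIZE];
// WaveSize plays the role of the macro value seen by each pass.
template <int WaveSize>
struct KernelArgs {
    int per_lane[WaveSize];
    float scale;
};

using HostView = KernelArgs<64>;    // assumed default value in the host pass
using DeviceView = KernelArgs<32>;  // assumed value for a wave32 device pass

// True only if both passes would see the same size and field offset.
constexpr bool layouts_agree() {
    return sizeof(HostView) == sizeof(DeviceView) &&
           offsetof(HostView, scale) == offsetof(DeviceView, scale);
}

// The host would pack the argument with one layout while the kernel
// reads it with another.
static_assert(!layouts_agree(), "host and device layouts differ");
```

With the real macro, both passes would spell the type with the same name, which is what makes the mismatch an ODR violation rather than two distinct types.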

I think for now we should keep the original behavior for HIP host compilation, and document in https://clang.llvm.org/docs/HIPSupport.html that these macros take default values in host compilation, which may differ from those in device compilation, and that there is a risk of ODR violations if they are used in host compilation. At the same time, I will open some internal tickets to try to eliminate their usage in host compilation in ROCm.

https://github.com/llvm/llvm-project/pull/79660

