[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)
Joseph Huber via cfe-commits
cfe-commits at lists.llvm.org
Mon Jan 29 09:00:03 PST 2024
jhuber6 wrote:
This seems to have perturbed the HIP build. https://lab.llvm.org/staging/#/builders/22/builds/22
The problem is that we used to set `__AMDGCN_WAVEFRONTSIZE` for the host compilation as well, and a bunch of the wave function macros in the headers rely on it. I think that this is just poor programming, because the host compilation has no way of knowing what the wave size is, considering that `--offload-arch=gfx1030,gfx90a` is totally legal and would result in two conflicting wavefront sizes from the perspective of the host. The old behavior would just default this to `64` all the time.
We have two solutions: fix the headers so they don't rely on a device-only macro, or just update this to emit a dummy macro on the host.
I like the first better, but given that we've shipped this with ROCm in the past it's likely to require a workaround. @yxsamliu What do you think?
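For reference, a minimal sketch of what the first option could look like in a header. This is not the actual HIP header code; `kWaveSizeKnown`, `kCompileTimeWaveSize`, and `wave_size_or` are hypothetical names made up for illustration. Only `__AMDGCN__` and `__AMDGCN_WAVEFRONTSIZE` are real compiler-defined macros.

```cpp
// Hypothetical header sketch: only consult __AMDGCN_WAVEFRONTSIZE when
// compiling for an AMDGCN device target, where it is actually meaningful.
#if defined(__AMDGCN__) && defined(__AMDGCN_WAVEFRONTSIZE)
// Device compilation: the target is known, so the macro can be trusted.
constexpr bool kWaveSizeKnown = true;
constexpr int kCompileTimeWaveSize = __AMDGCN_WAVEFRONTSIZE;
#else
// Host compilation: there may be several offload architectures with
// different wave sizes, so no compile-time value is exposed at all.
constexpr bool kWaveSizeKnown = false;
constexpr int kCompileTimeWaveSize = 0;
#endif

// Returns the compile-time wave size on the device. On the host it returns
// whatever value the caller supplies (assumed to be obtained some other
// way, e.g. queried from the device at runtime).
constexpr int wave_size_or(int host_value) {
  return kWaveSizeKnown ? kCompileTimeWaveSize : host_value;
}
```

With something along these lines, host code has to say explicitly where its wave size comes from instead of silently picking up `64`.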
https://github.com/llvm/llvm-project/pull/79660