[clang] [AMDGPU] Allow w64 ballot to be used on w32 targets (PR #80183)
Joseph Huber via cfe-commits
cfe-commits at lists.llvm.org
Wed Jan 31 11:44:12 PST 2024
================
@@ -151,7 +151,7 @@ BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc")
//===----------------------------------------------------------------------===//
TARGET_BUILTIN(__builtin_amdgcn_ballot_w32, "ZUib", "nc", "wavefrontsize32")
-TARGET_BUILTIN(__builtin_amdgcn_ballot_w64, "WUib", "nc", "wavefrontsize64")
+BUILTIN(__builtin_amdgcn_ballot_w64, "WUib", "nc")
----------------
jhuber6 wrote:
The only difference is the return type of the intrinsic. Type promotion is always legal, so on a target with a 32-wide wavefront the result just gets promoted to a 64-bit value, which is still correct.
The ROCm-Device-Libs can only do this because they have the `__oclc_wavefrontsize64` variable, which I don't want to copy.
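
As a rough illustration of the promotion argument above, here is a host-side sketch that models a 32-lane ballot and its widening to 64 bits. It does not call the real builtins: `Wave32Pred`, `model_ballot_w32`, `model_ballot_w64_on_wave32`, and `upper_half_is_zero` are hypothetical names invented for this example, and it assumes that "promotion" means zero-extension of the 32-bit mask.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the per-lane predicate of a 32-lane wavefront.
using Wave32Pred = std::array<bool, 32>;

// Hypothetical model of a 32-bit ballot: bit i of the result is set when
// lane i's predicate is true.
uint32_t model_ballot_w32(const Wave32Pred &pred) {
  uint32_t mask = 0;
  for (unsigned lane = 0; lane < 32; ++lane)
    if (pred[lane])
      mask |= uint32_t{1} << lane;
  return mask;
}

// Hypothetical model of a 64-bit ballot on a 32-lane wavefront. Assumption:
// the 32-bit mask is zero-extended, so bits 32..63 (lanes that do not exist)
// are always clear.
uint64_t model_ballot_w64_on_wave32(const Wave32Pred &pred) {
  return static_cast<uint64_t>(model_ballot_w32(pred));
}

// Checks the property the argument relies on: widening adds no set bits.
bool upper_half_is_zero(const Wave32Pred &pred) {
  return (model_ballot_w64_on_wave32(pred) >> 32) == 0;
}
```

Under that assumption, code written against the 64-bit mask sees the same lane bits on either wavefront size, which is why a single 64-bit entry point can be enough for callers that do not want to branch on the wavefront size.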
https://github.com/llvm/llvm-project/pull/80183