[llvm] [AMDGPU] Support bfloat comparison for ballot intrinsic (PR #165495)
Changpeng Fang via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 29 10:29:59 PDT 2025
================
@@ -7035,9 +7035,15 @@ static SDValue lowerBALLOTIntrinsic(const SITargetLowering &TLI, SDNode *N,
SDLoc SL(N);
if (Src.getOpcode() == ISD::SETCC) {
+ SDValue Op0 = Src.getOperand(0);
+ SDValue Op1 = Src.getOperand(1);
+ // Need to expand bfloat to float for comparison (setcc).
----------------
changpeng wrote:
Yes, legalization of ISD::SETCC correctly promotes bf16 to f32. But apparently the ballot intrinsic is lowered to AMDGPUISD::SETCC here, which does not go through that legalization, so we have to promote bf16 to f32 ourselves. What is the "generic machinery to promote bf16"?
Look at "lowerFCMPIntrinsic" above this function: a similar approach was used there to promote f16 to f32 when f16 is not legal.
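
For readers unfamiliar with why the promotion is safe, here is a minimal standalone sketch (plain C++, not LLVM code and not the code in this PR). It relies on bf16 having the same bit layout as the upper 16 bits of an IEEE-754 binary32 value, so widening to f32 is exact and a comparison done in f32 gives the same answer as a native bf16 comparison would. The helper names bf16BitsToFloat and bf16OrderedLess are hypothetical and exist only for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an LLVM API): widen a raw bf16 bit pattern to
// float. bf16 is the upper half of a binary32 value, so shifting the bits
// left by 16 and reinterpreting them yields the exact same number.
inline float bf16BitsToFloat(uint16_t Bits) {
  uint32_t Wide = static_cast<uint32_t>(Bits) << 16;
  float F;
  std::memcpy(&F, &Wide, sizeof(F)); // bit-cast without undefined behavior
  return F;
}

// Hypothetical helper: ordered less-than on two bf16 values, performed in
// f32. This mirrors the idea of extending both setcc operands before
// comparing them; a NaN operand makes the ordered comparison false.
inline bool bf16OrderedLess(uint16_t A, uint16_t B) {
  return bf16BitsToFloat(A) < bf16BitsToFloat(B);
}
```

For example, 0x3F80 and 0x4000 are the bf16 encodings of 1.0 and 2.0, so bf16OrderedLess(0x3F80, 0x4000) holds, while any comparison against the quiet NaN pattern 0x7FC0 does not.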
https://github.com/llvm/llvm-project/pull/165495