[llvm] [AMDGPU] Support bfloat comparison for ballot intrinsic (PR #165495)

Changpeng Fang via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 29 10:29:59 PDT 2025


================
@@ -7035,9 +7035,15 @@ static SDValue lowerBALLOTIntrinsic(const SITargetLowering &TLI, SDNode *N,
   SDLoc SL(N);
 
   if (Src.getOpcode() == ISD::SETCC) {
+    SDValue Op0 = Src.getOperand(0);
+    SDValue Op1 = Src.getOperand(1);
+    // Need to expand bfloat to float for comparison (setcc).
----------------
changpeng wrote:

Yes, legalization of ISD::SETCC correctly promotes bf16 to f32. But apparently the ballot intrinsic is lowered directly to AMDGPUISD::SETCC here, so we have to promote bf16 to f32 ourselves. Which "generic machinery to promote bf16" do you mean?

Look at lowerFCMPIntrinsic above this function: a similar approach is used there to promote f16 to f32 when f16 is not legal.
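
To illustrate why promoting to f32 before the compare is safe, here is a minimal standalone sketch. It is not the code in this patch and does not use any LLVM API. It assumes only that bf16 has the same bit layout as the upper 16 bits of an IEEE-754 binary32 value, so widening is exact and an f32 compare gives the same answer as a bf16 compare would. The names bf16ToFloat and ballotOLT are hypothetical, and the ballot is modeled as a plain loop over at most 64 lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Widen a raw bf16 bit pattern to float. bf16 occupies the top 16 bits of
// a binary32 value, so shifting left by 16 reproduces the value exactly.
inline float bf16ToFloat(std::uint16_t bits) {
  std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
  float f;
  std::memcpy(&f, &wide, sizeof(f));
  return f;
}

// Hypothetical model of ballot(a < b): bit i of the result is set when the
// ordered less-than compare is true in lane i. Operands are promoted to
// float before comparing, mirroring the promote-then-compare idea above.
inline std::uint64_t ballotOLT(const std::vector<std::uint16_t> &a,
                               const std::vector<std::uint16_t> &b) {
  assert(a.size() == b.size() && a.size() <= 64);
  std::uint64_t mask = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (bf16ToFloat(a[i]) < bf16ToFloat(b[i]))
      mask |= std::uint64_t{1} << i;
  return mask;
}
```

With bf16 inputs {1.0, 2.0, -1.0, NaN} compared against {2.0, 1.0, 1.0, 1.0}, only lanes 0 and 2 satisfy the ordered less-than, so the mask has bits 0 and 2 set. The NaN lane stays clear because an ordered compare with NaN is false.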

https://github.com/llvm/llvm-project/pull/165495

