[llvm] [AMDGPU][True16][CodeGen] FP_Round f64 to f16 in true16 (PR #128911)

Sun Apr 27 22:05:38 PDT 2025

================
@@ -6899,7 +6899,17 @@ SDValue SITargetLowering::lowerFP_ROUND(SDValue Op, SelectionDAG &DAG) const {
     if (Op.getOpcode() != ISD::FP_ROUND)
       return Op;
 
-    SDValue FpToFp16 = DAG.getNode(ISD::FP_TO_FP16, DL, MVT::i32, Src);
+    if (!Subtarget->has16BitInsts()) {
+      SDValue FpToFp16 = DAG.getNode(ISD::FP_TO_FP16, DL, MVT::i32, Src);
+      SDValue Trunc = DAG.getNode(ISD::TRUNCATE, DL, MVT::i16, FpToFp16);
+      return DAG.getNode(ISD::BITCAST, DL, MVT::f16, Trunc);
+    }
----------------
broxigarchen wrote:

Before this patch, FP_TO_FP16 was used on all targets for FP_ROUND with an f64 source, so I think it is supposed to do the right thing. However, I took a quick look at the lit tests, and we don't seem to have any that cover FP_ROUND with an f64 source on targets that do not support f16 types.

So I would say that using FP_TO_FP16 here at least keeps the same lowering path for targets that do not support f16 types.
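
For reference, here is a rough standalone model of what the FP_TO_FP16 -> TRUNCATE -> BITCAST sequence in the hunk above is expected to compute for an f64 source. It is only a sketch: the function names `fpToFp16Bits` and `roundF64ToF16Bits` are made up for illustration and are not LLVM APIs, the conversion assumes round-to-nearest-even, and the real lowering is done by the backend rather than by code like this.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical software stand-in for ISD::FP_TO_FP16 with an f64 source.
// Returns the IEEE half bit pattern in the low 16 bits of a 32-bit integer,
// rounding once (round-to-nearest-even is an assumption of this sketch).
uint32_t fpToFp16Bits(double src) {
  uint64_t bits;
  std::memcpy(&bits, &src, sizeof(bits));
  const uint32_t sign = static_cast<uint32_t>(bits >> 48) & 0x8000u;
  const int32_t exp = static_cast<int32_t>((bits >> 52) & 0x7FF);
  uint64_t mant = bits & ((1ull << 52) - 1);

  if (exp == 0x7FF) // Inf or NaN (NaN is quieted)
    return sign | 0x7C00u | (mant ? 0x200u : 0u);
  if (exp == 0) // zero or f64 denormal: far below the half range
    return sign;

  int32_t e = exp - 1023 + 15; // rebias the exponent for half
  if (e >= 31)                 // too large for half: overflow to infinity
    return sign | 0x7C00u;

  mant |= 1ull << 52; // make the implicit leading one explicit
  int shift = 42;     // 52 mantissa bits down to 10
  if (e <= 0) {       // result is a half subnormal
    shift += 1 - e;
    e = 1;
    if (shift > 54) // below half of the smallest subnormal
      return sign;
  }

  uint64_t result = mant >> shift;
  const uint64_t rem = mant & ((1ull << shift) - 1);
  const uint64_t halfway = 1ull << (shift - 1);
  if (rem > halfway || (rem == halfway && (result & 1)))
    ++result; // round to nearest, ties to even

  // 'result' still carries the leading one for normal values, so adding it
  // to (e - 1) << 10 also handles a rounding carry into the exponent.
  return sign |
         static_cast<uint32_t>((static_cast<uint64_t>(e - 1) << 10) + result);
}

// Mirrors the node sequence in the hunk for targets without 16-bit insts.
uint16_t roundF64ToF16Bits(double src) {
  const uint32_t fpToFp16 = fpToFp16Bits(src);            // FP_TO_FP16 -> i32
  const uint16_t trunc = static_cast<uint16_t>(fpToFp16); // TRUNCATE -> i16
  return trunc; // BITCAST to f16 only reinterprets these 16 bits
}
```

With this model, `roundF64ToF16Bits(1.0)` yields the half bit pattern `0x3C00`. A lit test for a target without f16 support would need to check the equivalent behavior on the real backend.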

https://github.com/llvm/llvm-project/pull/128911