[llvm] AMDGPU][True16][CodeGen] FP_Round f64 to f16 in true16 (PR #128911)
Brox Chen via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 24 17:14:19 PDT 2025
================
@@ -6899,7 +6899,17 @@ SDValue SITargetLowering::lowerFP_ROUND(SDValue Op, SelectionDAG &DAG) const {
if (Op.getOpcode() != ISD::FP_ROUND)
return Op;
- SDValue FpToFp16 = DAG.getNode(ISD::FP_TO_FP16, DL, MVT::i32, Src);
+ if (!Subtarget->has16BitInsts()) {
+ SDValue FpToFp16 = DAG.getNode(ISD::FP_TO_FP16, DL, MVT::i32, Src);
+ SDValue Trunc = DAG.getNode(ISD::TRUNCATE, DL, MVT::i16, FpToFp16);
+ return DAG.getNode(ISD::BITCAST, DL, MVT::f16, Trunc);
+ }
+ if (getTargetMachine().Options.UnsafeFPMath) {
+ SDValue Flags = Op.getOperand(1);
+ SDValue Src32 = DAG.getNode(ISD::FP_ROUND, DL, MVT::f32, Src, Flags);
+ return DAG.getNode(ISD::FP_ROUND, DL, MVT::f16, Src32, Flags);
+ }
----------------
broxigarchen wrote:
The LowerF64ToF16 should just be the safe path. Probably should change the function name for better clarification
https://github.com/llvm/llvm-project/pull/128911
More information about the llvm-commits
mailing list