[llvm] AMDGPU: Make v2f64 -> v2f16 conversion Legal only when unsafe fast math is set (PR #134738)
Changpeng Fang via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 17 11:05:31 PDT 2025
================
@@ -2741,6 +2743,29 @@ bool AMDGPULegalizerInfo::legalizeMinNumMaxNum(LegalizerHelper &Helper,
return Helper.lowerFMinNumMaxNum(MI) == LegalizerHelper::Legalized;
}
+bool AMDGPULegalizerInfo::legalizeFPTrunc(LegalizerHelper &Helper,
+ MachineInstr &MI,
+ MachineRegisterInfo &MRI) const {
+ // TODO: We should only use fast math flag. But the global option is
+ // still used here to be consistent, especially when the fast math flag is
+ // not working for FP_ROUND on the SelectDAG path at this moment.
+ MachineFunction &MF = Helper.MIRBuilder.getMF();
+ bool AllowInaccurateFPTRUNC = MI.getFlag(MachineInstr::FmAfn) ||
+ MF.getTarget().Options.UnsafeFPMath;
+
+ if (AllowInaccurateFPTRUNC) {
+ // Use the tablegen pattern to select native instructions.
+ return true;
+ }
+
+ Register DstReg = MI.getOperand(0).getReg();
+ LLT DstTy = MRI.getType(DstReg);
+
+ // Scalarize the vector and fall through to lower f64 -> f16.
+ return Helper.fewerElementsVector(MI, 0, DstTy.getElementType()) ==
+ LegalizerHelper::Legalized;
----------------
changpeng wrote:
> This seems like it should be a combine and not a modification of legalization rules
I am looking at using instcombine to generate v_cvt_pk_f16_f32 instructions. Should we implement it in AMDGPU, since it is target dependent? Thanks.
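
For context on why the hunk above gates the native path on afn / UnsafeFPMath: presumably the native selection narrows f64 to f16 through an f32 intermediate, and rounding twice can give a different result than a single correctly rounded f64 -> f16 conversion. The sketch below is only an illustration of that double-rounding effect on the host, not code from the PR. The helper names (roundToSignificand, truncDirect, truncViaF32) are hypothetical, and the model ignores exponent range, subnormals, infinities, and NaNs.

```cpp
#include <cmath>

// Hypothetical helper: round x to a binary significand of `bits` bits using
// the current rounding mode (round-to-nearest-even by default).
// Illustration only: exponent range and special values are not modeled.
double roundToSignificand(double x, int bits) {
  if (x == 0.0)
    return x;
  int Exp;
  double Frac = std::frexp(x, &Exp);       // x == Frac * 2^Exp, 0.5 <= |Frac| < 1
  double Scaled = std::ldexp(Frac, bits);  // exact scaling by a power of two
  return std::ldexp(std::nearbyint(Scaled), Exp - bits);
}

// f16 has an 11-bit significand (including the implicit bit), f32 has 24.
double truncDirect(double x) { return roundToSignificand(x, 11); }

double truncViaF32(double x) {
  return roundToSignificand(roundToSignificand(x, 24), 11);
}
```

For x = 1 + 2^-11 + 2^-30, truncDirect rounds up to 1 + 2^-10 because x lies just above the midpoint between two f16 values. truncViaF32 first drops the 2^-30 term, which leaves an exact tie that then rounds to even, giving 1.0.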
https://github.com/llvm/llvm-project/pull/134738