[llvm] AMDGPU: Make v2f64 -> v2f16 conversion Legal only when unsafe fast math is set (PR #134738)

Thu Apr 17 10:58:28 PDT 2025

================
@@ -2741,6 +2743,29 @@ bool AMDGPULegalizerInfo::legalizeMinNumMaxNum(LegalizerHelper &Helper,
   return Helper.lowerFMinNumMaxNum(MI) == LegalizerHelper::Legalized;
 }
 
+bool AMDGPULegalizerInfo::legalizeFPTrunc(LegalizerHelper &Helper,
+                                          MachineInstr &MI,
+                                          MachineRegisterInfo &MRI) const {
+  // TODO: We should only use fast math flag. But the global option is
+  // still used here to be consistent, especially when the fast math flag is
+  // not working for FP_ROUND on the SelectDAG path at this moment.
+  MachineFunction &MF = Helper.MIRBuilder.getMF();
+  bool AllowInaccurateFPTRUNC = MI.getFlag(MachineInstr::FmAfn) ||
+                                MF.getTarget().Options.UnsafeFPMath;
+
+  if (AllowInaccurateFPTRUNC) {
+    // Use the tablegen pattern to select native instructions.
+    return true;
+  }
+
----------------
changpeng wrote:

> This handling logic doesn't make sense. The decision to scalarize or not has nothing to do with the FP properties? The tablegen pattern also handles the build_vector case, which is the MIR I would expect to select from. Can we just drop the direct vector handling, and always scalarize?
> 
> The tablegen patterns should also be guarded with the fast checks if we're going to have them.

The logic selects native instructions through the tablegen pattern, but that is only correct when unsafe FP math is set.
Yes, we could always scalarize. Should we then use instcombine to generate v_cvt_pk_f16_f32?  
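
To illustrate why the native path needs the unsafe/afn guard, here is a minimal standalone sketch of the double-rounding problem, assuming the native selection converts f64 -> f32 -> f16 in two steps. The helpers `roundToPrecision`, `truncDirect` and `truncViaF32` are hypothetical names for this illustration, not LLVM APIs, and the model ignores exponent range (overflow and denormals).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: round x to `precision` significand bits (including the
// implicit leading bit) using round-to-nearest-even. Exponent range is
// ignored, so this only models values that are normal in the target format.
double roundToPrecision(double x, int precision) {
  assert(precision >= 1 && precision <= 53);
  if (x == 0.0 || !std::isfinite(x))
    return x;
  int exp;
  double frac = std::frexp(x, &exp);           // x = frac * 2^exp, 0.5 <= |frac| < 1
  double scaled = std::ldexp(frac, precision); // exact: only the exponent changes
  double rounded = std::nearbyint(scaled);     // ties-to-even in the default rounding mode
  return std::ldexp(rounded, exp - precision);
}

// Correctly rounded f64 -> f16 (f16 has 11 significand bits).
double truncDirect(double x) { return roundToPrecision(x, 11); }

// Two-step f64 -> f32 -> f16 (f32 has 24 significand bits). This is the
// assumed shape of the native instruction sequence.
double truncViaF32(double x) {
  return roundToPrecision(roundToPrecision(x, 24), 11);
}
```

For x = 1 + 2^-11 + 2^-30, `truncDirect(x)` gives 1.0009765625 (x lies just above the midpoint between two f16 values), while `truncViaF32(x)` gives 1.0: the f32 step drops the 2^-30 bit, leaving an exact tie that then rounds to even. That one-ulp difference is why the two-step sequence is only acceptable under unsafe FP math or afn.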

https://github.com/llvm/llvm-project/pull/134738