[llvm] [RISCV][TTI] Support cost of f16 FCmp using zvfhmin in the absence of… (PR #89166)

Fri Jan 3 23:20:19 PST 2025

================
@@ -1398,12 +1398,28 @@ InstructionCost RISCVTTIImpl::getCmpSelInstrCost(unsigned Opcode, Type *ValTy,
     // one which will calculate as:
     // ScalarizeCost + Num * Cost for fixed vector,
     // InvalidCost for scalable vector.
-    if ((ValTy->getScalarSizeInBits() == 16 && !ST->hasVInstructionsF16()) ||
+    if ((ValTy->getScalarSizeInBits() == 16 &&
+         !ST->hasVInstructionsF16Minimal()) ||
         (ValTy->getScalarSizeInBits() == 32 && !ST->hasVInstructionsF32()) ||
         (ValTy->getScalarSizeInBits() == 64 && !ST->hasVInstructionsF64()))
       return BaseT::getCmpSelInstrCost(Opcode, ValTy, CondTy, VecPred, CostKind,
                                        I);
 
+    if ((ValTy->getScalarSizeInBits() == 16) && !ST->hasVInstructionsF16()) {
+      // pre-widening Op1 and Op2 to f32 before comparison
+      VectorType *VecF32Ty =
+          VectorType::get(Type::getFloatTy(ValTy->getContext()),
+                          cast<VectorType>(ValTy)->getElementCount());
+      std::pair<InstructionCost, MVT> VecF32LT =
+          getTypeLegalizationCost(VecF32Ty);
+      InstructionCost WidenCost =
+          2 * getRISCVInstructionCost(RISCV::VFWCVT_F_F_V, VecF32LT.second,
+                                      CostKind);
+      InstructionCost CmpCost =
+          getCmpSelInstrCost(Opcode, VecF32Ty, CondTy, VecPred, CostKind, I);
+      return VecF32LT.first * WidenCost + CmpCost;
----------------
arcbbb wrote:

I found an issue with handling `vscale x 32 x f16`: its vtype doesn't work for `vfwcvt`. To fix this, I switched to `getCastInstrCost` to calculate the split and widen costs.
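
For readers following along, here is a rough standalone sketch of why `vscale x 32 x f16` is presumably the problematic case. It is a toy model, not LLVM code: the helper names `lmulTimes8` and `numSplits` are hypothetical, and it assumes a 64-bit RVV block per `vscale` unit and a maximum LMUL of 8. Under those assumptions, `vscale x 32 x f16` already occupies LMUL=8, so the widened `vscale x 32 x f32` would need LMUL=16 and has to be split in two before each half can be widened.

```cpp
#include <cassert>

// Toy model only; the names below are illustrative and are not LLVM APIs.
// Assumption: one vscale unit corresponds to a 64-bit block.
constexpr unsigned kRVVBitsPerBlock = 64;
// Assumption: the largest register group is LMUL=8 (stored as LMUL * 8 so
// that fractional LMULs such as 1/4 stay integral).
constexpr unsigned kMaxLMULx8 = 8 * 8;

// LMUL * 8 required by <vscale x MinElts x iEltBits>.
constexpr unsigned lmulTimes8(unsigned MinElts, unsigned EltBits) {
  return MinElts * EltBits * 8 / kRVVBitsPerBlock;
}

// Number of LMUL<=8 pieces the type must be split into to become legal.
constexpr unsigned numSplits(unsigned MinElts, unsigned EltBits) {
  return lmulTimes8(MinElts, EltBits) <= kMaxLMULx8
             ? 1
             : (lmulTimes8(MinElts, EltBits) + kMaxLMULx8 - 1) / kMaxLMULx8;
}

// vscale x 32 x f16 fits exactly in LMUL=8 ...
static_assert(lmulTimes8(32, 16) == 64, "nxv32f16 is LMUL=8");
// ... but its f32 counterpart needs LMUL=16, i.e. two LMUL=8 halves.
static_assert(numSplits(32, 32) == 2, "nxv32f32 must be split in two");
// vscale x 16 x f16 widens to LMUL=8 and needs no split.
static_assert(numSplits(16, 32) == 1, "nxv16f32 is legal as-is");
```

In this model a single widening convert per operand only covers the cases where `numSplits` is 1; the `vscale x 32 x f16` case additionally pays for the split, which is the part the comment says is now covered by `getCastInstrCost`.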

https://github.com/llvm/llvm-project/pull/89166