[llvm] [LLVM][CodeGen][AArch64] Don't scalarise v8{f16,bf16} vsetcc operations. (PR #135398)

Tue Apr 15 08:31:15 PDT 2025

================
@@ -4236,9 +4236,11 @@ InstructionCost AArch64TTIImpl::getCmpSelInstrCost(
 
   if (isa<FixedVectorType>(ValTy) && ISD == ISD::SETCC) {
     auto LT = getTypeLegalizationCost(ValTy);
-    // Cost v4f16 FCmp without FP16 support via converting to v4f32 and back.
+    // Cost v#f16 FCmp without FP16 support via converting to v#f32 and back.
     if (LT.second == MVT::v4f16 && !ST->hasFullFP16())
       return LT.first * 4; // fcvtl + fcvtl + fcmp + xtn
+    if (LT.second == MVT::v8f16 && !ST->hasFullFP16())
+      return LT.first * 8; // 2*(fcvtl + fcvtl2 + fcmp) + uzp1 + xtn
----------------
paulwalker-arm wrote:

I've refactored the costing under https://github.com/llvm/llvm-project/pull/135795. That PR contains a FIXME for the path that is currently scalarised, which this PR will remove.
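
For readers following the hunk above, the arithmetic behind the two cost values can be sketched as a standalone model. This is only an illustration: `LegalVT`, `fcmpCostWithoutFullFP16`, and `NumLegalParts` are hypothetical names standing in for `LT.second`, the costing branch, and `LT.first`. None of it is LLVM API, and it does not reflect the refactored costing in the other PR.

```cpp
// Hypothetical stand-in for the legalised vector type (LT.second in the hunk).
enum class LegalVT { v4f16, v8f16, Other };

// Models the fcmp cost on a target without full FP16 support.
// NumLegalParts plays the role of LT.first (number of legal-width pieces).
// Returns -1 for types this sketch does not cover.
int fcmpCostWithoutFullFP16(LegalVT VT, int NumLegalParts) {
  switch (VT) {
  case LegalVT::v4f16:
    // fcvtl + fcvtl + fcmp + xtn = 4 instructions
    return NumLegalParts * 4;
  case LegalVT::v8f16:
    // 2*(fcvtl + fcvtl2 + fcmp) + uzp1 + xtn = 8 instructions
    return NumLegalParts * 8;
  default:
    return -1;
  }
}
```

Under this model, a v16f16 compare that legalises to two v8f16 pieces would cost 2 * 8 = 16.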

https://github.com/llvm/llvm-project/pull/135398