[llvm] [AArch64][CostModel] Increase the cost of illegal SVE int-to-fp converts (PR #130756)

Graham Hunter via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 11 04:39:15 PDT 2025


https://github.com/huntergr-arm created https://github.com/llvm/llvm-project/pull/130756

If a scalable vector uitofp or sitofp effectively extends the size of each element as part of the conversion, and the resulting type is too wide to be legal, the AArch64 backend needs to emit multiple unpacks before converting.
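
As a rough illustration of the instruction counts quoted in the patch comment (2 unpacks and 2 fcvts for one widening step, 6 unpacks and 4 fcvts for two), here is a minimal sketch. It assumes each doubling of the element size splits every live register into a low and a high half. `WidenCounts` and `countWidenOps` are hypothetical names, not part of the patch, and the table costs (12 and 22) are the values chosen in the patch rather than something this helper derives.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the patch: instruction counts for widening
// one full scalable vector by `Steps` doublings of the element size.
struct WidenCounts {
  std::uint64_t Unpacks;
  std::uint64_t Converts;
};

constexpr WidenCounts countWidenOps(unsigned Steps) {
  // Assumption: level i of the split tree needs 2^i unpacks (a lo and a hi
  // unpack per live register), so levels 1..Steps sum to 2^(Steps+1) - 2.
  // One convert is then needed for each of the 2^Steps final registers.
  return {(std::uint64_t{1} << (Steps + 1)) - 2, std::uint64_t{1} << Steps};
}

// These match the counts stated in the patch comment.
static_assert(countWidenOps(1).Unpacks == 2 && countWidenOps(1).Converts == 2,
              "one step, e.g. i32 -> f64");
static_assert(countWidenOps(2).Unpacks == 6 && countWidenOps(2).Converts == 4,
              "two steps, e.g. i16 -> f64");
```

For example, `countWidenOps(2)` corresponds to the i16 -> f64 case: two unpacks at the first level, four at the second, then four converts.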

From 8f23b433f0e32d2133c52b1fae78dbacfa070672 Mon Sep 17 00:00:00 2001
From: Graham Hunter <graham.hunter at arm.com>
Date: Wed, 5 Mar 2025 14:34:54 +0000
Subject: [PATCH] [AArch64][CostModel] Increase the cost of illegal SVE
 int-to-fp converts

If a scalable vector uitofp or sitofp effectively extends the size of each
element as part of the conversion, the AArch64 backend will need to emit
multiple unpacks before converting.
---
 .../AArch64/AArch64TargetTransformInfo.cpp    |  15 +
 .../Analysis/CostModel/AArch64/sve-itofp.ll   | 268 ++++++++++++++++++
 2 files changed, 283 insertions(+)
 create mode 100644 llvm/test/Analysis/CostModel/AArch64/sve-itofp.ll

diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 7cec8a17dfaaa..8091fb8f990bf 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -3144,6 +3144,21 @@ InstructionCost AArch64TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
       {ISD::SIGN_EXTEND, MVT::nxv8i32, MVT::nxv8i16, 2},
       {ISD::SIGN_EXTEND, MVT::nxv8i64, MVT::nxv8i16, 6},
       {ISD::SIGN_EXTEND, MVT::nxv4i64, MVT::nxv4i32, 2},
+
+      // Add cost for extending and converting to illegal (too wide) scalable
+      // vectors: extending one size (e.g. i32 -> f64) takes 2 unpacks and 2
+      // fcvts; extending twice (e.g. i16 -> f64) takes 6 unpacks and 4 fcvts.
+      {ISD::SINT_TO_FP, MVT::nxv16f16, MVT::nxv16i8, 12},
+      {ISD::SINT_TO_FP, MVT::nxv16f32, MVT::nxv16i8, 22},
+      {ISD::SINT_TO_FP, MVT::nxv8f32, MVT::nxv8i16, 12},
+      {ISD::SINT_TO_FP, MVT::nxv8f64, MVT::nxv8i16, 22},
+      {ISD::SINT_TO_FP, MVT::nxv4f64, MVT::nxv4i32, 12},
+
+      {ISD::UINT_TO_FP, MVT::nxv16f16, MVT::nxv16i8, 12},
+      {ISD::UINT_TO_FP, MVT::nxv16f32, MVT::nxv16i8, 22},
+      {ISD::UINT_TO_FP, MVT::nxv8f32, MVT::nxv8i16, 12},
+      {ISD::UINT_TO_FP, MVT::nxv8f64, MVT::nxv8i16, 22},
+      {ISD::UINT_TO_FP, MVT::nxv4f64, MVT::nxv4i32, 12},
   };
 
   // We have to estimate a cost of fixed length operation upon
diff --git a/llvm/test/Analysis/CostModel/AArch64/sve-itofp.ll b/llvm/test/Analysis/CostModel/AArch64/sve-itofp.ll
new file mode 100644
index 0000000000000..12fd6411255f2
--- /dev/null
+++ b/llvm/test/Analysis/CostModel/AArch64/sve-itofp.ll
@@ -0,0 +1,268 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple aarch64-linux-gnu -mattr=+sve -o - -S < %s | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-unknown-linux-gnu"
+
+define void @sve-itofp() {
+; CHECK-LABEL: 'sve-itofp'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1si8_to_f16 = sitofp <vscale x 1 x i8> undef to <vscale x 1 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1ui8_to_f16 = uitofp <vscale x 1 x i8> undef to <vscale x 1 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1si16_to_f16 = sitofp <vscale x 1 x i16> undef to <vscale x 1 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1ui16_to_f16 = uitofp <vscale x 1 x i16> undef to <vscale x 1 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1si32_to_f16 = sitofp <vscale x 1 x i32> undef to <vscale x 1 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1ui32_to_f16 = uitofp <vscale x 1 x i32> undef to <vscale x 1 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1si64_to_f16 = sitofp <vscale x 1 x i64> undef to <vscale x 1 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1ui64_to_f16 = uitofp <vscale x 1 x i64> undef to <vscale x 1 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1si8_to_f32 = sitofp <vscale x 1 x i8> undef to <vscale x 1 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1ui8_to_f32 = uitofp <vscale x 1 x i8> undef to <vscale x 1 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1si16_to_f32 = sitofp <vscale x 1 x i16> undef to <vscale x 1 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1ui16_to_f32 = uitofp <vscale x 1 x i16> undef to <vscale x 1 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1si32_to_f32 = sitofp <vscale x 1 x i32> undef to <vscale x 1 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1ui32_to_f32 = uitofp <vscale x 1 x i32> undef to <vscale x 1 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1si64_to_f32 = sitofp <vscale x 1 x i64> undef to <vscale x 1 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1ui64_to_f32 = uitofp <vscale x 1 x i64> undef to <vscale x 1 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1si8_to_f64 = sitofp <vscale x 1 x i8> undef to <vscale x 1 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1ui8_to_f64 = uitofp <vscale x 1 x i8> undef to <vscale x 1 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1si16_to_f64 = sitofp <vscale x 1 x i16> undef to <vscale x 1 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1ui16_to_f64 = uitofp <vscale x 1 x i16> undef to <vscale x 1 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1si32_to_f64 = sitofp <vscale x 1 x i32> undef to <vscale x 1 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1ui32_to_f64 = uitofp <vscale x 1 x i32> undef to <vscale x 1 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1si64_to_f64 = sitofp <vscale x 1 x i64> undef to <vscale x 1 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv1ui64_to_f64 = uitofp <vscale x 1 x i64> undef to <vscale x 1 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2si8_to_f16 = sitofp <vscale x 2 x i8> undef to <vscale x 2 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2ui8_to_f16 = uitofp <vscale x 2 x i8> undef to <vscale x 2 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2si16_to_f16 = sitofp <vscale x 2 x i16> undef to <vscale x 2 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2ui16_to_f16 = uitofp <vscale x 2 x i16> undef to <vscale x 2 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2si32_to_f16 = sitofp <vscale x 2 x i32> undef to <vscale x 2 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2ui32_to_f16 = uitofp <vscale x 2 x i32> undef to <vscale x 2 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2si64_to_f16 = sitofp <vscale x 2 x i64> undef to <vscale x 2 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2ui64_to_f16 = uitofp <vscale x 2 x i64> undef to <vscale x 2 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2si8_to_f32 = sitofp <vscale x 2 x i8> undef to <vscale x 2 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2ui8_to_f32 = uitofp <vscale x 2 x i8> undef to <vscale x 2 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2si16_to_f32 = sitofp <vscale x 2 x i16> undef to <vscale x 2 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2ui16_to_f32 = uitofp <vscale x 2 x i16> undef to <vscale x 2 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2si32_to_f32 = sitofp <vscale x 2 x i32> undef to <vscale x 2 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2ui32_to_f32 = uitofp <vscale x 2 x i32> undef to <vscale x 2 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2si64_to_f32 = sitofp <vscale x 2 x i64> undef to <vscale x 2 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2ui64_to_f32 = uitofp <vscale x 2 x i64> undef to <vscale x 2 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2si8_to_f64 = sitofp <vscale x 2 x i8> undef to <vscale x 2 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2ui8_to_f64 = uitofp <vscale x 2 x i8> undef to <vscale x 2 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2si16_to_f64 = sitofp <vscale x 2 x i16> undef to <vscale x 2 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2ui16_to_f64 = uitofp <vscale x 2 x i16> undef to <vscale x 2 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2si32_to_f64 = sitofp <vscale x 2 x i32> undef to <vscale x 2 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2ui32_to_f64 = uitofp <vscale x 2 x i32> undef to <vscale x 2 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2si64_to_f64 = sitofp <vscale x 2 x i64> undef to <vscale x 2 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv2ui64_to_f64 = uitofp <vscale x 2 x i64> undef to <vscale x 2 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv4si8_to_f16 = sitofp <vscale x 4 x i8> undef to <vscale x 4 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv4ui8_to_f16 = uitofp <vscale x 4 x i8> undef to <vscale x 4 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv4si16_to_f16 = sitofp <vscale x 4 x i16> undef to <vscale x 4 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv4ui16_to_f16 = uitofp <vscale x 4 x i16> undef to <vscale x 4 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv4si32_to_f16 = sitofp <vscale x 4 x i32> undef to <vscale x 4 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv4ui32_to_f16 = uitofp <vscale x 4 x i32> undef to <vscale x 4 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %nv4si64_to_f16 = sitofp <vscale x 4 x i64> undef to <vscale x 4 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %nv4ui64_to_f16 = uitofp <vscale x 4 x i64> undef to <vscale x 4 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv4si8_to_f32 = sitofp <vscale x 4 x i8> undef to <vscale x 4 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv4ui8_to_f32 = uitofp <vscale x 4 x i8> undef to <vscale x 4 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv4si16_to_f32 = sitofp <vscale x 4 x i16> undef to <vscale x 4 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv4ui16_to_f32 = uitofp <vscale x 4 x i16> undef to <vscale x 4 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv4si32_to_f32 = sitofp <vscale x 4 x i32> undef to <vscale x 4 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv4ui32_to_f32 = uitofp <vscale x 4 x i32> undef to <vscale x 4 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %nv4si64_to_f32 = sitofp <vscale x 4 x i64> undef to <vscale x 4 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %nv4ui64_to_f32 = uitofp <vscale x 4 x i64> undef to <vscale x 4 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %nv4si8_to_f64 = sitofp <vscale x 4 x i8> undef to <vscale x 4 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %nv4ui8_to_f64 = uitofp <vscale x 4 x i8> undef to <vscale x 4 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %nv4si16_to_f64 = sitofp <vscale x 4 x i16> undef to <vscale x 4 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %nv4ui16_to_f64 = uitofp <vscale x 4 x i16> undef to <vscale x 4 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %nv4si32_to_f64 = sitofp <vscale x 4 x i32> undef to <vscale x 4 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %nv4ui32_to_f64 = uitofp <vscale x 4 x i32> undef to <vscale x 4 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nv4si64_to_f64 = sitofp <vscale x 4 x i64> undef to <vscale x 4 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nv4ui64_to_f64 = uitofp <vscale x 4 x i64> undef to <vscale x 4 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv8si8_to_f16 = sitofp <vscale x 8 x i8> undef to <vscale x 8 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv8ui8_to_f16 = uitofp <vscale x 8 x i8> undef to <vscale x 8 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv8si16_to_f16 = sitofp <vscale x 8 x i16> undef to <vscale x 8 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %nv8ui16_to_f16 = uitofp <vscale x 8 x i16> undef to <vscale x 8 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %nv8si32_to_f16 = sitofp <vscale x 8 x i32> undef to <vscale x 8 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %nv8ui32_to_f16 = uitofp <vscale x 8 x i32> undef to <vscale x 8 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %nv8si64_to_f16 = sitofp <vscale x 8 x i64> undef to <vscale x 8 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %nv8ui64_to_f16 = uitofp <vscale x 8 x i64> undef to <vscale x 8 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %nv8si8_to_f32 = sitofp <vscale x 8 x i8> undef to <vscale x 8 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %nv8ui8_to_f32 = uitofp <vscale x 8 x i8> undef to <vscale x 8 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %nv8si16_to_f32 = sitofp <vscale x 8 x i16> undef to <vscale x 8 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %nv8ui16_to_f32 = uitofp <vscale x 8 x i16> undef to <vscale x 8 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nv8si32_to_f32 = sitofp <vscale x 8 x i32> undef to <vscale x 8 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nv8ui32_to_f32 = uitofp <vscale x 8 x i32> undef to <vscale x 8 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %nv8si64_to_f32 = sitofp <vscale x 8 x i64> undef to <vscale x 8 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %nv8ui64_to_f32 = uitofp <vscale x 8 x i64> undef to <vscale x 8 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %nv8si8_to_f64 = sitofp <vscale x 8 x i8> undef to <vscale x 8 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %nv8ui8_to_f64 = uitofp <vscale x 8 x i8> undef to <vscale x 8 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 22 for instruction: %nv8si16_to_f64 = sitofp <vscale x 8 x i16> undef to <vscale x 8 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 22 for instruction: %nv8ui16_to_f64 = uitofp <vscale x 8 x i16> undef to <vscale x 8 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 24 for instruction: %nv8si32_to_f64 = sitofp <vscale x 8 x i32> undef to <vscale x 8 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 24 for instruction: %nv8ui32_to_f64 = uitofp <vscale x 8 x i32> undef to <vscale x 8 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %nv8si64_to_f64 = sitofp <vscale x 8 x i64> undef to <vscale x 8 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %nv8ui64_to_f64 = uitofp <vscale x 8 x i64> undef to <vscale x 8 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %nv16si8_to_f16 = sitofp <vscale x 16 x i8> undef to <vscale x 16 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %nv16ui8_to_f16 = uitofp <vscale x 16 x i8> undef to <vscale x 16 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nv16si16_to_f16 = sitofp <vscale x 16 x i16> undef to <vscale x 16 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nv16ui16_to_f16 = uitofp <vscale x 16 x i16> undef to <vscale x 16 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %nv16si32_to_f16 = sitofp <vscale x 16 x i32> undef to <vscale x 16 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %nv16ui32_to_f16 = uitofp <vscale x 16 x i32> undef to <vscale x 16 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %nv16si64_to_f16 = sitofp <vscale x 16 x i64> undef to <vscale x 16 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %nv16ui64_to_f16 = uitofp <vscale x 16 x i64> undef to <vscale x 16 x half>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 22 for instruction: %nv16si8_to_f32 = sitofp <vscale x 16 x i8> undef to <vscale x 16 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 22 for instruction: %nv16ui8_to_f32 = uitofp <vscale x 16 x i8> undef to <vscale x 16 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 24 for instruction: %nv16si16_to_f32 = sitofp <vscale x 16 x i16> undef to <vscale x 16 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 24 for instruction: %nv16ui16_to_f32 = uitofp <vscale x 16 x i16> undef to <vscale x 16 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %nv16si32_to_f32 = sitofp <vscale x 16 x i32> undef to <vscale x 16 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %nv16ui32_to_f32 = uitofp <vscale x 16 x i32> undef to <vscale x 16 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %nv16si64_to_f32 = sitofp <vscale x 16 x i64> undef to <vscale x 16 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %nv16ui64_to_f32 = uitofp <vscale x 16 x i64> undef to <vscale x 16 x float>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %nv16si8_to_f64 = sitofp <vscale x 16 x i8> undef to <vscale x 16 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %nv16ui8_to_f64 = uitofp <vscale x 16 x i8> undef to <vscale x 16 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 44 for instruction: %nv16si16_to_f64 = sitofp <vscale x 16 x i16> undef to <vscale x 16 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 44 for instruction: %nv16ui16_to_f64 = uitofp <vscale x 16 x i16> undef to <vscale x 16 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 48 for instruction: %nv16si32_to_f64 = sitofp <vscale x 16 x i32> undef to <vscale x 16 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 48 for instruction: %nv16ui32_to_f64 = uitofp <vscale x 16 x i32> undef to <vscale x 16 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %nv16si64_to_f64 = sitofp <vscale x 16 x i64> undef to <vscale x 16 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %nv16ui64_to_f64 = uitofp <vscale x 16 x i64> undef to <vscale x 16 x double>
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+
+  %nv1si8_to_f16  = sitofp <vscale x 1 x i8> undef to <vscale x 1 x half>
+  %nv1ui8_to_f16  = uitofp <vscale x 1 x i8> undef to <vscale x 1 x half>
+  %nv1si16_to_f16 = sitofp <vscale x 1 x i16> undef to <vscale x 1 x half>
+  %nv1ui16_to_f16 = uitofp <vscale x 1 x i16> undef to <vscale x 1 x half>
+  %nv1si32_to_f16 = sitofp <vscale x 1 x i32> undef to <vscale x 1 x half>
+  %nv1ui32_to_f16 = uitofp <vscale x 1 x i32> undef to <vscale x 1 x half>
+  %nv1si64_to_f16 = sitofp <vscale x 1 x i64> undef to <vscale x 1 x half>
+  %nv1ui64_to_f16 = uitofp <vscale x 1 x i64> undef to <vscale x 1 x half>
+
+  %nv1si8_to_f32  = sitofp <vscale x 1 x i8> undef to <vscale x 1 x float>
+  %nv1ui8_to_f32  = uitofp <vscale x 1 x i8> undef to <vscale x 1 x float>
+  %nv1si16_to_f32 = sitofp <vscale x 1 x i16> undef to <vscale x 1 x float>
+  %nv1ui16_to_f32 = uitofp <vscale x 1 x i16> undef to <vscale x 1 x float>
+  %nv1si32_to_f32 = sitofp <vscale x 1 x i32> undef to <vscale x 1 x float>
+  %nv1ui32_to_f32 = uitofp <vscale x 1 x i32> undef to <vscale x 1 x float>
+  %nv1si64_to_f32 = sitofp <vscale x 1 x i64> undef to <vscale x 1 x float>
+  %nv1ui64_to_f32 = uitofp <vscale x 1 x i64> undef to <vscale x 1 x float>
+
+  %nv1si8_to_f64  = sitofp <vscale x 1 x i8> undef to <vscale x 1 x double>
+  %nv1ui8_to_f64  = uitofp <vscale x 1 x i8> undef to <vscale x 1 x double>
+  %nv1si16_to_f64 = sitofp <vscale x 1 x i16> undef to <vscale x 1 x double>
+  %nv1ui16_to_f64 = uitofp <vscale x 1 x i16> undef to <vscale x 1 x double>
+  %nv1si32_to_f64 = sitofp <vscale x 1 x i32> undef to <vscale x 1 x double>
+  %nv1ui32_to_f64 = uitofp <vscale x 1 x i32> undef to <vscale x 1 x double>
+  %nv1si64_to_f64 = sitofp <vscale x 1 x i64> undef to <vscale x 1 x double>
+  %nv1ui64_to_f64 = uitofp <vscale x 1 x i64> undef to <vscale x 1 x double>
+
+  %nv2si8_to_f16  = sitofp <vscale x 2 x i8> undef to <vscale x 2 x half>
+  %nv2ui8_to_f16  = uitofp <vscale x 2 x i8> undef to <vscale x 2 x half>
+  %nv2si16_to_f16 = sitofp <vscale x 2 x i16> undef to <vscale x 2 x half>
+  %nv2ui16_to_f16 = uitofp <vscale x 2 x i16> undef to <vscale x 2 x half>
+  %nv2si32_to_f16 = sitofp <vscale x 2 x i32> undef to <vscale x 2 x half>
+  %nv2ui32_to_f16 = uitofp <vscale x 2 x i32> undef to <vscale x 2 x half>
+  %nv2si64_to_f16 = sitofp <vscale x 2 x i64> undef to <vscale x 2 x half>
+  %nv2ui64_to_f16 = uitofp <vscale x 2 x i64> undef to <vscale x 2 x half>
+
+  %nv2si8_to_f32  = sitofp <vscale x 2 x i8> undef to <vscale x 2 x float>
+  %nv2ui8_to_f32  = uitofp <vscale x 2 x i8> undef to <vscale x 2 x float>
+  %nv2si16_to_f32 = sitofp <vscale x 2 x i16> undef to <vscale x 2 x float>
+  %nv2ui16_to_f32 = uitofp <vscale x 2 x i16> undef to <vscale x 2 x float>
+  %nv2si32_to_f32 = sitofp <vscale x 2 x i32> undef to <vscale x 2 x float>
+  %nv2ui32_to_f32 = uitofp <vscale x 2 x i32> undef to <vscale x 2 x float>
+  %nv2si64_to_f32 = sitofp <vscale x 2 x i64> undef to <vscale x 2 x float>
+  %nv2ui64_to_f32 = uitofp <vscale x 2 x i64> undef to <vscale x 2 x float>
+
+  %nv2si8_to_f64  = sitofp <vscale x 2 x i8> undef to <vscale x 2 x double>
+  %nv2ui8_to_f64  = uitofp <vscale x 2 x i8> undef to <vscale x 2 x double>
+  %nv2si16_to_f64 = sitofp <vscale x 2 x i16> undef to <vscale x 2 x double>
+  %nv2ui16_to_f64 = uitofp <vscale x 2 x i16> undef to <vscale x 2 x double>
+  %nv2si32_to_f64 = sitofp <vscale x 2 x i32> undef to <vscale x 2 x double>
+  %nv2ui32_to_f64 = uitofp <vscale x 2 x i32> undef to <vscale x 2 x double>
+  %nv2si64_to_f64 = sitofp <vscale x 2 x i64> undef to <vscale x 2 x double>
+  %nv2ui64_to_f64 = uitofp <vscale x 2 x i64> undef to <vscale x 2 x double>
+
+  %nv4si8_to_f16  = sitofp <vscale x 4 x i8> undef to <vscale x 4 x half>
+  %nv4ui8_to_f16  = uitofp <vscale x 4 x i8> undef to <vscale x 4 x half>
+  %nv4si16_to_f16 = sitofp <vscale x 4 x i16> undef to <vscale x 4 x half>
+  %nv4ui16_to_f16 = uitofp <vscale x 4 x i16> undef to <vscale x 4 x half>
+  %nv4si32_to_f16 = sitofp <vscale x 4 x i32> undef to <vscale x 4 x half>
+  %nv4ui32_to_f16 = uitofp <vscale x 4 x i32> undef to <vscale x 4 x half>
+  %nv4si64_to_f16 = sitofp <vscale x 4 x i64> undef to <vscale x 4 x half>
+  %nv4ui64_to_f16 = uitofp <vscale x 4 x i64> undef to <vscale x 4 x half>
+
+  %nv4si8_to_f32  = sitofp <vscale x 4 x i8> undef to <vscale x 4 x float>
+  %nv4ui8_to_f32  = uitofp <vscale x 4 x i8> undef to <vscale x 4 x float>
+  %nv4si16_to_f32 = sitofp <vscale x 4 x i16> undef to <vscale x 4 x float>
+  %nv4ui16_to_f32 = uitofp <vscale x 4 x i16> undef to <vscale x 4 x float>
+  %nv4si32_to_f32 = sitofp <vscale x 4 x i32> undef to <vscale x 4 x float>
+  %nv4ui32_to_f32 = uitofp <vscale x 4 x i32> undef to <vscale x 4 x float>
+  %nv4si64_to_f32 = sitofp <vscale x 4 x i64> undef to <vscale x 4 x float>
+  %nv4ui64_to_f32 = uitofp <vscale x 4 x i64> undef to <vscale x 4 x float>
+
+  %nv4si8_to_f64  = sitofp <vscale x 4 x i8> undef to <vscale x 4 x double>
+  %nv4ui8_to_f64  = uitofp <vscale x 4 x i8> undef to <vscale x 4 x double>
+  %nv4si16_to_f64 = sitofp <vscale x 4 x i16> undef to <vscale x 4 x double>
+  %nv4ui16_to_f64 = uitofp <vscale x 4 x i16> undef to <vscale x 4 x double>
+  %nv4si32_to_f64 = sitofp <vscale x 4 x i32> undef to <vscale x 4 x double>
+  %nv4ui32_to_f64 = uitofp <vscale x 4 x i32> undef to <vscale x 4 x double>
+  %nv4si64_to_f64 = sitofp <vscale x 4 x i64> undef to <vscale x 4 x double>
+  %nv4ui64_to_f64 = uitofp <vscale x 4 x i64> undef to <vscale x 4 x double>
+
+  %nv8si8_to_f16  = sitofp <vscale x 8 x i8> undef to <vscale x 8 x half>
+  %nv8ui8_to_f16  = uitofp <vscale x 8 x i8> undef to <vscale x 8 x half>
+  %nv8si16_to_f16 = sitofp <vscale x 8 x i16> undef to <vscale x 8 x half>
+  %nv8ui16_to_f16 = uitofp <vscale x 8 x i16> undef to <vscale x 8 x half>
+  %nv8si32_to_f16 = sitofp <vscale x 8 x i32> undef to <vscale x 8 x half>
+  %nv8ui32_to_f16 = uitofp <vscale x 8 x i32> undef to <vscale x 8 x half>
+  %nv8si64_to_f16 = sitofp <vscale x 8 x i64> undef to <vscale x 8 x half>
+  %nv8ui64_to_f16 = uitofp <vscale x 8 x i64> undef to <vscale x 8 x half>
+
+  %nv8si8_to_f32  = sitofp <vscale x 8 x i8> undef to <vscale x 8 x float>
+  %nv8ui8_to_f32  = uitofp <vscale x 8 x i8> undef to <vscale x 8 x float>
+  %nv8si16_to_f32 = sitofp <vscale x 8 x i16> undef to <vscale x 8 x float>
+  %nv8ui16_to_f32 = uitofp <vscale x 8 x i16> undef to <vscale x 8 x float>
+  %nv8si32_to_f32 = sitofp <vscale x 8 x i32> undef to <vscale x 8 x float>
+  %nv8ui32_to_f32 = uitofp <vscale x 8 x i32> undef to <vscale x 8 x float>
+  %nv8si64_to_f32 = sitofp <vscale x 8 x i64> undef to <vscale x 8 x float>
+  %nv8ui64_to_f32 = uitofp <vscale x 8 x i64> undef to <vscale x 8 x float>
+
+  %nv8si8_to_f64  = sitofp <vscale x 8 x i8> undef to <vscale x 8 x double>
+  %nv8ui8_to_f64  = uitofp <vscale x 8 x i8> undef to <vscale x 8 x double>
+  %nv8si16_to_f64 = sitofp <vscale x 8 x i16> undef to <vscale x 8 x double>
+  %nv8ui16_to_f64 = uitofp <vscale x 8 x i16> undef to <vscale x 8 x double>
+  %nv8si32_to_f64 = sitofp <vscale x 8 x i32> undef to <vscale x 8 x double>
+  %nv8ui32_to_f64 = uitofp <vscale x 8 x i32> undef to <vscale x 8 x double>
+  %nv8si64_to_f64 = sitofp <vscale x 8 x i64> undef to <vscale x 8 x double>
+  %nv8ui64_to_f64 = uitofp <vscale x 8 x i64> undef to <vscale x 8 x double>
+
+  %nv16si8_to_f16  = sitofp <vscale x 16 x i8> undef to <vscale x 16 x half>
+  %nv16ui8_to_f16  = uitofp <vscale x 16 x i8> undef to <vscale x 16 x half>
+  %nv16si16_to_f16 = sitofp <vscale x 16 x i16> undef to <vscale x 16 x half>
+  %nv16ui16_to_f16 = uitofp <vscale x 16 x i16> undef to <vscale x 16 x half>
+  %nv16si32_to_f16 = sitofp <vscale x 16 x i32> undef to <vscale x 16 x half>
+  %nv16ui32_to_f16 = uitofp <vscale x 16 x i32> undef to <vscale x 16 x half>
+  %nv16si64_to_f16 = sitofp <vscale x 16 x i64> undef to <vscale x 16 x half>
+  %nv16ui64_to_f16 = uitofp <vscale x 16 x i64> undef to <vscale x 16 x half>
+
+  %nv16si8_to_f32  = sitofp <vscale x 16 x i8> undef to <vscale x 16 x float>
+  %nv16ui8_to_f32  = uitofp <vscale x 16 x i8> undef to <vscale x 16 x float>
+  %nv16si16_to_f32 = sitofp <vscale x 16 x i16> undef to <vscale x 16 x float>
+  %nv16ui16_to_f32 = uitofp <vscale x 16 x i16> undef to <vscale x 16 x float>
+  %nv16si32_to_f32 = sitofp <vscale x 16 x i32> undef to <vscale x 16 x float>
+  %nv16ui32_to_f32 = uitofp <vscale x 16 x i32> undef to <vscale x 16 x float>
+  %nv16si64_to_f32 = sitofp <vscale x 16 x i64> undef to <vscale x 16 x float>
+  %nv16ui64_to_f32 = uitofp <vscale x 16 x i64> undef to <vscale x 16 x float>
+
+  %nv16si8_to_f64  = sitofp <vscale x 16 x i8> undef to <vscale x 16 x double>
+  %nv16ui8_to_f64  = uitofp <vscale x 16 x i8> undef to <vscale x 16 x double>
+  %nv16si16_to_f64 = sitofp <vscale x 16 x i16> undef to <vscale x 16 x double>
+  %nv16ui16_to_f64 = uitofp <vscale x 16 x i16> undef to <vscale x 16 x double>
+  %nv16si32_to_f64 = sitofp <vscale x 16 x i32> undef to <vscale x 16 x double>
+  %nv16ui32_to_f64 = uitofp <vscale x 16 x i32> undef to <vscale x 16 x double>
+  %nv16si64_to_f64 = sitofp <vscale x 16 x i64> undef to <vscale x 16 x double>
+  %nv16ui64_to_f64 = uitofp <vscale x 16 x i64> undef to <vscale x 16 x double>
+
+  ret void
+}
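
To help read the expected costs in the test, here is a small sketch of how the new entries relate to the wider cases that have no entry of their own. It is only a stand-in: the real table is keyed on ISD opcodes and MVTs, while the `ItoFPCostEntry`, `lookupItoFPCost` and `scaledCost` names below are hypothetical. The scaling rule is an assumption inferred from the CHECK lines (for example, 24 for `<vscale x 8 x i32>` to `<vscale x 8 x double>` is twice the 12 listed for nxv4i32 to nxv4f64), not a description of the legalization code.

```cpp
#include <cstring>

// Illustrative stand-in for the rows added by the patch; strings replace the
// MVT enums used by the real table.
struct ItoFPCostEntry {
  const char *Dst;
  const char *Src;
  int Cost;
};

// sitofp and uitofp get identical costs in the patch, so one row per type pair.
static const ItoFPCostEntry NewEntries[] = {
    {"nxv16f16", "nxv16i8", 12}, {"nxv16f32", "nxv16i8", 22},
    {"nxv8f32", "nxv8i16", 12},  {"nxv8f64", "nxv8i16", 22},
    {"nxv4f64", "nxv4i32", 12},
};

// Hypothetical lookup: returns the table cost, or -1 when there is no row.
int lookupItoFPCost(const char *Dst, const char *Src) {
  for (const ItoFPCostEntry &E : NewEntries)
    if (std::strcmp(E.Dst, Dst) == 0 && std::strcmp(E.Src, Src) == 0)
      return E.Cost;
  return -1;
}

// Assumption: a conversion Parts times wider than a table row is costed as
// Parts copies of that row, which is what the CHECK lines suggest.
int scaledCost(const char *Dst, const char *Src, int Parts) {
  int Cost = lookupItoFPCost(Dst, Src);
  return Cost < 0 ? -1 : Cost * Parts;
}
```

Under that assumption, `scaledCost("nxv4f64", "nxv4i32", 4)` gives 48, matching the expected cost for `<vscale x 16 x i32>` to `<vscale x 16 x double>`, and `scaledCost("nxv8f64", "nxv8i16", 2)` gives 44, matching `<vscale x 16 x i16>` to `<vscale x 16 x double>`.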


