[llvm] [RISCV] Cost @llvm.vector.{extract, insert} as free at index 0 (PR #81818)

Luke Lau via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 14 20:30:52 PST 2024


https://github.com/lukel97 created https://github.com/llvm/llvm-project/pull/81818

In #81751 we still weren't costing extracts of scalable subvectors from scalable vectors at index 0 as free.
It turns out that if the subvector to extract is scalable, then `getIntrinsicInstrCost` is used instead of `getShuffleCost`. This handles the index = 0 case for the vector insert and extract intrinsics inside said hook.

Note we'll still need to keep the existing logic inside `getShuffleCost`, since anything that's not:

- a scalable extract of a scalable vector or
- a scalable insert into a scalable vector

will still go down that path. As well as existing fixed-length `shufflevector`s.

Also note that there's some shortcut logic in `BasicTTImplBase::getIntrinsicInstrCost` where if the target `getIntrinsicInstrCost` is free, it won't bother calling into `getShuffleCost`:

https://github.com/llvm/llvm-project/blob/fc0b67e1d79d1f199687f8f06d619984d9520230/llvm/include/llvm/CodeGen/BasicTTIImpl.h#L1534-L1538


>From 49ee1175b7bb2758e8a08d9fbfeccfe8e34bba94 Mon Sep 17 00:00:00 2001
From: Luke Lau <luke at igalia.com>
Date: Thu, 15 Feb 2024 12:00:40 +0800
Subject: [PATCH] [RISCV] Cost @llvm.vector.{extract,insert} as free at index 0

This builds upon #81751 by handling the @llvm.vector.{extract,insert}
intrinsics.

I believe we'll need the logic in both places as fixed vector extracts/inserts
of fixed vectors will use the shuffle cost hook, whereas anything with a
scalable return type will use the intrinsic cost hook.
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  28 +++++
 .../Analysis/CostModel/RISCV/rvv-shuffle.ll   |  12 +--
 .../CostModel/RISCV/rvv-vectorextract.ll      |  84 +++++++--------
 .../CostModel/RISCV/rvv-vectorinsert.ll       | 100 +++++++++---------
 4 files changed, 126 insertions(+), 98 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index d1db47a6061e4e..81d2b7cc1353af 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -809,6 +809,34 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
     }
     break;
   }
+  case Intrinsic::vector_extract: {
+    // A vector extract at index 0 is a (free) subregister extract.
+    if (auto *CIdx = dyn_cast<ConstantInt>(ICA.getArgs()[1]);
+        CIdx && CIdx->isZero())
+      return TTI::TCC_Free;
+    break;
+  }
+  case Intrinsic::vector_insert: {
+    auto FitsSubreg = [this](Type *Ty) {
+      if (!isa<ScalableVectorType>(Ty))
+        return false;
+      // Any scalable vector LMUL >= 1 will fit exactly into a register group.
+      auto [_Cost, LT] = getTypeLegalizationCost(Ty);
+      auto [_Coeff, Fractional] =
+          RISCVVType::decodeVLMUL(RISCVTargetLowering::getLMUL(LT));
+      return !Fractional;
+    };
+
+    // A vector insert at index 0 is a (free) subregister insert if:
+    //
+    // - The subvec fits exactly into a register group or
+    // - The vector is undef
+    if (auto *CIdx = dyn_cast<ConstantInt>(ICA.getArgs()[2]);
+        CIdx && CIdx->isZero() &&
+        (FitsSubreg(ICA.getArgTypes()[1]) || isa<UndefValue>(ICA.getArgs()[0])))
+      return TTI::TCC_Free;
+    break;
+  }
   // TODO: add more intrinsic
   case Intrinsic::experimental_stepvector: {
     unsigned Cost = 1; // vid
diff --git a/llvm/test/Analysis/CostModel/RISCV/rvv-shuffle.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-shuffle.ll
index 4f3c7e2f90c655..348a6cf380e97d 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-shuffle.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-shuffle.ll
@@ -52,17 +52,17 @@ define void  @vector_broadcast() {
 
 define void @vector_insert_extract(<vscale x 4 x i32> %v0, <vscale x 16 x i32> %v1, <16 x i32> %v2) {
 ; CHECK-LABEL: 'vector_insert_extract'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %extract_fixed_from_scalable = call <16 x i32> @llvm.vector.extract.v16i32.nxv4i32(<vscale x 4 x i32> %v0, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %extract_fixed_from_scalable = call <16 x i32> @llvm.vector.extract.v16i32.nxv4i32(<vscale x 4 x i32> %v0, i64 0)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %insert_fixed_into_scalable = call <vscale x 4 x i32> @llvm.vector.insert.nxv4i32.v16i32(<vscale x 4 x i32> %v0, <16 x i32> %v2, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %extract_scalable_from_scalable = call <vscale x 4 x i32> @llvm.vector.extract.nxv4i32.nxv16i32(<vscale x 16 x i32> %v1, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %insert_scalable_into_scalable = call <vscale x 16 x i32> @llvm.vector.insert.nxv16i32.nxv4i32(<vscale x 16 x i32> %v1, <vscale x 4 x i32> %v0, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %extract_scalable_from_scalable = call <vscale x 4 x i32> @llvm.vector.extract.nxv4i32.nxv16i32(<vscale x 16 x i32> %v1, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %insert_scalable_into_scalable = call <vscale x 16 x i32> @llvm.vector.insert.nxv16i32.nxv4i32(<vscale x 16 x i32> %v1, <vscale x 4 x i32> %v0, i64 0)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; SIZE-LABEL: 'vector_insert_extract'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %extract_fixed_from_scalable = call <16 x i32> @llvm.vector.extract.v16i32.nxv4i32(<vscale x 4 x i32> %v0, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %extract_fixed_from_scalable = call <16 x i32> @llvm.vector.extract.v16i32.nxv4i32(<vscale x 4 x i32> %v0, i64 0)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %insert_fixed_into_scalable = call <vscale x 4 x i32> @llvm.vector.insert.nxv4i32.v16i32(<vscale x 4 x i32> %v0, <16 x i32> %v2, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %extract_scalable_from_scalable = call <vscale x 4 x i32> @llvm.vector.extract.nxv4i32.nxv16i32(<vscale x 16 x i32> %v1, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %insert_scalable_into_scalable = call <vscale x 16 x i32> @llvm.vector.insert.nxv16i32.nxv4i32(<vscale x 16 x i32> %v1, <vscale x 4 x i32> %v0, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %extract_scalable_from_scalable = call <vscale x 4 x i32> @llvm.vector.extract.nxv4i32.nxv16i32(<vscale x 16 x i32> %v1, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %insert_scalable_into_scalable = call <vscale x 16 x i32> @llvm.vector.insert.nxv16i32.nxv4i32(<vscale x 16 x i32> %v1, <vscale x 4 x i32> %v0, i64 0)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
   %extract_fixed_from_scalable = call <16 x i32> @llvm.vector.extract.v16i32.nxv4i32(<vscale x 4 x i32> %v0, i64 0)
diff --git a/llvm/test/Analysis/CostModel/RISCV/rvv-vectorextract.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-vectorextract.ll
index 1e2d1f4d94954e..c4653ace9bac09 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-vectorextract.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-vectorextract.ll
@@ -4,37 +4,37 @@
 
 define void @vector_extract_nxv128i8_0(<vscale x 128 x i8> %v) {
 ; CHECK-LABEL: 'vector_extract_nxv128i8_0'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf8 = call <vscale x 1 x i8> @llvm.vector.extract.nxv1i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf4 = call <vscale x 2 x i8> @llvm.vector.extract.nxv2i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf2 = call <vscale x 4 x i8> @llvm.vector.extract.nxv4i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m1 = call <vscale x 8 x i8> @llvm.vector.extract.nxv8i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m2 = call <vscale x 16 x i8> @llvm.vector.extract.nxv16i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m4 = call <vscale x 32 x i8> @llvm.vector.extract.nxv32i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m8 = call <vscale x 64 x i8> @llvm.vector.extract.nxv64i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_mf8 = call <2 x i8> @llvm.vector.extract.v2i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_mf4 = call <4 x i8> @llvm.vector.extract.v4i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_mf2 = call <8 x i8> @llvm.vector.extract.v8i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_m1 = call <16 x i8> @llvm.vector.extract.v16i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_m2 = call <32 x i8> @llvm.vector.extract.v32i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_m4 = call <64 x i8> @llvm.vector.extract.v64i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_m8 = call <128 x i8> @llvm.vector.extract.v128i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_mf8 = call <vscale x 1 x i8> @llvm.vector.extract.nxv1i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_mf4 = call <vscale x 2 x i8> @llvm.vector.extract.nxv2i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_mf2 = call <vscale x 4 x i8> @llvm.vector.extract.nxv4i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m1 = call <vscale x 8 x i8> @llvm.vector.extract.nxv8i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m2 = call <vscale x 16 x i8> @llvm.vector.extract.nxv16i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m4 = call <vscale x 32 x i8> @llvm.vector.extract.nxv32i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m8 = call <vscale x 64 x i8> @llvm.vector.extract.nxv64i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf8 = call <2 x i8> @llvm.vector.extract.v2i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf4 = call <4 x i8> @llvm.vector.extract.v4i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf2 = call <8 x i8> @llvm.vector.extract.v8i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m1 = call <16 x i8> @llvm.vector.extract.v16i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m2 = call <32 x i8> @llvm.vector.extract.v32i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m4 = call <64 x i8> @llvm.vector.extract.v64i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m8 = call <128 x i8> @llvm.vector.extract.v128i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; SIZE-LABEL: 'vector_extract_nxv128i8_0'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf8 = call <vscale x 1 x i8> @llvm.vector.extract.nxv1i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf4 = call <vscale x 2 x i8> @llvm.vector.extract.nxv2i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf2 = call <vscale x 4 x i8> @llvm.vector.extract.nxv4i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m1 = call <vscale x 8 x i8> @llvm.vector.extract.nxv8i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m2 = call <vscale x 16 x i8> @llvm.vector.extract.nxv16i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m4 = call <vscale x 32 x i8> @llvm.vector.extract.nxv32i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m8 = call <vscale x 64 x i8> @llvm.vector.extract.nxv64i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_mf8 = call <2 x i8> @llvm.vector.extract.v2i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_mf4 = call <4 x i8> @llvm.vector.extract.v4i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_mf2 = call <8 x i8> @llvm.vector.extract.v8i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_m1 = call <16 x i8> @llvm.vector.extract.v16i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_m2 = call <32 x i8> @llvm.vector.extract.v32i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_m4 = call <64 x i8> @llvm.vector.extract.v64i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_m8 = call <128 x i8> @llvm.vector.extract.v128i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_mf8 = call <vscale x 1 x i8> @llvm.vector.extract.nxv1i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_mf4 = call <vscale x 2 x i8> @llvm.vector.extract.nxv2i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_mf2 = call <vscale x 4 x i8> @llvm.vector.extract.nxv4i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m1 = call <vscale x 8 x i8> @llvm.vector.extract.nxv8i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m2 = call <vscale x 16 x i8> @llvm.vector.extract.nxv16i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m4 = call <vscale x 32 x i8> @llvm.vector.extract.nxv32i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m8 = call <vscale x 64 x i8> @llvm.vector.extract.nxv64i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf8 = call <2 x i8> @llvm.vector.extract.v2i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf4 = call <4 x i8> @llvm.vector.extract.v4i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf2 = call <8 x i8> @llvm.vector.extract.v8i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m1 = call <16 x i8> @llvm.vector.extract.v16i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m2 = call <32 x i8> @llvm.vector.extract.v32i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m4 = call <64 x i8> @llvm.vector.extract.v64i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m8 = call <128 x i8> @llvm.vector.extract.v128i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
   %scalable_mf8 = call <vscale x 1 x i8> @llvm.vector.extract.nxv1i8.nxv128i8(<vscale x 128 x i8> %v, i64 0)
@@ -110,23 +110,23 @@ define void @vector_extract_nxv128i8_1(<vscale x 128 x i8> %v) {
 
 define void @vector_extract_v128i8_0(<128 x i8> %v) {
 ; CHECK-LABEL: 'vector_extract_v128i8_0'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_mf8 = call <2 x i8> @llvm.vector.extract.v2i8.v128i8(<128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_mf4 = call <4 x i8> @llvm.vector.extract.v4i8.v128i8(<128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_mf2 = call <8 x i8> @llvm.vector.extract.v8i8.v128i8(<128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_m1 = call <16 x i8> @llvm.vector.extract.v16i8.v128i8(<128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_m2 = call <32 x i8> @llvm.vector.extract.v32i8.v128i8(<128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_m4 = call <64 x i8> @llvm.vector.extract.v64i8.v128i8(<128 x i8> %v, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_m8 = call <128 x i8> @llvm.vector.extract.v128i8.v128i8(<128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf8 = call <2 x i8> @llvm.vector.extract.v2i8.v128i8(<128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf4 = call <4 x i8> @llvm.vector.extract.v4i8.v128i8(<128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf2 = call <8 x i8> @llvm.vector.extract.v8i8.v128i8(<128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m1 = call <16 x i8> @llvm.vector.extract.v16i8.v128i8(<128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m2 = call <32 x i8> @llvm.vector.extract.v32i8.v128i8(<128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m4 = call <64 x i8> @llvm.vector.extract.v64i8.v128i8(<128 x i8> %v, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m8 = call <128 x i8> @llvm.vector.extract.v128i8.v128i8(<128 x i8> %v, i64 0)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; SIZE-LABEL: 'vector_extract_v128i8_0'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_mf8 = call <2 x i8> @llvm.vector.extract.v2i8.v128i8(<128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_mf4 = call <4 x i8> @llvm.vector.extract.v4i8.v128i8(<128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_mf2 = call <8 x i8> @llvm.vector.extract.v8i8.v128i8(<128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_m1 = call <16 x i8> @llvm.vector.extract.v16i8.v128i8(<128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_m2 = call <32 x i8> @llvm.vector.extract.v32i8.v128i8(<128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_m4 = call <64 x i8> @llvm.vector.extract.v64i8.v128i8(<128 x i8> %v, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_m8 = call <128 x i8> @llvm.vector.extract.v128i8.v128i8(<128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf8 = call <2 x i8> @llvm.vector.extract.v2i8.v128i8(<128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf4 = call <4 x i8> @llvm.vector.extract.v4i8.v128i8(<128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf2 = call <8 x i8> @llvm.vector.extract.v8i8.v128i8(<128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m1 = call <16 x i8> @llvm.vector.extract.v16i8.v128i8(<128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m2 = call <32 x i8> @llvm.vector.extract.v32i8.v128i8(<128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m4 = call <64 x i8> @llvm.vector.extract.v64i8.v128i8(<128 x i8> %v, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m8 = call <128 x i8> @llvm.vector.extract.v128i8.v128i8(<128 x i8> %v, i64 0)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
   %fixed_mf8 = call <2 x i8> @llvm.vector.extract.v2i8.v128i8(<128 x i8> %v, i64 0)
diff --git a/llvm/test/Analysis/CostModel/RISCV/rvv-vectorinsert.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-vectorinsert.ll
index 7a9f45cb7af50b..3dbcda2ac77645 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-vectorinsert.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-vectorinsert.ll
@@ -7,10 +7,10 @@ define void @vector_insert_nxv128i8_0(<vscale x 128 x i8> %v) {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv1i8(<vscale x 128 x i8> %v, <vscale x 1 x i8> undef, i64 0)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv2i8(<vscale x 128 x i8> %v, <vscale x 2 x i8> undef, i64 0)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv4i8(<vscale x 128 x i8> %v, <vscale x 4 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m1 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv8i8(<vscale x 128 x i8> %v, <vscale x 8 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv16i8(<vscale x 128 x i8> %v, <vscale x 16 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv32i8(<vscale x 128 x i8> %v, <vscale x 32 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv64i8(<vscale x 128 x i8> %v, <vscale x 64 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m1 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv8i8(<vscale x 128 x i8> %v, <vscale x 8 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv16i8(<vscale x 128 x i8> %v, <vscale x 16 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv32i8(<vscale x 128 x i8> %v, <vscale x 32 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv64i8(<vscale x 128 x i8> %v, <vscale x 64 x i8> undef, i64 0)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v2i8(<vscale x 128 x i8> %v, <2 x i8> undef, i64 0)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_mf4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v4i8(<vscale x 128 x i8> %v, <4 x i8> undef, i64 0)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_mf2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v8i8(<vscale x 128 x i8> %v, <8 x i8> undef, i64 0)
@@ -24,10 +24,10 @@ define void @vector_insert_nxv128i8_0(<vscale x 128 x i8> %v) {
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv1i8(<vscale x 128 x i8> %v, <vscale x 1 x i8> undef, i64 0)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv2i8(<vscale x 128 x i8> %v, <vscale x 2 x i8> undef, i64 0)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv4i8(<vscale x 128 x i8> %v, <vscale x 4 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m1 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv8i8(<vscale x 128 x i8> %v, <vscale x 8 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv16i8(<vscale x 128 x i8> %v, <vscale x 16 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv32i8(<vscale x 128 x i8> %v, <vscale x 32 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv64i8(<vscale x 128 x i8> %v, <vscale x 64 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m1 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv8i8(<vscale x 128 x i8> %v, <vscale x 8 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv16i8(<vscale x 128 x i8> %v, <vscale x 16 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv32i8(<vscale x 128 x i8> %v, <vscale x 32 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv64i8(<vscale x 128 x i8> %v, <vscale x 64 x i8> undef, i64 0)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v2i8(<vscale x 128 x i8> %v, <2 x i8> undef, i64 0)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_mf4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v4i8(<vscale x 128 x i8> %v, <4 x i8> undef, i64 0)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_mf2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v8i8(<vscale x 128 x i8> %v, <8 x i8> undef, i64 0)
@@ -57,37 +57,37 @@ define void @vector_insert_nxv128i8_0(<vscale x 128 x i8> %v) {
 
 define void @vector_insert_nxv128i8_undef_0() {
 ; CHECK-LABEL: 'vector_insert_nxv128i8_undef_0'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv1i8(<vscale x 128 x i8> undef, <vscale x 1 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv2i8(<vscale x 128 x i8> undef, <vscale x 2 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv4i8(<vscale x 128 x i8> undef, <vscale x 4 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m1 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv8i8(<vscale x 128 x i8> undef, <vscale x 8 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv16i8(<vscale x 128 x i8> undef, <vscale x 16 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv32i8(<vscale x 128 x i8> undef, <vscale x 32 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv64i8(<vscale x 128 x i8> undef, <vscale x 64 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v2i8(<vscale x 128 x i8> undef, <2 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_mf4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v4i8(<vscale x 128 x i8> undef, <4 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_mf2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v8i8(<vscale x 128 x i8> undef, <8 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_m1 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v16i8(<vscale x 128 x i8> undef, <16 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_m2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v32i8(<vscale x 128 x i8> undef, <32 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_m4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v64i8(<vscale x 128 x i8> undef, <64 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %fixed_m8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v128i8(<vscale x 128 x i8> undef, <128 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv1i8(<vscale x 128 x i8> undef, <vscale x 1 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_mf4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv2i8(<vscale x 128 x i8> undef, <vscale x 2 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_mf2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv4i8(<vscale x 128 x i8> undef, <vscale x 4 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m1 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv8i8(<vscale x 128 x i8> undef, <vscale x 8 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv16i8(<vscale x 128 x i8> undef, <vscale x 16 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv32i8(<vscale x 128 x i8> undef, <vscale x 32 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv64i8(<vscale x 128 x i8> undef, <vscale x 64 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v2i8(<vscale x 128 x i8> undef, <2 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v4i8(<vscale x 128 x i8> undef, <4 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v8i8(<vscale x 128 x i8> undef, <8 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m1 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v16i8(<vscale x 128 x i8> undef, <16 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v32i8(<vscale x 128 x i8> undef, <32 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v64i8(<vscale x 128 x i8> undef, <64 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v128i8(<vscale x 128 x i8> undef, <128 x i8> undef, i64 0)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; SIZE-LABEL: 'vector_insert_nxv128i8_undef_0'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv1i8(<vscale x 128 x i8> undef, <vscale x 1 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv2i8(<vscale x 128 x i8> undef, <vscale x 2 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_mf2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv4i8(<vscale x 128 x i8> undef, <vscale x 4 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m1 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv8i8(<vscale x 128 x i8> undef, <vscale x 8 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv16i8(<vscale x 128 x i8> undef, <vscale x 16 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv32i8(<vscale x 128 x i8> undef, <vscale x 32 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %scalable_m8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv64i8(<vscale x 128 x i8> undef, <vscale x 64 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v2i8(<vscale x 128 x i8> undef, <2 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_mf4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v4i8(<vscale x 128 x i8> undef, <4 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_mf2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v8i8(<vscale x 128 x i8> undef, <8 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_m1 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v16i8(<vscale x 128 x i8> undef, <16 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_m2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v32i8(<vscale x 128 x i8> undef, <32 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_m4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v64i8(<vscale x 128 x i8> undef, <64 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %fixed_m8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v128i8(<vscale x 128 x i8> undef, <128 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv1i8(<vscale x 128 x i8> undef, <vscale x 1 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_mf4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv2i8(<vscale x 128 x i8> undef, <vscale x 2 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_mf2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv4i8(<vscale x 128 x i8> undef, <vscale x 4 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m1 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv8i8(<vscale x 128 x i8> undef, <vscale x 8 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv16i8(<vscale x 128 x i8> undef, <vscale x 16 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv32i8(<vscale x 128 x i8> undef, <vscale x 32 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %scalable_m8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv64i8(<vscale x 128 x i8> undef, <vscale x 64 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v2i8(<vscale x 128 x i8> undef, <2 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v4i8(<vscale x 128 x i8> undef, <4 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v8i8(<vscale x 128 x i8> undef, <8 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m1 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v16i8(<vscale x 128 x i8> undef, <16 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m2 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v32i8(<vscale x 128 x i8> undef, <32 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m4 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v64i8(<vscale x 128 x i8> undef, <64 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.v128i8(<vscale x 128 x i8> undef, <128 x i8> undef, i64 0)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
   %scalable_mf8 = call <vscale x 128 x i8> @llvm.vector.insert.nxv128i8.nxv1i8(<vscale x 128 x i8> undef, <vscale x 1 x i8> undef, i64 0)
@@ -247,23 +247,23 @@ define void @vector_insert_v128i8_0(<128 x i8> %v) {
 
 define void @vector_insert_v128i8_undef_0() {
 ; CHECK-LABEL: 'vector_insert_v128i8_undef_0'
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_mf8 = call <128 x i8> @llvm.vector.insert.v128i8.v2i8(<128 x i8> undef, <2 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_mf4 = call <128 x i8> @llvm.vector.insert.v128i8.v4i8(<128 x i8> undef, <4 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_mf2 = call <128 x i8> @llvm.vector.insert.v128i8.v8i8(<128 x i8> undef, <8 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_m1 = call <128 x i8> @llvm.vector.insert.v128i8.v16i8(<128 x i8> undef, <16 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_m2 = call <128 x i8> @llvm.vector.insert.v128i8.v32i8(<128 x i8> undef, <32 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_m4 = call <128 x i8> @llvm.vector.insert.v128i8.v64i8(<128 x i8> undef, <64 x i8> undef, i64 0)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %fixed_m8 = call <128 x i8> @llvm.vector.insert.v128i8.v128i8(<128 x i8> undef, <128 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf8 = call <128 x i8> @llvm.vector.insert.v128i8.v2i8(<128 x i8> undef, <2 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf4 = call <128 x i8> @llvm.vector.insert.v128i8.v4i8(<128 x i8> undef, <4 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf2 = call <128 x i8> @llvm.vector.insert.v128i8.v8i8(<128 x i8> undef, <8 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m1 = call <128 x i8> @llvm.vector.insert.v128i8.v16i8(<128 x i8> undef, <16 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m2 = call <128 x i8> @llvm.vector.insert.v128i8.v32i8(<128 x i8> undef, <32 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m4 = call <128 x i8> @llvm.vector.insert.v128i8.v64i8(<128 x i8> undef, <64 x i8> undef, i64 0)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m8 = call <128 x i8> @llvm.vector.insert.v128i8.v128i8(<128 x i8> undef, <128 x i8> undef, i64 0)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; SIZE-LABEL: 'vector_insert_v128i8_undef_0'
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_mf8 = call <128 x i8> @llvm.vector.insert.v128i8.v2i8(<128 x i8> undef, <2 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_mf4 = call <128 x i8> @llvm.vector.insert.v128i8.v4i8(<128 x i8> undef, <4 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_mf2 = call <128 x i8> @llvm.vector.insert.v128i8.v8i8(<128 x i8> undef, <8 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_m1 = call <128 x i8> @llvm.vector.insert.v128i8.v16i8(<128 x i8> undef, <16 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_m2 = call <128 x i8> @llvm.vector.insert.v128i8.v32i8(<128 x i8> undef, <32 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_m4 = call <128 x i8> @llvm.vector.insert.v128i8.v64i8(<128 x i8> undef, <64 x i8> undef, i64 0)
-; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %fixed_m8 = call <128 x i8> @llvm.vector.insert.v128i8.v128i8(<128 x i8> undef, <128 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf8 = call <128 x i8> @llvm.vector.insert.v128i8.v2i8(<128 x i8> undef, <2 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf4 = call <128 x i8> @llvm.vector.insert.v128i8.v4i8(<128 x i8> undef, <4 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_mf2 = call <128 x i8> @llvm.vector.insert.v128i8.v8i8(<128 x i8> undef, <8 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m1 = call <128 x i8> @llvm.vector.insert.v128i8.v16i8(<128 x i8> undef, <16 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m2 = call <128 x i8> @llvm.vector.insert.v128i8.v32i8(<128 x i8> undef, <32 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m4 = call <128 x i8> @llvm.vector.insert.v128i8.v64i8(<128 x i8> undef, <64 x i8> undef, i64 0)
+; SIZE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %fixed_m8 = call <128 x i8> @llvm.vector.insert.v128i8.v128i8(<128 x i8> undef, <128 x i8> undef, i64 0)
 ; SIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
   %fixed_mf8 = call <128 x i8> @llvm.vector.insert.v128i8.v2i8(<128 x i8> undef, <2 x i8> undef, i64 0)



More information about the llvm-commits mailing list