[llvm] [RISCV][CostModel] Estimate cost of Extract/InsertElement with non-constant index when vector instructions are not available (PR #67334)

Sergey Kachkov via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 25 07:20:44 PDT 2023


https://github.com/skachkov-sc created https://github.com/llvm/llvm-project/pull/67334

This patch fixes the compilation time issue of matrix-types-spec test from test-suite.

Reproduction of the problem:
```
clang++ -DNDEBUG --target=riscv64-linux-gnu --sysroot=<sysroot path> --gcc-toolchain=<gcc path> -O2 -fenable-matrix <test-suite-path>/SingleSource/UnitTests/matrix-types-spec.cpp
```

On my machine, compilation takes 50.44s. In comparison, the same test with RVV (-march=rv64gcv) compiles in 3.06s, and for x86-64 target it takes 1.71s. It turns out that the main issue is unrolling of loop in multiplySpec function, that has extractelements with non-constant index:
```
for.body9.i:                                      ; preds = %for.body9.i, %for.cond6.preheader.i
  %indvars.iv.i92 = phi i64 [ 0, %for.cond6.preheader.i ], [ %indvars.iv.next.i93, %for.body9.i ]
  %Elt.033.i = phi double [ 0.000000e+00, %for.cond6.preheader.i ], [ %80, %for.body9.i ]
  %77 = mul nuw nsw i64 %indvars.iv.i92, 25
  %78 = add nuw nsw i64 %77, %indvars.iv39.i91
  %matrixext.i = extractelement <475 x double> %62, i64 %78
  %79 = add nuw nsw i64 %indvars.iv.i92, %74
  %matrixext13.i = extractelement <209 x double> %73, i64 %79
  %80 = tail call double @llvm.fmuladd.f64(double %matrixext.i, double %matrixext13.i, double %Elt.033.i)
  %indvars.iv.next.i93 = add nuw nsw i64 %indvars.iv.i92, 1
  %exitcond.not.i94 = icmp eq i64 %indvars.iv.next.i93, 19
  br i1 %exitcond.not.i94, label %for.cond.cleanup8.i, label %for.body9.i, !llvm.loop !21
```

When RVV is supported, extractelement/insertelement with non-constant index can be lowered quite efficiently with vslidedown/vslideup; otherwise it's lowered via stack memory operations, i.e. for extractelement each vector element is stored on stack and then the needed element is loaded back; for insertelement is stores all vector elements, rewrites the required element value and then loads vector back. Currently the cost of such expensive operation is estimated as zero, so loop unroll processes the loop more aggresively. The proper estimation of cost (in a way like in X86 target) prohibits unrolling of this loop and fixes compilation time (2.77s on my machine).


>From af9d71a5271ba72bebb6d67ac622b9bb59f4c474 Mon Sep 17 00:00:00 2001
From: Sergey Kachkov <sergey.kachkov at syntacore.com>
Date: Mon, 25 Sep 2023 16:13:47 +0300
Subject: [PATCH 1/2] [RISCV][CostModel] Add pre-commit test

---
 .../CostModel/RISCV/extractelement.ll         | 114 ++++++++++++++++++
 .../Analysis/CostModel/RISCV/insertelement.ll | 114 ++++++++++++++++++
 2 files changed, 228 insertions(+)
 create mode 100644 llvm/test/Analysis/CostModel/RISCV/extractelement.ll
 create mode 100644 llvm/test/Analysis/CostModel/RISCV/insertelement.ll

diff --git a/llvm/test/Analysis/CostModel/RISCV/extractelement.ll b/llvm/test/Analysis/CostModel/RISCV/extractelement.ll
new file mode 100644
index 000000000000000..b613d4d8348f430
--- /dev/null
+++ b/llvm/test/Analysis/CostModel/RISCV/extractelement.ll
@@ -0,0 +1,114 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 3
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=riscv32 -mattr=+f,+d,+zfh < %s | FileCheck %s --check-prefixes=RV32
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=riscv64 -mattr=+f,+d,+zfh < %s | FileCheck %s --check-prefixes=RV64
+
+define void @extractelement_int(i32 %x) {
+; RV32-LABEL: 'extractelement_int'
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i8 = extractelement <2 x i8> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i8 = extractelement <4 x i8> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i8 = extractelement <8 x i8> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i8 = extractelement <16 x i8> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i16 = extractelement <2 x i16> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i16 = extractelement <4 x i16> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i16 = extractelement <8 x i16> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i16 = extractelement <16 x i16> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i32 = extractelement <2 x i32> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i32 = extractelement <4 x i32> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i32 = extractelement <8 x i32> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i32 = extractelement <16 x i32> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i64 = extractelement <2 x i64> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i64 = extractelement <4 x i64> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i64 = extractelement <8 x i64> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i64 = extractelement <16 x i64> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64-LABEL: 'extractelement_int'
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i8 = extractelement <2 x i8> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i8 = extractelement <4 x i8> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i8 = extractelement <8 x i8> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i8 = extractelement <16 x i8> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i16 = extractelement <2 x i16> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i16 = extractelement <4 x i16> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i16 = extractelement <8 x i16> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i16 = extractelement <16 x i16> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i32 = extractelement <2 x i32> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i32 = extractelement <4 x i32> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i32 = extractelement <8 x i32> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i32 = extractelement <16 x i32> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i64 = extractelement <2 x i64> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i64 = extractelement <4 x i64> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i64 = extractelement <8 x i64> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i64 = extractelement <16 x i64> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  %v2i8 = extractelement <2 x i8> undef, i32 %x
+  %v4i8 = extractelement <4 x i8> undef, i32 %x
+  %v8i8 = extractelement <8 x i8> undef, i32 %x
+  %v16i8 = extractelement <16 x i8> undef, i32 %x
+
+  %v2i16 = extractelement <2 x i16> undef, i32 %x
+  %v4i16 = extractelement <4 x i16> undef, i32 %x
+  %v8i16 = extractelement <8 x i16> undef, i32 %x
+  %v16i16 = extractelement <16 x i16> undef, i32 %x
+
+  %v2i32 = extractelement <2 x i32> undef, i32 %x
+  %v4i32 = extractelement <4 x i32> undef, i32 %x
+  %v8i32 = extractelement <8 x i32> undef, i32 %x
+  %v16i32 = extractelement <16 x i32> undef, i32 %x
+
+  %v2i64 = extractelement <2 x i64> undef, i32 %x
+  %v4i64 = extractelement <4 x i64> undef, i32 %x
+  %v8i64 = extractelement <8 x i64> undef, i32 %x
+  %v16i64 = extractelement <16 x i64> undef, i32 %x
+
+  ret void
+}
+
+define void @extractelement_fp(i32 %x) {
+; RV32-LABEL: 'extractelement_fp'
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f16 = extractelement <2 x half> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f16 = extractelement <4 x half> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f16 = extractelement <8 x half> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f16 = extractelement <16 x half> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f32 = extractelement <2 x float> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f32 = extractelement <4 x float> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f32 = extractelement <8 x float> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f32 = extractelement <16 x float> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f64 = extractelement <2 x double> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f64 = extractelement <4 x double> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f64 = extractelement <8 x double> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f64 = extractelement <16 x double> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64-LABEL: 'extractelement_fp'
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f16 = extractelement <2 x half> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f16 = extractelement <4 x half> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f16 = extractelement <8 x half> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f16 = extractelement <16 x half> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f32 = extractelement <2 x float> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f32 = extractelement <4 x float> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f32 = extractelement <8 x float> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f32 = extractelement <16 x float> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f64 = extractelement <2 x double> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f64 = extractelement <4 x double> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f64 = extractelement <8 x double> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f64 = extractelement <16 x double> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  %v2f16 = extractelement <2 x half> undef, i32 %x
+  %v4f16 = extractelement <4 x half> undef, i32 %x
+  %v8f16 = extractelement <8 x half> undef, i32 %x
+  %v16f16 = extractelement <16 x half> undef, i32 %x
+
+  %v2f32 = extractelement <2 x float> undef, i32 %x
+  %v4f32 = extractelement <4 x float> undef, i32 %x
+  %v8f32 = extractelement <8 x float> undef, i32 %x
+  %v16f32 = extractelement <16 x float> undef, i32 %x
+
+  %v2f64 = extractelement <2 x double> undef, i32 %x
+  %v4f64 = extractelement <4 x double> undef, i32 %x
+  %v8f64 = extractelement <8 x double> undef, i32 %x
+  %v16f64 = extractelement <16 x double> undef, i32 %x
+
+  ret void
+}
diff --git a/llvm/test/Analysis/CostModel/RISCV/insertelement.ll b/llvm/test/Analysis/CostModel/RISCV/insertelement.ll
new file mode 100644
index 000000000000000..6fcc77f0cea2fb8
--- /dev/null
+++ b/llvm/test/Analysis/CostModel/RISCV/insertelement.ll
@@ -0,0 +1,114 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 3
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=riscv32 -mattr=+f,+d,+zfh < %s | FileCheck %s --check-prefixes=RV32
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=riscv64 -mattr=+f,+d,+zfh < %s | FileCheck %s --check-prefixes=RV64
+
+define void @insertelement_int(i32 %x) {
+; RV32-LABEL: 'insertelement_int'
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i8 = insertelement <2 x i8> undef, i8 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i8 = insertelement <4 x i8> undef, i8 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i8 = insertelement <8 x i8> undef, i8 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i8 = insertelement <16 x i8> undef, i8 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i16 = insertelement <2 x i16> undef, i16 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i16 = insertelement <4 x i16> undef, i16 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i16 = insertelement <8 x i16> undef, i16 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i16 = insertelement <16 x i16> undef, i16 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i32 = insertelement <2 x i32> undef, i32 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i32 = insertelement <4 x i32> undef, i32 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i32 = insertelement <8 x i32> undef, i32 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i32 = insertelement <16 x i32> undef, i32 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i64 = insertelement <2 x i64> undef, i64 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i64 = insertelement <4 x i64> undef, i64 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i64 = insertelement <8 x i64> undef, i64 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i64 = insertelement <16 x i64> undef, i64 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64-LABEL: 'insertelement_int'
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i8 = insertelement <2 x i8> undef, i8 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i8 = insertelement <4 x i8> undef, i8 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i8 = insertelement <8 x i8> undef, i8 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i8 = insertelement <16 x i8> undef, i8 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i16 = insertelement <2 x i16> undef, i16 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i16 = insertelement <4 x i16> undef, i16 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i16 = insertelement <8 x i16> undef, i16 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i16 = insertelement <16 x i16> undef, i16 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i32 = insertelement <2 x i32> undef, i32 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i32 = insertelement <4 x i32> undef, i32 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i32 = insertelement <8 x i32> undef, i32 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i32 = insertelement <16 x i32> undef, i32 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i64 = insertelement <2 x i64> undef, i64 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i64 = insertelement <4 x i64> undef, i64 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i64 = insertelement <8 x i64> undef, i64 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i64 = insertelement <16 x i64> undef, i64 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  %v2i8 = insertelement <2 x i8> undef, i8 undef, i32 %x
+  %v4i8 = insertelement <4 x i8> undef, i8 undef, i32 %x
+  %v8i8 = insertelement <8 x i8> undef, i8 undef, i32 %x
+  %v16i8 = insertelement <16 x i8> undef, i8 undef, i32 %x
+
+  %v2i16 = insertelement <2 x i16> undef, i16 undef, i32 %x
+  %v4i16 = insertelement <4 x i16> undef, i16 undef, i32 %x
+  %v8i16 = insertelement <8 x i16> undef, i16 undef, i32 %x
+  %v16i16 = insertelement <16 x i16> undef, i16 undef, i32 %x
+
+  %v2i32 = insertelement <2 x i32> undef, i32 undef, i32 %x
+  %v4i32 = insertelement <4 x i32> undef, i32 undef, i32 %x
+  %v8i32 = insertelement <8 x i32> undef, i32 undef, i32 %x
+  %v16i32 = insertelement <16 x i32> undef, i32 undef, i32 %x
+
+  %v2i64 = insertelement <2 x i64> undef, i64 undef, i32 %x
+  %v4i64 = insertelement <4 x i64> undef, i64 undef, i32 %x
+  %v8i64 = insertelement <8 x i64> undef, i64 undef, i32 %x
+  %v16i64 = insertelement <16 x i64> undef, i64 undef, i32 %x
+
+  ret void
+}
+
+define void @insertelement_fp(i32 %x) {
+; RV32-LABEL: 'insertelement_fp'
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f16 = insertelement <2 x half> undef, half undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f16 = insertelement <4 x half> undef, half undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f16 = insertelement <8 x half> undef, half undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f16 = insertelement <16 x half> undef, half undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f32 = insertelement <2 x float> undef, float undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f32 = insertelement <4 x float> undef, float undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f32 = insertelement <8 x float> undef, float undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f32 = insertelement <16 x float> undef, float undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f64 = insertelement <2 x double> undef, double undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f64 = insertelement <4 x double> undef, double undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f64 = insertelement <8 x double> undef, double undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f64 = insertelement <16 x double> undef, double undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64-LABEL: 'insertelement_fp'
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f16 = insertelement <2 x half> undef, half undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f16 = insertelement <4 x half> undef, half undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f16 = insertelement <8 x half> undef, half undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f16 = insertelement <16 x half> undef, half undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f32 = insertelement <2 x float> undef, float undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f32 = insertelement <4 x float> undef, float undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f32 = insertelement <8 x float> undef, float undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f32 = insertelement <16 x float> undef, float undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f64 = insertelement <2 x double> undef, double undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f64 = insertelement <4 x double> undef, double undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f64 = insertelement <8 x double> undef, double undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f64 = insertelement <16 x double> undef, double undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  %v2f16 = insertelement <2 x half> undef, half undef, i32 %x
+  %v4f16 = insertelement <4 x half> undef, half undef, i32 %x
+  %v8f16 = insertelement <8 x half> undef, half undef, i32 %x
+  %v16f16 = insertelement <16 x half> undef, half undef, i32 %x
+
+  %v2f32 = insertelement <2 x float> undef, float undef, i32 %x
+  %v4f32 = insertelement <4 x float> undef, float undef, i32 %x
+  %v8f32 = insertelement <8 x float> undef, float undef, i32 %x
+  %v16f32 = insertelement <16 x float> undef, float undef, i32 %x
+
+  %v2f64 = insertelement <2 x double> undef, double undef, i32 %x
+  %v4f64 = insertelement <4 x double> undef, double undef, i32 %x
+  %v8f64 = insertelement <8 x double> undef, double undef, i32 %x
+  %v16f64 = insertelement <16 x double> undef, double undef, i32 %x
+
+  ret void
+}

>From a863a5dba8ce95be7f64f5429ed9629ee91d2b17 Mon Sep 17 00:00:00 2001
From: Sergey Kachkov <sergey.kachkov at syntacore.com>
Date: Mon, 25 Sep 2023 15:26:09 +0300
Subject: [PATCH 2/2] [RISCV][CostModel] Calculate cost of
 Extract/InsertElement with non-constant index when vector instructions are
 not available

When vector instructions are not available, Extract/InsertElement with N-element
vector types and non-constant indexes are lowered in this way:
ExtractElement: N stores of each element on stack, 1 load of required element
InsertElement: N stores of each element on stack, 1 store to rewrite required
element, N loads of each element back

This patch implements cost calculation of these operations to fix compilation
time of matrix-types-spec test.
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  18 +++
 .../CostModel/RISCV/extractelement.ll         | 112 +++++++++---------
 .../Analysis/CostModel/RISCV/insertelement.ll | 112 +++++++++---------
 3 files changed, 130 insertions(+), 112 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 4a34362a792f150..2df4ea77033f1c6 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1456,6 +1456,24 @@ InstructionCost RISCVTTIImpl::getVectorInstrCost(unsigned Opcode, Type *Val,
       Opcode != Instruction::InsertElement)
     return BaseT::getVectorInstrCost(Opcode, Val, CostKind, Index, Op0, Op1);
 
+  // Extract/InsertElement with non-constant index is very costly without
+  // V instructions; estimate cost of loads/stores sequence via the stack:
+  // ExtractElement cost: store vector to stack, load scalar;
+  // InsertElement cost: store vector to stack, store scalar, load vector.
+  if (Index == -1U && !ST->hasVInstructions()) {
+    auto *VecTy = cast<VectorType>(Val);
+    auto *ElemTy = VecTy->getElementType();
+    auto NumElems = VecTy->getElementCount().getKnownMinValue();
+    auto Align = DL.getPrefTypeAlign(ElemTy);
+    InstructionCost LoadCost =
+        getMemoryOpCost(Instruction::Load, ElemTy, Align, 0, CostKind);
+    InstructionCost StoreCost =
+        getMemoryOpCost(Instruction::Store, ElemTy, Align, 0, CostKind);
+    return Opcode == Instruction::ExtractElement
+               ? StoreCost * NumElems + LoadCost
+               : (StoreCost + LoadCost) * NumElems + StoreCost;
+  }
+
   // Legalize the type.
   std::pair<InstructionCost, MVT> LT = getTypeLegalizationCost(Val);
 
diff --git a/llvm/test/Analysis/CostModel/RISCV/extractelement.ll b/llvm/test/Analysis/CostModel/RISCV/extractelement.ll
index b613d4d8348f430..8f9275f717f1176 100644
--- a/llvm/test/Analysis/CostModel/RISCV/extractelement.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/extractelement.ll
@@ -4,41 +4,41 @@
 
 define void @extractelement_int(i32 %x) {
 ; RV32-LABEL: 'extractelement_int'
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i8 = extractelement <2 x i8> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i8 = extractelement <4 x i8> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i8 = extractelement <8 x i8> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i8 = extractelement <16 x i8> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i16 = extractelement <2 x i16> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i16 = extractelement <4 x i16> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i16 = extractelement <8 x i16> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i16 = extractelement <16 x i16> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i32 = extractelement <2 x i32> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i32 = extractelement <4 x i32> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i32 = extractelement <8 x i32> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i32 = extractelement <16 x i32> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i64 = extractelement <2 x i64> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i64 = extractelement <4 x i64> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i64 = extractelement <8 x i64> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i64 = extractelement <16 x i64> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2i8 = extractelement <2 x i8> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4i8 = extractelement <4 x i8> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8i8 = extractelement <8 x i8> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16i8 = extractelement <16 x i8> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2i16 = extractelement <2 x i16> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4i16 = extractelement <4 x i16> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8i16 = extractelement <8 x i16> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16i16 = extractelement <16 x i16> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2i32 = extractelement <2 x i32> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4i32 = extractelement <4 x i32> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8i32 = extractelement <8 x i32> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16i32 = extractelement <16 x i32> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %v2i64 = extractelement <2 x i64> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %v4i64 = extractelement <4 x i64> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 18 for instruction: %v8i64 = extractelement <8 x i64> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 34 for instruction: %v16i64 = extractelement <16 x i64> undef, i32 %x
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; RV64-LABEL: 'extractelement_int'
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i8 = extractelement <2 x i8> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i8 = extractelement <4 x i8> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i8 = extractelement <8 x i8> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i8 = extractelement <16 x i8> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i16 = extractelement <2 x i16> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i16 = extractelement <4 x i16> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i16 = extractelement <8 x i16> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i16 = extractelement <16 x i16> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i32 = extractelement <2 x i32> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i32 = extractelement <4 x i32> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i32 = extractelement <8 x i32> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i32 = extractelement <16 x i32> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i64 = extractelement <2 x i64> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i64 = extractelement <4 x i64> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i64 = extractelement <8 x i64> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i64 = extractelement <16 x i64> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2i8 = extractelement <2 x i8> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4i8 = extractelement <4 x i8> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8i8 = extractelement <8 x i8> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16i8 = extractelement <16 x i8> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2i16 = extractelement <2 x i16> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4i16 = extractelement <4 x i16> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8i16 = extractelement <8 x i16> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16i16 = extractelement <16 x i16> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2i32 = extractelement <2 x i32> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4i32 = extractelement <4 x i32> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8i32 = extractelement <8 x i32> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16i32 = extractelement <16 x i32> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2i64 = extractelement <2 x i64> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4i64 = extractelement <4 x i64> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8i64 = extractelement <8 x i64> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16i64 = extractelement <16 x i64> undef, i32 %x
 ; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   %v2i8 = extractelement <2 x i8> undef, i32 %x
@@ -66,33 +66,33 @@ define void @extractelement_int(i32 %x) {
 
 define void @extractelement_fp(i32 %x) {
 ; RV32-LABEL: 'extractelement_fp'
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f16 = extractelement <2 x half> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f16 = extractelement <4 x half> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f16 = extractelement <8 x half> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f16 = extractelement <16 x half> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f32 = extractelement <2 x float> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f32 = extractelement <4 x float> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f32 = extractelement <8 x float> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f32 = extractelement <16 x float> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f64 = extractelement <2 x double> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f64 = extractelement <4 x double> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f64 = extractelement <8 x double> undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f64 = extractelement <16 x double> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2f16 = extractelement <2 x half> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4f16 = extractelement <4 x half> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8f16 = extractelement <8 x half> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16f16 = extractelement <16 x half> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2f32 = extractelement <2 x float> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4f32 = extractelement <4 x float> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8f32 = extractelement <8 x float> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16f32 = extractelement <16 x float> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2f64 = extractelement <2 x double> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4f64 = extractelement <4 x double> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8f64 = extractelement <8 x double> undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16f64 = extractelement <16 x double> undef, i32 %x
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; RV64-LABEL: 'extractelement_fp'
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f16 = extractelement <2 x half> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f16 = extractelement <4 x half> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f16 = extractelement <8 x half> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f16 = extractelement <16 x half> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f32 = extractelement <2 x float> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f32 = extractelement <4 x float> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f32 = extractelement <8 x float> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f32 = extractelement <16 x float> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f64 = extractelement <2 x double> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f64 = extractelement <4 x double> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f64 = extractelement <8 x double> undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f64 = extractelement <16 x double> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2f16 = extractelement <2 x half> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4f16 = extractelement <4 x half> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8f16 = extractelement <8 x half> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16f16 = extractelement <16 x half> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2f32 = extractelement <2 x float> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4f32 = extractelement <4 x float> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8f32 = extractelement <8 x float> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16f32 = extractelement <16 x float> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v2f64 = extractelement <2 x double> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v4f64 = extractelement <4 x double> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v8f64 = extractelement <8 x double> undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v16f64 = extractelement <16 x double> undef, i32 %x
 ; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   %v2f16 = extractelement <2 x half> undef, i32 %x
diff --git a/llvm/test/Analysis/CostModel/RISCV/insertelement.ll b/llvm/test/Analysis/CostModel/RISCV/insertelement.ll
index 6fcc77f0cea2fb8..4cdeaf26e19e4d1 100644
--- a/llvm/test/Analysis/CostModel/RISCV/insertelement.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/insertelement.ll
@@ -4,41 +4,41 @@
 
 define void @insertelement_int(i32 %x) {
 ; RV32-LABEL: 'insertelement_int'
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i8 = insertelement <2 x i8> undef, i8 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i8 = insertelement <4 x i8> undef, i8 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i8 = insertelement <8 x i8> undef, i8 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i8 = insertelement <16 x i8> undef, i8 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i16 = insertelement <2 x i16> undef, i16 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i16 = insertelement <4 x i16> undef, i16 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i16 = insertelement <8 x i16> undef, i16 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i16 = insertelement <16 x i16> undef, i16 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i32 = insertelement <2 x i32> undef, i32 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i32 = insertelement <4 x i32> undef, i32 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i32 = insertelement <8 x i32> undef, i32 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i32 = insertelement <16 x i32> undef, i32 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i64 = insertelement <2 x i64> undef, i64 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i64 = insertelement <4 x i64> undef, i64 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i64 = insertelement <8 x i64> undef, i64 undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i64 = insertelement <16 x i64> undef, i64 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2i8 = insertelement <2 x i8> undef, i8 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4i8 = insertelement <4 x i8> undef, i8 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8i8 = insertelement <8 x i8> undef, i8 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16i8 = insertelement <16 x i8> undef, i8 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2i16 = insertelement <2 x i16> undef, i16 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4i16 = insertelement <4 x i16> undef, i16 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8i16 = insertelement <8 x i16> undef, i16 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16i16 = insertelement <16 x i16> undef, i16 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2i32 = insertelement <2 x i32> undef, i32 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4i32 = insertelement <4 x i32> undef, i32 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8i32 = insertelement <8 x i32> undef, i32 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16i32 = insertelement <16 x i32> undef, i32 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %v2i64 = insertelement <2 x i64> undef, i64 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 18 for instruction: %v4i64 = insertelement <4 x i64> undef, i64 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 34 for instruction: %v8i64 = insertelement <8 x i64> undef, i64 undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 66 for instruction: %v16i64 = insertelement <16 x i64> undef, i64 undef, i32 %x
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; RV64-LABEL: 'insertelement_int'
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i8 = insertelement <2 x i8> undef, i8 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i8 = insertelement <4 x i8> undef, i8 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i8 = insertelement <8 x i8> undef, i8 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i8 = insertelement <16 x i8> undef, i8 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i16 = insertelement <2 x i16> undef, i16 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i16 = insertelement <4 x i16> undef, i16 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i16 = insertelement <8 x i16> undef, i16 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i16 = insertelement <16 x i16> undef, i16 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i32 = insertelement <2 x i32> undef, i32 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i32 = insertelement <4 x i32> undef, i32 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i32 = insertelement <8 x i32> undef, i32 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i32 = insertelement <16 x i32> undef, i32 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2i64 = insertelement <2 x i64> undef, i64 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4i64 = insertelement <4 x i64> undef, i64 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8i64 = insertelement <8 x i64> undef, i64 undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16i64 = insertelement <16 x i64> undef, i64 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2i8 = insertelement <2 x i8> undef, i8 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4i8 = insertelement <4 x i8> undef, i8 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8i8 = insertelement <8 x i8> undef, i8 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16i8 = insertelement <16 x i8> undef, i8 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2i16 = insertelement <2 x i16> undef, i16 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4i16 = insertelement <4 x i16> undef, i16 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8i16 = insertelement <8 x i16> undef, i16 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16i16 = insertelement <16 x i16> undef, i16 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2i32 = insertelement <2 x i32> undef, i32 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4i32 = insertelement <4 x i32> undef, i32 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8i32 = insertelement <8 x i32> undef, i32 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16i32 = insertelement <16 x i32> undef, i32 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2i64 = insertelement <2 x i64> undef, i64 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4i64 = insertelement <4 x i64> undef, i64 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8i64 = insertelement <8 x i64> undef, i64 undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16i64 = insertelement <16 x i64> undef, i64 undef, i32 %x
 ; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   %v2i8 = insertelement <2 x i8> undef, i8 undef, i32 %x
@@ -66,33 +66,33 @@ define void @insertelement_int(i32 %x) {
 
 define void @insertelement_fp(i32 %x) {
 ; RV32-LABEL: 'insertelement_fp'
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f16 = insertelement <2 x half> undef, half undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f16 = insertelement <4 x half> undef, half undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f16 = insertelement <8 x half> undef, half undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f16 = insertelement <16 x half> undef, half undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f32 = insertelement <2 x float> undef, float undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f32 = insertelement <4 x float> undef, float undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f32 = insertelement <8 x float> undef, float undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f32 = insertelement <16 x float> undef, float undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f64 = insertelement <2 x double> undef, double undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f64 = insertelement <4 x double> undef, double undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f64 = insertelement <8 x double> undef, double undef, i32 %x
-; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f64 = insertelement <16 x double> undef, double undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2f16 = insertelement <2 x half> undef, half undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4f16 = insertelement <4 x half> undef, half undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8f16 = insertelement <8 x half> undef, half undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16f16 = insertelement <16 x half> undef, half undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2f32 = insertelement <2 x float> undef, float undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4f32 = insertelement <4 x float> undef, float undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8f32 = insertelement <8 x float> undef, float undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16f32 = insertelement <16 x float> undef, float undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2f64 = insertelement <2 x double> undef, double undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4f64 = insertelement <4 x double> undef, double undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8f64 = insertelement <8 x double> undef, double undef, i32 %x
+; RV32-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16f64 = insertelement <16 x double> undef, double undef, i32 %x
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; RV64-LABEL: 'insertelement_fp'
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f16 = insertelement <2 x half> undef, half undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f16 = insertelement <4 x half> undef, half undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f16 = insertelement <8 x half> undef, half undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f16 = insertelement <16 x half> undef, half undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f32 = insertelement <2 x float> undef, float undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f32 = insertelement <4 x float> undef, float undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f32 = insertelement <8 x float> undef, float undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f32 = insertelement <16 x float> undef, float undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v2f64 = insertelement <2 x double> undef, double undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v4f64 = insertelement <4 x double> undef, double undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v8f64 = insertelement <8 x double> undef, double undef, i32 %x
-; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %v16f64 = insertelement <16 x double> undef, double undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2f16 = insertelement <2 x half> undef, half undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4f16 = insertelement <4 x half> undef, half undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8f16 = insertelement <8 x half> undef, half undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16f16 = insertelement <16 x half> undef, half undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2f32 = insertelement <2 x float> undef, float undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4f32 = insertelement <4 x float> undef, float undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8f32 = insertelement <8 x float> undef, float undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16f32 = insertelement <16 x float> undef, float undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v2f64 = insertelement <2 x double> undef, double undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v4f64 = insertelement <4 x double> undef, double undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v8f64 = insertelement <8 x double> undef, double undef, i32 %x
+; RV64-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %v16f64 = insertelement <16 x double> undef, double undef, i32 %x
 ; RV64-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   %v2f16 = insertelement <2 x half> undef, half undef, i32 %x



More information about the llvm-commits mailing list