[llvm] [RISCV][TTI] Adjust cost for extract/insert element when VLEN is known (PR #108595)
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 13 09:04:17 PDT 2024
https://github.com/preames created https://github.com/llvm/llvm-project/pull/108595
If we know an exact VLEN, then the index is effectively modulo the number of elements in a single vector register. Our lowering performs this subvector optimization.
A bit of context. This change may look a bit strange on it's own given we are currently *not* scaling insert/extract cost by LMUL. This costing decision needs to change, but is very intertwined with SLP profitability, and is thus a bit hard to adjust. I'm hoping that https://github.com/llvm/llvm-project/pull/108419 will let me start to untangle this. This change is basically a case of finding a subset I can tackle before other dependencies are in place which does no real harm in the meantime.
>From fb58d0b118307a21f768d01affaa064f8f06fab5 Mon Sep 17 00:00:00 2001
From: Philip Reames <preames at rivosinc.com>
Date: Fri, 13 Sep 2024 08:55:03 -0700
Subject: [PATCH] [RISCV][TTI] Adjust cost for extract/insert element when VLEN
is known
If we know an exact VLEN, then the index is effectively modulo the
number of elements in a single vector register. Our lowering performs
this subvector optimization.
A bit of context. This change may look a bit strange on it's own
given we are currently *not* scaling insert/extract cost by LMUL.
This costing decision needs to change, but is very intertwined with
SLP profitability, and is thus a bit hard to adjust. I'm hoping
that https://github.com/llvm/llvm-project/pull/108419 will let me
start to untangle this. This change is basically a case of finding
a subset I can tackle before other dependencies are in place which
does no real harm in the meantime.
---
.../Target/RISCV/RISCVTargetTransformInfo.cpp | 8 +++
.../CostModel/RISCV/rvv-extractelement.ll | 52 +++++++++++++++++++
.../CostModel/RISCV/rvv-insertelement.ll | 52 +++++++++++++++++++
3 files changed, 112 insertions(+)
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 2b5e7c47279284..470ca63ba5c875 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1756,6 +1756,14 @@ InstructionCost RISCVTTIImpl::getVectorInstrCost(unsigned Opcode, Type *Val,
Index = Index % Width;
}
+ // If exact VLEN is known, we will insert/extract into the appropriate
+ // subvector with no additional subvector insert/extract cost.
+ if (auto VLEN = ST->getRealVLen()) {
+ unsigned EltSize = LT.second.getScalarSizeInBits();
+ unsigned M1Max = *VLEN / EltSize;
+ Index = Index % M1Max;
+ }
+
// We could extract/insert the first element without vslidedown/vslideup.
if (Index == 0)
SlideCost = 0;
diff --git a/llvm/test/Analysis/CostModel/RISCV/rvv-extractelement.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-extractelement.ll
index 4a9e30888cdd1f..618b7bc8945a50 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-extractelement.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-extractelement.ll
@@ -1501,3 +1501,55 @@ define void @extractelement_int_nonpoweroftwo(i32 %x) {
ret void
}
+
+define void @extractelement_vls(i32 %x) vscale_range(2,2) {
+; RV32V-LABEL: 'extractelement_vls'
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = extractelement <32 x i32> undef, i32 0
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = extractelement <32 x i32> undef, i32 4
+; RV32V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = extractelement <32 x i32> undef, i32 5
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = extractelement <32 x i32> undef, i32 8
+; RV32V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = extractelement <32 x i32> undef, i32 9
+; RV32V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = extractelement <32 x i32> undef, i32 11
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = extractelement <32 x i32> undef, i32 12
+; RV32V-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64V-LABEL: 'extractelement_vls'
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = extractelement <32 x i32> undef, i32 0
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = extractelement <32 x i32> undef, i32 4
+; RV64V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = extractelement <32 x i32> undef, i32 5
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = extractelement <32 x i32> undef, i32 8
+; RV64V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = extractelement <32 x i32> undef, i32 9
+; RV64V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = extractelement <32 x i32> undef, i32 11
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = extractelement <32 x i32> undef, i32 12
+; RV64V-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV32ZVE64X-LABEL: 'extractelement_vls'
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = extractelement <32 x i32> undef, i32 0
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = extractelement <32 x i32> undef, i32 4
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = extractelement <32 x i32> undef, i32 5
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = extractelement <32 x i32> undef, i32 8
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = extractelement <32 x i32> undef, i32 9
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = extractelement <32 x i32> undef, i32 11
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = extractelement <32 x i32> undef, i32 12
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64ZVE64X-LABEL: 'extractelement_vls'
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = extractelement <32 x i32> undef, i32 0
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = extractelement <32 x i32> undef, i32 4
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = extractelement <32 x i32> undef, i32 5
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = extractelement <32 x i32> undef, i32 8
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = extractelement <32 x i32> undef, i32 9
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = extractelement <32 x i32> undef, i32 11
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = extractelement <32 x i32> undef, i32 12
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+ %v32i32_0 = extractelement <32 x i32> undef, i32 0
+ %v32i32_4 = extractelement <32 x i32> undef, i32 4
+ %v32i32_5 = extractelement <32 x i32> undef, i32 5
+ %v32i32_8 = extractelement <32 x i32> undef, i32 8
+ %v32i32_9 = extractelement <32 x i32> undef, i32 9
+ %v32i32_11 = extractelement <32 x i32> undef, i32 11
+ %v32i32_12 = extractelement <32 x i32> undef, i32 12
+
+ ret void
+}
diff --git a/llvm/test/Analysis/CostModel/RISCV/rvv-insertelement.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-insertelement.ll
index 0616e0919b9d9a..c240a75066b10c 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-insertelement.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-insertelement.ll
@@ -1491,3 +1491,55 @@ define void @insertelement_int_nonpoweroftwo(i32 %x) {
ret void
}
+
+define void @insertelement_vls(i32 %x) vscale_range(2,2) {
+; RV32V-LABEL: 'insertelement_vls'
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+; RV32V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+; RV32V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+; RV32V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+; RV32V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+; RV32V-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64V-LABEL: 'insertelement_vls'
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+; RV64V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+; RV64V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+; RV64V-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+; RV64V-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+; RV64V-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV32ZVE64X-LABEL: 'insertelement_vls'
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+; RV32ZVE64X-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64ZVE64X-LABEL: 'insertelement_vls'
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+; RV64ZVE64X-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+ %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+ %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+ %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+ %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+ %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+ %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+ %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+
+ ret void
+}
More information about the llvm-commits
mailing list