[llvm] 2e7c7d2 - [RISCV][TTI] Adjust cost for extract/insert element when VLEN is known (#108595)

via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 17 08:43:43 PDT 2024


Author: Philip Reames
Date: 2024-09-17T08:43:40-07:00
New Revision: 2e7c7d20d55be51f907d87a2298660d73a1cc190

URL: https://github.com/llvm/llvm-project/commit/2e7c7d20d55be51f907d87a2298660d73a1cc190
DIFF: https://github.com/llvm/llvm-project/commit/2e7c7d20d55be51f907d87a2298660d73a1cc190.diff

LOG: [RISCV][TTI] Adjust cost for extract/insert element when VLEN is known (#108595)

If we know an exact VLEN, then the index is effectively modulo the
number of elements in a single vector register. Our lowering performs
this subvector optimization.

A bit of context. This change may look a bit strange on it's own given
we are currently *not* scaling insert/extract cost by LMUL. This costing
decision needs to change, but is very intertwined with SLP
profitability, and is thus a bit hard to adjust. I'm hoping that
https://github.com/llvm/llvm-project/pull/108419 will let me start to
untangle this. This change is basically a case of finding a subset I can
tackle before other dependencies are in place which does no real harm in
the meantime.

Added: 
    

Modified: 
    llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
    llvm/test/Analysis/CostModel/RISCV/rvv-extractelement.ll
    llvm/test/Analysis/CostModel/RISCV/rvv-insertelement.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 0e065111385299..5d280b44630aef 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1756,6 +1756,14 @@ InstructionCost RISCVTTIImpl::getVectorInstrCost(unsigned Opcode, Type *Val,
       Index = Index % Width;
     }
 
+    // If exact VLEN is known, we will insert/extract into the appropriate
+    // subvector with no additional subvector insert/extract cost.
+    if (auto VLEN = ST->getRealVLen()) {
+      unsigned EltSize = LT.second.getScalarSizeInBits();
+      unsigned M1Max = *VLEN / EltSize;
+      Index = Index % M1Max;
+    }
+
     // We could extract/insert the first element without vslidedown/vslideup.
     if (Index == 0)
       SlideCost = 0;

diff  --git a/llvm/test/Analysis/CostModel/RISCV/rvv-extractelement.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-extractelement.ll
index 4a9e30888cdd1f..618b7bc8945a50 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-extractelement.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-extractelement.ll
@@ -1501,3 +1501,55 @@ define void @extractelement_int_nonpoweroftwo(i32 %x) {
 
   ret void
 }
+
+define void @extractelement_vls(i32 %x) vscale_range(2,2) {
+; RV32V-LABEL: 'extractelement_vls'
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = extractelement <32 x i32> undef, i32 0
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = extractelement <32 x i32> undef, i32 4
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = extractelement <32 x i32> undef, i32 5
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = extractelement <32 x i32> undef, i32 8
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = extractelement <32 x i32> undef, i32 9
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = extractelement <32 x i32> undef, i32 11
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = extractelement <32 x i32> undef, i32 12
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64V-LABEL: 'extractelement_vls'
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = extractelement <32 x i32> undef, i32 0
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = extractelement <32 x i32> undef, i32 4
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = extractelement <32 x i32> undef, i32 5
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = extractelement <32 x i32> undef, i32 8
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = extractelement <32 x i32> undef, i32 9
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = extractelement <32 x i32> undef, i32 11
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = extractelement <32 x i32> undef, i32 12
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV32ZVE64X-LABEL: 'extractelement_vls'
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = extractelement <32 x i32> undef, i32 0
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = extractelement <32 x i32> undef, i32 4
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = extractelement <32 x i32> undef, i32 5
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = extractelement <32 x i32> undef, i32 8
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = extractelement <32 x i32> undef, i32 9
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = extractelement <32 x i32> undef, i32 11
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = extractelement <32 x i32> undef, i32 12
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64ZVE64X-LABEL: 'extractelement_vls'
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = extractelement <32 x i32> undef, i32 0
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = extractelement <32 x i32> undef, i32 4
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = extractelement <32 x i32> undef, i32 5
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = extractelement <32 x i32> undef, i32 8
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = extractelement <32 x i32> undef, i32 9
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = extractelement <32 x i32> undef, i32 11
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = extractelement <32 x i32> undef, i32 12
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  %v32i32_0 = extractelement <32 x i32> undef, i32 0
+  %v32i32_4 = extractelement <32 x i32> undef, i32 4
+  %v32i32_5 = extractelement <32 x i32> undef, i32 5
+  %v32i32_8 = extractelement <32 x i32> undef, i32 8
+  %v32i32_9 = extractelement <32 x i32> undef, i32 9
+  %v32i32_11 = extractelement <32 x i32> undef, i32 11
+  %v32i32_12 = extractelement <32 x i32> undef, i32 12
+
+  ret void
+}

diff  --git a/llvm/test/Analysis/CostModel/RISCV/rvv-insertelement.ll b/llvm/test/Analysis/CostModel/RISCV/rvv-insertelement.ll
index 0616e0919b9d9a..c240a75066b10c 100644
--- a/llvm/test/Analysis/CostModel/RISCV/rvv-insertelement.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/rvv-insertelement.ll
@@ -1491,3 +1491,55 @@ define void @insertelement_int_nonpoweroftwo(i32 %x) {
 
   ret void
 }
+
+define void @insertelement_vls(i32 %x) vscale_range(2,2) {
+; RV32V-LABEL: 'insertelement_vls'
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+; RV32V-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64V-LABEL: 'insertelement_vls'
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV32ZVE64X-LABEL: 'insertelement_vls'
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+; RV32ZVE64X-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64ZVE64X-LABEL: 'insertelement_vls'
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+; RV64ZVE64X-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  %v32i32_0 = insertelement <32 x i32> undef, i32 undef, i32 0
+  %v32i32_4 = insertelement <32 x i32> undef, i32 undef, i32 4
+  %v32i32_5 = insertelement <32 x i32> undef, i32 undef, i32 5
+  %v32i32_8 = insertelement <32 x i32> undef, i32 undef, i32 8
+  %v32i32_9 = insertelement <32 x i32> undef, i32 undef, i32 9
+  %v32i32_11 = insertelement <32 x i32> undef, i32 undef, i32 11
+  %v32i32_12 = insertelement <32 x i32> undef, i32 undef, i32 12
+
+  ret void
+}


        


More information about the llvm-commits mailing list