[llvm] [RISCV] Account for factor in interleave memory op costs (PR #111511)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 8 02:49:18 PDT 2024
https://github.com/lukel97 created https://github.com/llvm/llvm-project/pull/111511
Currently we cost an interleaved memory op as if it were a load/store of the widened vector type.
However this doesn't take into account that we'll most likely need to perform at least Factor uops because we're writing/reading from Factor number of registers.
E.g. Today an i8 VF=2 Factor=8 interleave is costed as a single LMUL=1 op with +zvl128b, because the widened type is <16 x i8>.
This changes it to be calculated as <2 x i8> * Factor=8, i.e. 8 LMUL=1 ops.
Thankfully the FIXME about illegal vectors seems to have been fixed in #100436, and even then I think the LT.first should have been multiplied, not added.
Note we still have a quirk where the loop vectorizer will happily emit interleaved accesses for what could be strided accesses, because the costs are break-even in LoopVectorizationCostModel::setCostBasedWideningDecision:
void f(int8_t* a, int n) {
for (int i = 0; i < n; i++) {
a[i * 2] += 1;
}
}
vsetvli t1, zero, e8, m2, ta, ma
vlseg2e8.v v24, (t0)
vadd.vi v24, v24, 1
vsse8.v v24, (a6), a5
I think we may need to either adjust the cost or add a hook to get the loop vectorizer to s
>From fba93ab9791b6189332c6f74622b96d73170a4f3 Mon Sep 17 00:00:00 2001
From: Luke Lau <luke at igalia.com>
Date: Tue, 8 Oct 2024 17:35:31 +0800
Subject: [PATCH] [RISCV] Account for factor in interleave memory op costs
Currently we cost an interleaved memory op as if it were a load/store of the widened vector type.
However this doesn't take into account that we'll most likely need to perform at least Factor uops because we're writing/reading from Factor number of registers.
E.g. Today an i8 VF=2 Factor=8 interleave is costed as a single LMUL=1 op with +zvl128b, because the widened type is <16 x i8>.
This changes it to be calculated as <2 x i8> * Factor=8, i.e. 8 LMUL=1 ops.
Thankfully the FIXME about illegal vectors seems to have been fixed in #100436, and even then I think the LT.first should have been multiplied, not added.
Note we still have a quirk where the loop vectorizer will happily emit interleaved accesses for what could be strided accesses, because the costs are break-even in LoopVectorizationCostModel::setCostBasedWideningDecision:
void f(int8_t* a, int n) {
for (int i = 0; i < n; i++) {
a[i * 2] += 1;
}
}
vsetvli t1, zero, e8, m2, ta, ma
vlseg2e8.v v24, (t0)
vadd.vi v24, v24, 1
vsse8.v v24, (a6), a5
I think we may need to either adjust the cost or add a hook to get the loop vectorizer to s
---
.../Target/RISCV/RISCVTargetTransformInfo.cpp | 12 +-
.../LoopVectorize/RISCV/interleaved-cost.ll | 112 +++++++++---------
2 files changed, 58 insertions(+), 66 deletions(-)
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index a61461681f79ed..608298903cace0 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -691,19 +691,11 @@ InstructionCost RISCVTTIImpl::getInterleavedMemoryOpCost(
auto *SubVecTy =
VectorType::get(VTy->getElementType(),
VTy->getElementCount().divideCoefficientBy(Factor));
-
if (VTy->getElementCount().isKnownMultipleOf(Factor) &&
TLI->isLegalInterleavedAccessType(SubVecTy, Factor, Alignment,
AddressSpace, DL)) {
- // FIXME: We use the memory op cost of the *legalized* type here,
- // because it's getMemoryOpCost returns a really expensive cost for
- // types like <6 x i8>, which show up when doing interleaves of
- // Factor=3 etc. Should the memory op cost of these be cheaper?
- auto *LegalVTy = VectorType::get(VTy->getElementType(),
- LT.second.getVectorElementCount());
- InstructionCost LegalMemCost = getMemoryOpCost(
- Opcode, LegalVTy, Alignment, AddressSpace, CostKind);
- return LT.first + LegalMemCost;
+ return Factor * getMemoryOpCost(Opcode, SubVecTy, Alignment,
+ AddressSpace, CostKind);
}
}
}
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/interleaved-cost.ll b/llvm/test/Transforms/LoopVectorize/RISCV/interleaved-cost.ll
index fa346b4eac02d4..34285b26c13875 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/interleaved-cost.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/interleaved-cost.ll
@@ -12,20 +12,20 @@ entry:
; CHECK: Cost of 2 for VF 4: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
; CHECK: Cost of 2 for VF 8: INTERLEAVE-GROUP with factor 2 at %l0, ir<%p0>
; CHECK: Cost of 2 for VF 8: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
-; CHECK: Cost of 3 for VF 16: INTERLEAVE-GROUP with factor 2 at %l0, ir<%p0>
-; CHECK: Cost of 3 for VF 16: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
-; CHECK: Cost of 5 for VF 32: INTERLEAVE-GROUP with factor 2 at %l0, ir<%p0>
-; CHECK: Cost of 5 for VF 32: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
+; CHECK: Cost of 2 for VF 16: INTERLEAVE-GROUP with factor 2 at %l0, ir<%p0>
+; CHECK: Cost of 2 for VF 16: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
+; CHECK: Cost of 4 for VF 32: INTERLEAVE-GROUP with factor 2 at %l0, ir<%p0>
+; CHECK: Cost of 4 for VF 32: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
; CHECK: Cost of 2 for VF vscale x 1: INTERLEAVE-GROUP with factor 2 at %l0, ir<%p0>
; CHECK: Cost of 2 for VF vscale x 1: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
; CHECK: Cost of 2 for VF vscale x 2: INTERLEAVE-GROUP with factor 2 at %l0, ir<%p0>
; CHECK: Cost of 2 for VF vscale x 2: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
; CHECK: Cost of 2 for VF vscale x 4: INTERLEAVE-GROUP with factor 2 at %l0, ir<%p0>
; CHECK: Cost of 2 for VF vscale x 4: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
-; CHECK: Cost of 3 for VF vscale x 8: INTERLEAVE-GROUP with factor 2 at %l0, ir<%p0>
-; CHECK: Cost of 3 for VF vscale x 8: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
-; CHECK: Cost of 5 for VF vscale x 16: INTERLEAVE-GROUP with factor 2 at %l0, ir<%p0>
-; CHECK: Cost of 5 for VF vscale x 16: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
+; CHECK: Cost of 2 for VF vscale x 8: INTERLEAVE-GROUP with factor 2 at %l0, ir<%p0>
+; CHECK: Cost of 2 for VF vscale x 8: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
+; CHECK: Cost of 4 for VF vscale x 16: INTERLEAVE-GROUP with factor 2 at %l0, ir<%p0>
+; CHECK: Cost of 4 for VF vscale x 16: INTERLEAVE-GROUP with factor 2 at <badref>, ir<%p0>
for.body:
%i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
%p0 = getelementptr inbounds %i8.2, ptr %data, i64 %i, i32 0
@@ -49,16 +49,16 @@ define void @i8_factor_3(ptr %data, i64 %n) {
entry:
br label %for.body
; CHECK-LABEL: Checking a loop in 'i8_factor_3'
-; CHECK: Cost of 2 for VF 2: INTERLEAVE-GROUP with factor 3 at %l0, ir<%p0>
-; CHECK: Cost of 2 for VF 2: INTERLEAVE-GROUP with factor 3 at <badref>, ir<%p0>
-; CHECK: Cost of 2 for VF 4: INTERLEAVE-GROUP with factor 3 at %l0, ir<%p0>
-; CHECK: Cost of 2 for VF 4: INTERLEAVE-GROUP with factor 3 at <badref>, ir<%p0>
+; CHECK: Cost of 3 for VF 2: INTERLEAVE-GROUP with factor 3 at %l0, ir<%p0>
+; CHECK: Cost of 3 for VF 2: INTERLEAVE-GROUP with factor 3 at <badref>, ir<%p0>
+; CHECK: Cost of 3 for VF 4: INTERLEAVE-GROUP with factor 3 at %l0, ir<%p0>
+; CHECK: Cost of 3 for VF 4: INTERLEAVE-GROUP with factor 3 at <badref>, ir<%p0>
; CHECK: Cost of 3 for VF 8: INTERLEAVE-GROUP with factor 3 at %l0, ir<%p0>
; CHECK: Cost of 3 for VF 8: INTERLEAVE-GROUP with factor 3 at <badref>, ir<%p0>
-; CHECK: Cost of 5 for VF 16: INTERLEAVE-GROUP with factor 3 at %l0, ir<%p0>
-; CHECK: Cost of 5 for VF 16: INTERLEAVE-GROUP with factor 3 at <badref>, ir<%p0>
-; CHECK: Cost of 9 for VF 32: INTERLEAVE-GROUP with factor 3 at %l0, ir<%p0>
-; CHECK: Cost of 9 for VF 32: INTERLEAVE-GROUP with factor 3 at <badref>, ir<%p0>
+; CHECK: Cost of 3 for VF 16: INTERLEAVE-GROUP with factor 3 at %l0, ir<%p0>
+; CHECK: Cost of 3 for VF 16: INTERLEAVE-GROUP with factor 3 at <badref>, ir<%p0>
+; CHECK: Cost of 6 for VF 32: INTERLEAVE-GROUP with factor 3 at %l0, ir<%p0>
+; CHECK: Cost of 6 for VF 32: INTERLEAVE-GROUP with factor 3 at <badref>, ir<%p0>
for.body:
%i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
%p0 = getelementptr inbounds %i8.3, ptr %data, i64 %i, i32 0
@@ -86,16 +86,16 @@ define void @i8_factor_4(ptr %data, i64 %n) {
entry:
br label %for.body
; CHECK-LABEL: Checking a loop in 'i8_factor_4'
-; CHECK: Cost of 2 for VF 2: INTERLEAVE-GROUP with factor 4 at %l0, ir<%p0>
-; CHECK: Cost of 2 for VF 2: INTERLEAVE-GROUP with factor 4 at <badref>, ir<%p0>
-; CHECK: Cost of 2 for VF 4: INTERLEAVE-GROUP with factor 4 at %l0, ir<%p0>
-; CHECK: Cost of 2 for VF 4: INTERLEAVE-GROUP with factor 4 at <badref>, ir<%p0>
-; CHECK: Cost of 3 for VF 8: INTERLEAVE-GROUP with factor 4 at %l0, ir<%p0>
-; CHECK: Cost of 3 for VF 8: INTERLEAVE-GROUP with factor 4 at <badref>, ir<%p0>
-; CHECK: Cost of 5 for VF 16: INTERLEAVE-GROUP with factor 4 at %l0, ir<%p0>
-; CHECK: Cost of 5 for VF 16: INTERLEAVE-GROUP with factor 4 at <badref>, ir<%p0>
-; CHECK: Cost of 9 for VF 32: INTERLEAVE-GROUP with factor 4 at %l0, ir<%p0>
-; CHECK: Cost of 9 for VF 32: INTERLEAVE-GROUP with factor 4 at <badref>, ir<%p0>
+; CHECK: Cost of 4 for VF 2: INTERLEAVE-GROUP with factor 4 at %l0, ir<%p0>
+; CHECK: Cost of 4 for VF 2: INTERLEAVE-GROUP with factor 4 at <badref>, ir<%p0>
+; CHECK: Cost of 4 for VF 4: INTERLEAVE-GROUP with factor 4 at %l0, ir<%p0>
+; CHECK: Cost of 4 for VF 4: INTERLEAVE-GROUP with factor 4 at <badref>, ir<%p0>
+; CHECK: Cost of 4 for VF 8: INTERLEAVE-GROUP with factor 4 at %l0, ir<%p0>
+; CHECK: Cost of 4 for VF 8: INTERLEAVE-GROUP with factor 4 at <badref>, ir<%p0>
+; CHECK: Cost of 4 for VF 16: INTERLEAVE-GROUP with factor 4 at %l0, ir<%p0>
+; CHECK: Cost of 4 for VF 16: INTERLEAVE-GROUP with factor 4 at <badref>, ir<%p0>
+; CHECK: Cost of 8 for VF 32: INTERLEAVE-GROUP with factor 4 at %l0, ir<%p0>
+; CHECK: Cost of 8 for VF 32: INTERLEAVE-GROUP with factor 4 at <badref>, ir<%p0>
for.body:
%i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
%p0 = getelementptr inbounds %i8.4, ptr %data, i64 %i, i32 0
@@ -127,14 +127,14 @@ define void @i8_factor_5(ptr %data, i64 %n) {
entry:
br label %for.body
; CHECK-LABEL: Checking a loop in 'i8_factor_5'
-; CHECK: Cost of 2 for VF 2: INTERLEAVE-GROUP with factor 5 at %l0, ir<%p0>
-; CHECK: Cost of 2 for VF 2: INTERLEAVE-GROUP with factor 5 at <badref>, ir<%p0>
-; CHECK: Cost of 3 for VF 4: INTERLEAVE-GROUP with factor 5 at %l0, ir<%p0>
-; CHECK: Cost of 3 for VF 4: INTERLEAVE-GROUP with factor 5 at <badref>, ir<%p0>
+; CHECK: Cost of 5 for VF 2: INTERLEAVE-GROUP with factor 5 at %l0, ir<%p0>
+; CHECK: Cost of 5 for VF 2: INTERLEAVE-GROUP with factor 5 at <badref>, ir<%p0>
+; CHECK: Cost of 5 for VF 4: INTERLEAVE-GROUP with factor 5 at %l0, ir<%p0>
+; CHECK: Cost of 5 for VF 4: INTERLEAVE-GROUP with factor 5 at <badref>, ir<%p0>
; CHECK: Cost of 5 for VF 8: INTERLEAVE-GROUP with factor 5 at %l0, ir<%p0>
; CHECK: Cost of 5 for VF 8: INTERLEAVE-GROUP with factor 5 at <badref>, ir<%p0>
-; CHECK: Cost of 9 for VF 16: INTERLEAVE-GROUP with factor 5 at %l0, ir<%p0>
-; CHECK: Cost of 9 for VF 16: INTERLEAVE-GROUP with factor 5 at <badref>, ir<%p0>
+; CHECK: Cost of 5 for VF 16: INTERLEAVE-GROUP with factor 5 at %l0, ir<%p0>
+; CHECK: Cost of 5 for VF 16: INTERLEAVE-GROUP with factor 5 at <badref>, ir<%p0>
for.body:
%i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
%p0 = getelementptr inbounds %i8.5, ptr %data, i64 %i, i32 0
@@ -170,14 +170,14 @@ define void @i8_factor_6(ptr %data, i64 %n) {
entry:
br label %for.body
; CHECK-LABEL: Checking a loop in 'i8_factor_6'
-; CHECK: Cost of 2 for VF 2: INTERLEAVE-GROUP with factor 6 at %l0, ir<%p0>
-; CHECK: Cost of 2 for VF 2: INTERLEAVE-GROUP with factor 6 at <badref>, ir<%p0>
-; CHECK: Cost of 3 for VF 4: INTERLEAVE-GROUP with factor 6 at %l0, ir<%p0>
-; CHECK: Cost of 3 for VF 4: INTERLEAVE-GROUP with factor 6 at <badref>, ir<%p0>
-; CHECK: Cost of 5 for VF 8: INTERLEAVE-GROUP with factor 6 at %l0, ir<%p0>
-; CHECK: Cost of 5 for VF 8: INTERLEAVE-GROUP with factor 6 at <badref>, ir<%p0>
-; CHECK: Cost of 9 for VF 16: INTERLEAVE-GROUP with factor 6 at %l0, ir<%p0>
-; CHECK: Cost of 9 for VF 16: INTERLEAVE-GROUP with factor 6 at <badref>, ir<%p0>
+; CHECK: Cost of 6 for VF 2: INTERLEAVE-GROUP with factor 6 at %l0, ir<%p0>
+; CHECK: Cost of 6 for VF 2: INTERLEAVE-GROUP with factor 6 at <badref>, ir<%p0>
+; CHECK: Cost of 6 for VF 4: INTERLEAVE-GROUP with factor 6 at %l0, ir<%p0>
+; CHECK: Cost of 6 for VF 4: INTERLEAVE-GROUP with factor 6 at <badref>, ir<%p0>
+; CHECK: Cost of 6 for VF 8: INTERLEAVE-GROUP with factor 6 at %l0, ir<%p0>
+; CHECK: Cost of 6 for VF 8: INTERLEAVE-GROUP with factor 6 at <badref>, ir<%p0>
+; CHECK: Cost of 6 for VF 16: INTERLEAVE-GROUP with factor 6 at %l0, ir<%p0>
+; CHECK: Cost of 6 for VF 16: INTERLEAVE-GROUP with factor 6 at <badref>, ir<%p0>
for.body:
%i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
%p0 = getelementptr inbounds %i8.6, ptr %data, i64 %i, i32 0
@@ -217,14 +217,14 @@ define void @i8_factor_7(ptr %data, i64 %n) {
entry:
br label %for.body
; CHECK-LABEL: Checking a loop in 'i8_factor_7'
-; CHECK: Cost of 2 for VF 2: INTERLEAVE-GROUP with factor 7 at %l0, ir<%p0>
-; CHECK: Cost of 2 for VF 2: INTERLEAVE-GROUP with factor 7 at <badref>, ir<%p0>
-; CHECK: Cost of 3 for VF 4: INTERLEAVE-GROUP with factor 7 at %l0, ir<%p0>
-; CHECK: Cost of 3 for VF 4: INTERLEAVE-GROUP with factor 7 at <badref>, ir<%p0>
-; CHECK: Cost of 5 for VF 8: INTERLEAVE-GROUP with factor 7 at %l0, ir<%p0>
-; CHECK: Cost of 5 for VF 8: INTERLEAVE-GROUP with factor 7 at <badref>, ir<%p0>
-; CHECK: Cost of 9 for VF 16: INTERLEAVE-GROUP with factor 7 at %l0, ir<%p0>
-; CHECK: Cost of 9 for VF 16: INTERLEAVE-GROUP with factor 7 at <badref>, ir<%p0>
+; CHECK: Cost of 7 for VF 2: INTERLEAVE-GROUP with factor 7 at %l0, ir<%p0>
+; CHECK: Cost of 7 for VF 2: INTERLEAVE-GROUP with factor 7 at <badref>, ir<%p0>
+; CHECK: Cost of 7 for VF 4: INTERLEAVE-GROUP with factor 7 at %l0, ir<%p0>
+; CHECK: Cost of 7 for VF 4: INTERLEAVE-GROUP with factor 7 at <badref>, ir<%p0>
+; CHECK: Cost of 7 for VF 8: INTERLEAVE-GROUP with factor 7 at %l0, ir<%p0>
+; CHECK: Cost of 7 for VF 8: INTERLEAVE-GROUP with factor 7 at <badref>, ir<%p0>
+; CHECK: Cost of 7 for VF 16: INTERLEAVE-GROUP with factor 7 at %l0, ir<%p0>
+; CHECK: Cost of 7 for VF 16: INTERLEAVE-GROUP with factor 7 at <badref>, ir<%p0>
for.body:
%i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
%p0 = getelementptr inbounds %i8.7, ptr %data, i64 %i, i32 0
@@ -268,14 +268,14 @@ define void @i8_factor_8(ptr %data, i64 %n) {
entry:
br label %for.body
; CHECK-LABEL: Checking a loop in 'i8_factor_8'
-; CHECK: Cost of 2 for VF 2: INTERLEAVE-GROUP with factor 8 at %l0, ir<%p0>
-; CHECK: Cost of 2 for VF 2: INTERLEAVE-GROUP with factor 8 at <badref>, ir<%p0>
-; CHECK: Cost of 3 for VF 4: INTERLEAVE-GROUP with factor 8 at %l0, ir<%p0>
-; CHECK: Cost of 3 for VF 4: INTERLEAVE-GROUP with factor 8 at <badref>, ir<%p0>
-; CHECK: Cost of 5 for VF 8: INTERLEAVE-GROUP with factor 8 at %l0, ir<%p0>
-; CHECK: Cost of 5 for VF 8: INTERLEAVE-GROUP with factor 8 at <badref>, ir<%p0>
-; CHECK: Cost of 9 for VF 16: INTERLEAVE-GROUP with factor 8 at %l0, ir<%p0>
-; CHECK: Cost of 9 for VF 16: INTERLEAVE-GROUP with factor 8 at <badref>, ir<%p0>
+; CHECK: Cost of 8 for VF 2: INTERLEAVE-GROUP with factor 8 at %l0, ir<%p0>
+; CHECK: Cost of 8 for VF 2: INTERLEAVE-GROUP with factor 8 at <badref>, ir<%p0>
+; CHECK: Cost of 8 for VF 4: INTERLEAVE-GROUP with factor 8 at %l0, ir<%p0>
+; CHECK: Cost of 8 for VF 4: INTERLEAVE-GROUP with factor 8 at <badref>, ir<%p0>
+; CHECK: Cost of 8 for VF 8: INTERLEAVE-GROUP with factor 8 at %l0, ir<%p0>
+; CHECK: Cost of 8 for VF 8: INTERLEAVE-GROUP with factor 8 at <badref>, ir<%p0>
+; CHECK: Cost of 8 for VF 16: INTERLEAVE-GROUP with factor 8 at %l0, ir<%p0>
+; CHECK: Cost of 8 for VF 16: INTERLEAVE-GROUP with factor 8 at <badref>, ir<%p0>
for.body:
%i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
%p0 = getelementptr inbounds %i8.8, ptr %data, i64 %i, i32 0
More information about the llvm-commits
mailing list