[llvm] [AArch64] Increase scatter overhead on Neoverse-V2 (PR #101296)

Madhur Amilkanthwar via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 31 00:22:14 PDT 2024


https://github.com/madhur13490 created https://github.com/llvm/llvm-project/pull/101296

This patch increases the scatter overhead on Neoverse-V2 to 13. This benefits the s128 kernel from the TSVC_2 test suite.
SPEC 2017, RAJAPerf, and Spatter are unaffected by this patch.
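
For readers who want to see how the overhead feeds into the reported cost, here is a minimal standalone sketch of the arithmetic in getGatherScatterOpCost. It is not the LLVM code: the enums, the function names, and the default overhead of 10 for other CPUs are assumptions made for illustration, as are the inputs (legalization factor 1, scalar memory-op cost 1, at most 4 elements), which were chosen because they reproduce the cost of 52 that the new test checks. Note that, with the change as written, a non-store opcode on Neoverse-V2 keeps the initial multiplier of 1.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the LLVM opcode and subtarget enums.
enum class Opcode { Load, Store };
enum class ProcFamily { Generic, NeoverseV2 };

// Assumption: overhead applied on other CPUs (stands in for the value
// returned by getSVEGatherScatterOverhead in the real code).
constexpr unsigned DefaultOverhead = 10;

// Mirrors the selection logic in the patch: 13 for scatters on
// Neoverse-V2, the initial value 1 for anything else on Neoverse-V2,
// and the default overhead on every other CPU.
constexpr unsigned overheadFor(ProcFamily Family, Opcode Op) {
  return Family != ProcFamily::NeoverseV2 ? DefaultOverhead
         : Op == Opcode::Store            ? 13u
                                          : 1u;
}

// Mirrors the final expression:
//   LT.first * MemOpCost * getMaxNumElements(LegalVF)
// where MemOpCost has already been multiplied by the overhead.
constexpr std::uint64_t gatherScatterCost(ProcFamily Family, Opcode Op,
                                          std::uint64_t LegalizationFactor,
                                          std::uint64_t ScalarMemOpCost,
                                          std::uint64_t MaxNumElements) {
  return LegalizationFactor * (ScalarMemOpCost * overheadFor(Family, Op)) *
         MaxNumElements;
}

// 1 * (1 * 13) * 4 == 52, matching the CHECK line in the new test.
static_assert(gatherScatterCost(ProcFamily::NeoverseV2, Opcode::Store,
                                1, 1, 4) == 52,
              "scatter cost under the assumed inputs");
```

Under the same assumed inputs, a scatter on a CPU other than Neoverse-V2 would come out at 1 * (1 * 10) * 4 = 40 in this sketch.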

From 81494cd0eed790dc2586329e08526426af32b436 Mon Sep 17 00:00:00 2001
From: Madhur Amilkanthwar <madhura at nvidia.com>
Date: Thu, 25 Jul 2024 18:30:21 +0530
Subject: [PATCH] [AArch64] Increase scatter overhead on Neoverse-V2

This patch increases the scatter overhead on Neoverse-V2 to 13.
This benefits the s128 kernel from the TSVC_2 test suite.
SPEC 2017, RAJAPerf, and Spatter are unaffected by this patch.
---
 llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | 10 +++++++++-
 .../Transforms/LoopVectorize/AArch64/scatter-cost.ll   | 10 ++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/LoopVectorize/AArch64/scatter-cost.ll

diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 79c0e45e3aa5b..08169d31aca67 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -3424,7 +3424,15 @@ InstructionCost AArch64TTIImpl::getGatherScatterOpCost(
   // Add on an overhead cost for using gathers/scatters.
   // TODO: At the moment this is applied unilaterally for all CPUs, but at some
   // point we may want a per-CPU overhead.
-  MemOpCost *= getSVEGatherScatterOverhead(Opcode);
+  unsigned OpCost = 1;
+  if (ST->getProcFamily() == AArch64Subtarget::NeoverseV2) {
+    // Specialize overhead of scatter instructions on Neoverse-V2.
+    if (Opcode == Instruction::Store)
+      OpCost = 13;
+  } else {
+    OpCost = getSVEGatherScatterOverhead(Opcode);
+  }
+  MemOpCost *= OpCost;
   return LT.first * MemOpCost * getMaxNumElements(LegalVF);
 }
 
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/scatter-cost.ll b/llvm/test/Transforms/LoopVectorize/AArch64/scatter-cost.ll
new file mode 100644
index 0000000000000..5c890bab1e37b
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/scatter-cost.ll
@@ -0,0 +1,10 @@
+; RUN: opt -mtriple aarch64 -mcpu=neoverse-v2 -passes="print<cost-model>" -disable-output < %s 2>&1 | FileCheck %s
+; CHECK: Cost Model: Found an estimated cost of 52 for instruction: call void @llvm.masked.scatter.nxv4f32
+
+define void @masked_scatter_nxv8f32_i64(<vscale x 4 x float> %data, <vscale x 4 x ptr> %b, <vscale x 4 x i64> %V) {
+;%1 = add nsw <vscale x 4 x i64> %V, shufflevector (<vscale x 4 x i64> insertelement (<vscale x 4 x i64> poison, i64 1, i64 0), <vscale x 4 x i64> poison, <vscale x 4 x i32> zeroinitializer)
+;%ptrs = getelementptr float, ptr %b, <vscale x 4 x i64> %1
+call void @llvm.masked.scatter.nxv4f32.nxv4p0(<vscale x 4 x float> %data, <vscale x 4 x ptr> %b, i32 4, <vscale x 4 x i1> shufflevector (<vscale x 4 x i1> insertelement (<vscale x 4 x i1> poison, i1 true, i64 0), <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer))
+ret void
+}
+


