[llvm] [AArch64] Set MaxInterleaving to 4 for Neoverse V2 (PR #100385)
Sjoerd Meijer via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 24 07:10:08 PDT 2024
https://github.com/sjoerdmeijer created https://github.com/llvm/llvm-project/pull/100385
This helps loop-based workloads and benchmarks quite a lot; SPEC INT is unaffected.
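For readers unfamiliar with the term, the following is a rough scalar analogue of what an interleave count of 4 means for a reduction loop. It is not part of the patch: the function names are made up for illustration, and the real LoopVectorize pass interleaves vector iterations and chooses the count through its cost model, with MaxInterleaveFactor acting only as an upper bound.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Baseline reduction: a single accumulator, so every add depends on the
// result of the previous one.
std::uint64_t sumScalar(const std::vector<std::uint32_t> &V) {
  std::uint64_t Sum = 0;
  for (std::uint32_t X : V)
    Sum += X;
  return Sum;
}

// Hand-written analogue of interleaving by 4 (hypothetical helper, not LLVM
// code): four independent accumulators give the core four separate
// dependency chains to overlap.
std::uint64_t sumInterleaved4(const std::vector<std::uint32_t> &V) {
  std::uint64_t S0 = 0, S1 = 0, S2 = 0, S3 = 0;
  const std::size_t N = V.size();
  std::size_t I = 0;
  for (; I + 4 <= N; I += 4) {
    S0 += V[I];
    S1 += V[I + 1];
    S2 += V[I + 2];
    S3 += V[I + 3];
  }
  // Combine the partial sums, then handle the leftover elements.
  std::uint64_t Sum = (S0 + S1) + (S2 + S3);
  for (; I < N; ++I)
    Sum += V[I];
  return Sum;
}
```

Both functions return the same value for integer inputs; only the shape of the dependency chain differs, which is where a wide core can benefit.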
From d1b2fd0a6f42f4e7e2e7e3d730fb8e36cc358c4f Mon Sep 17 00:00:00 2001
From: Sjoerd Meijer <smeijer at nvidia.com>
Date: Wed, 24 Jul 2024 17:49:43 +0530
Subject: [PATCH] [AArch64] Set MaxInterleaving to 4 for Neoverse V2
This helps loop-based benchmarks quite a lot; SPEC INT is unaffected.
---
llvm/lib/Target/AArch64/AArch64Subtarget.cpp | 4 +++-
.../LoopVectorize/AArch64/interleaving-load-store.ll | 1 +
.../LoopVectorize/AArch64/interleaving-reduction.ll | 1 +
3 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
index 32a355fe38f1c..280083ae824cd 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
@@ -233,9 +233,11 @@ void AArch64Subtarget::initializeProperties(bool HasMinSize) {
     PrefLoopAlignment = Align(32);
     MaxBytesForLoopAlignment = 16;
     break;
+  case NeoverseV2:
+    MaxInterleaveFactor = 4;
+    LLVM_FALLTHROUGH;
   case NeoverseN2:
   case NeoverseN3:
-  case NeoverseV2:
   case NeoverseV3:
     PrefFunctionAlignment = Align(16);
     PrefLoopAlignment = Align(32);
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/interleaving-load-store.ll b/llvm/test/Transforms/LoopVectorize/AArch64/interleaving-load-store.ll
index 0e54bd15e5ea5..6a47e0ea34a50 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/interleaving-load-store.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/interleaving-load-store.ll
@@ -5,6 +5,7 @@
; RUN: opt -passes=loop-vectorize -mtriple=arm64-apple-macos -mcpu=apple-a14 -S %s | FileCheck --check-prefix=INTERLEAVE-4 %s
; RUN: opt -passes=loop-vectorize -mtriple=arm64-apple-macos -mcpu=apple-a15 -S %s | FileCheck --check-prefix=INTERLEAVE-4 %s
; RUN: opt -passes=loop-vectorize -mtriple=arm64-apple-macos -mcpu=apple-a16 -S %s | FileCheck --check-prefix=INTERLEAVE-4 %s
+; RUN: opt -passes=loop-vectorize -mtriple=arm64 -mcpu=neoverse-v2 -S %s | FileCheck --check-prefix=INTERLEAVE-4 %s
; Tests for selecting interleave counts for loops with loads and stores.
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/interleaving-reduction.ll b/llvm/test/Transforms/LoopVectorize/AArch64/interleaving-reduction.ll
index 72d528d8748ba..9211cdfaebd15 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/interleaving-reduction.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/interleaving-reduction.ll
@@ -5,6 +5,7 @@
; RUN: opt -passes=loop-vectorize -mtriple=arm64-apple-macos -mcpu=apple-a14 -S %s | FileCheck --check-prefix=INTERLEAVE-4 %s
; RUN: opt -passes=loop-vectorize -mtriple=arm64-apple-macos -mcpu=apple-a15 -S %s | FileCheck --check-prefix=INTERLEAVE-4 %s
; RUN: opt -passes=loop-vectorize -mtriple=arm64-apple-macos -mcpu=apple-a16 -S %s | FileCheck --check-prefix=INTERLEAVE-4 %s
+; RUN: opt -passes=loop-vectorize -mtriple=arm64 -mcpu=neoverse-v2 -S %s | FileCheck --check-prefix=INTERLEAVE-4 %s
; Tests for selecting the interleave count for loops with reductions.
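The control flow in the AArch64Subtarget.cpp hunk can be modelled in isolation as below. This is a simplified sketch, not the LLVM source: the Cpu enum, the Tuning struct, tuneFor and the default values are hypothetical stand-ins, only the two alignment fields visible in the hunk are modelled, and the standard [[fallthrough]] attribute replaces LLVM_FALLTHROUGH so the example builds without LLVM headers.

```cpp
// Hypothetical stand-ins for the subtarget's CPU kind and tuning fields.
enum class Cpu { NeoverseN2, NeoverseN3, NeoverseV2, NeoverseV3 };

struct Tuning {
  unsigned MaxInterleaveFactor = 2;   // assumed default, for illustration only
  unsigned PrefFunctionAlignment = 1; // placeholder default
  unsigned PrefLoopAlignment = 1;     // placeholder default
};

Tuning tuneFor(Cpu C) {
  Tuning T;
  switch (C) {
  case Cpu::NeoverseV2:
    // V2-only override, then share the settings of the cases below.
    T.MaxInterleaveFactor = 4;
    [[fallthrough]];
  case Cpu::NeoverseN2:
  case Cpu::NeoverseN3:
  case Cpu::NeoverseV3:
    T.PrefFunctionAlignment = 16;
    T.PrefLoopAlignment = 32;
    break;
  }
  return T;
}
```

Moving the NeoverseV2 label above the shared group and falling through keeps the alignment settings identical to the other Neoverse cores while changing only the interleave limit for V2.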