[llvm] [AArch64][LoopVectorize] Enable tail-folding on neoverse-v2 (PR #135357)
Cullen Rhodes via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 11 05:30:05 PDT 2025
https://github.com/c-rhodes created https://github.com/llvm/llvm-project/pull/135357
This patch enables tail-folding of simple loops by default when targeting the neoverse-v2 CPU. The same default was enabled for neoverse-v1 in c7dbe326dff81.
For SPEC2017 built with "-Ofast -mcpu=neoverse-v2 -flto", this gives some small wins:
549.fotonik3d_r: ~3.2%
525.x264_r: ~2.7%
554.roms_r: ~1.2%
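
For readers unfamiliar with the term, the following is a minimal scalar sketch of what tail-folding changes in a vectorized loop. It is not the code LLVM generates: the fixed width `VF` and both function names are hypothetical and chosen only for illustration, and the inner `lane` loops stand in for single vector instructions. On SVE the per-lane predicate would typically come from a `whilelo`-style instruction (`llvm.get.active.lane.mask` at the IR level) rather than an explicit comparison per lane.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed vector width, for illustration only.
// Real SVE code is vector-length agnostic.
constexpr std::size_t VF = 4;

// Without tail-folding: a main loop over full VF-wide chunks,
// followed by a scalar epilogue for the remaining n % VF elements.
std::vector<int> addWithScalarEpilogue(const std::vector<int>& a,
                                       const std::vector<int>& b) {
  assert(a.size() == b.size());
  const std::size_t n = a.size();
  std::vector<int> out(n);
  std::size_t i = 0;
  for (; i + VF <= n; i += VF)              // "vector" body, all lanes active
    for (std::size_t lane = 0; lane < VF; ++lane)
      out[i + lane] = a[i + lane] + b[i + lane];
  for (; i < n; ++i)                        // scalar remainder loop
    out[i] = a[i] + b[i];
  return out;
}

// With tail-folding: a single loop whose last iteration runs under a
// per-lane predicate, so no separate remainder loop is needed.
std::vector<int> addWithTailFolding(const std::vector<int>& a,
                                    const std::vector<int>& b) {
  assert(a.size() == b.size());
  const std::size_t n = a.size();
  std::vector<int> out(n);
  for (std::size_t i = 0; i < n; i += VF)
    for (std::size_t lane = 0; lane < VF; ++lane) {
      const bool active = i + lane < n;     // predicate: lane is in bounds
      if (active)
        out[i + lane] = a[i + lane] + b[i + lane];
    }
  return out;
}
```

Both functions produce the same result for any input length; the difference is only in loop structure, which is what the per-CPU default in this patch selects.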
From 99fb0fd2b6ee088c15865ea0e2bfcaf005f06378 Mon Sep 17 00:00:00 2001
From: Cullen Rhodes <cullen.rhodes at arm.com>
Date: Fri, 11 Apr 2025 12:05:13 +0000
Subject: [PATCH] [AArch64][LoopVectorize] Enable tail-folding on neoverse-v2
This patch enables tail-folding of simple loops by default when
targeting the neoverse-v2 CPU. This was done for neoverse-v1 in
c7dbe326dff81.
For SPEC2017 with "-Ofast -mcpu=neoverse-v2 -flto" this gives some small
wins:
549.fotonik3d_r: ~3.2%
525.x264_r: ~2.7%
554.roms_r: ~1.2%
---
llvm/lib/Target/AArch64/AArch64Subtarget.cpp | 2 ++
.../Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll | 2 ++
2 files changed, 4 insertions(+)
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
index 7b4ded6322098..adee9899f7fd8 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
@@ -268,6 +268,8 @@ void AArch64Subtarget::initializeProperties(bool HasMinSize) {
     MaxBytesForLoopAlignment = 16;
     break;
   case NeoverseV2:
+    DefaultSVETFOpts = TailFoldingOpts::Simple;
+    LLVM_FALLTHROUGH;
   case NeoverseV3:
     EpilogueVectorizationMinVF = 8;
     MaxInterleaveFactor = 4;
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll b/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
index 7dd0f0c0ad8e0..d2b8dd9c2be48 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
@@ -11,6 +11,8 @@
; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -sve-tail-folding=default -mcpu=neoverse-v1 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v1 -sve-tail-folding=default | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v1 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
+; Simple tail-folding is also enabled by default on neoverse-v2. Use same check prefix.
+; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v2 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
target triple = "aarch64-unknown-linux-gnu"