[llvm] [AArch64][LoopVectorize] Enable tail-folding on neoverse-v2 (PR #135357)

Cullen Rhodes via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 11 05:30:05 PDT 2025


https://github.com/c-rhodes created https://github.com/llvm/llvm-project/pull/135357

This patch enables tail-folding of simple loops by default when targeting the neoverse-v2 CPU. This was done for neoverse-v1 in c7dbe326dff81.

For SPEC2017 with "-Ofast -mcpu=neoverse-v2 -flto" this gives some small wins:

549.fotonik3d_r: ~3.2%
     525.x264_r: ~2.7%
     554.roms_r: ~1.2%
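
For readers unfamiliar with the term, here is a minimal scalar sketch of what tail-folding means conceptually. It is not LLVM's implementation and not part of this patch. The fixed width `VF` and the function names `add_with_scalar_tail` and `add_tail_folded` are hypothetical, chosen only for illustration. Real SVE code uses scalable vectors and hardware predicates rather than a per-lane `if`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed vector width, for illustration only. SVE vectors are
// scalable, so the real width is not a compile-time constant.
constexpr std::size_t VF = 4;

// Without tail-folding: a "vector" main loop over whole chunks of VF
// elements, followed by a scalar remainder loop for the last n % VF elements.
void add_with_scalar_tail(const std::vector<int> &a, const std::vector<int> &b,
                          std::vector<int> &out) {
  const std::size_t n = out.size();
  std::size_t i = 0;
  for (; i + VF <= n; i += VF)
    for (std::size_t lane = 0; lane < VF; ++lane)
      out[i + lane] = a[i + lane] + b[i + lane];
  for (; i < n; ++i) // scalar tail
    out[i] = a[i] + b[i];
}

// With tail-folding: a single loop. Each lane is guarded by a predicate that
// is false for lanes past the end, so no separate remainder loop is needed.
void add_tail_folded(const std::vector<int> &a, const std::vector<int> &b,
                     std::vector<int> &out) {
  const std::size_t n = out.size();
  for (std::size_t i = 0; i < n; i += VF)
    for (std::size_t lane = 0; lane < VF; ++lane) {
      const bool active = (i + lane) < n; // per-lane predicate
      if (active)
        out[i + lane] = a[i + lane] + b[i + lane];
    }
}
```

Both functions compute the same result for any length. The difference is only in loop structure: the second form trades the remainder loop for a predicate evaluated on every iteration.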

From 99fb0fd2b6ee088c15865ea0e2bfcaf005f06378 Mon Sep 17 00:00:00 2001
From: Cullen Rhodes <cullen.rhodes at arm.com>
Date: Fri, 11 Apr 2025 12:05:13 +0000
Subject: [PATCH] [AArch64][LoopVectorize] Enable tail-folding on neoverse-v2

This patch enables tail-folding of simple loops by default when
targeting the neoverse-v2 CPU. This was done for neoverse-v1 in
c7dbe326dff81.

For SPEC2017 with "-Ofast -mcpu=neoverse-v2 -flto" this gives some small
wins:

549.fotonik3d_r: ~3.2%
     525.x264_r: ~2.7%
     554.roms_r: ~1.2%
---
 llvm/lib/Target/AArch64/AArch64Subtarget.cpp                    | 2 ++
 .../Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
index 7b4ded6322098..adee9899f7fd8 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
@@ -268,6 +268,8 @@ void AArch64Subtarget::initializeProperties(bool HasMinSize) {
     MaxBytesForLoopAlignment = 16;
     break;
   case NeoverseV2:
+    DefaultSVETFOpts = TailFoldingOpts::Simple;
+    LLVM_FALLTHROUGH;
   case NeoverseV3:
     EpilogueVectorizationMinVF = 8;
     MaxInterleaveFactor = 4;
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll b/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
index 7dd0f0c0ad8e0..d2b8dd9c2be48 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
@@ -11,6 +11,8 @@
 ; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -sve-tail-folding=default -mcpu=neoverse-v1 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
 ; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v1 -sve-tail-folding=default | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
 ; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v1 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
+; Simple tail-folding is also enabled by default on neoverse-v2. Use same check prefix.
+; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v2 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
 
 target triple = "aarch64-unknown-linux-gnu"
 


