[llvm] [AArch64][LoopVectorize] Enable tail-folding on neoverse-v2 (PR #135357)

via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 11 05:30:36 PDT 2025


llvmbot wrote:



@llvm/pr-subscribers-llvm-transforms

Author: Cullen Rhodes (c-rhodes)

<details>
<summary>Changes</summary>

This patch enables tail-folding of simple loops by default when targeting the neoverse-v2 CPU. This was done for neoverse-v1 in c7dbe326dff81.
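
As a rough illustration of what tail-folding means for a simple loop (a sketch, not the vectorizer's actual output), the C++ below contrasts a vector body followed by a scalar remainder with a single predicated loop. The names `kLanes`, `add_with_scalar_tail` and `add_tail_folded` are hypothetical. A fixed lane count stands in for the scalable SVE vector length, and a boolean array stands in for an SVE predicate register.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed lane count standing in for the (scalable) SVE vector length.
constexpr std::size_t kLanes = 4;

// Without tail-folding: full-width "vector" iterations, then a scalar
// remainder loop for the last n % kLanes elements.
void add_with_scalar_tail(const std::vector<int>& a, const std::vector<int>& b,
                          std::vector<int>& out) {
  const std::size_t n = out.size();
  assert(a.size() == n && b.size() == n);
  std::size_t i = 0;
  for (; i + kLanes <= n; i += kLanes)
    for (std::size_t l = 0; l < kLanes; ++l)  // models one vector operation
      out[i + l] = a[i + l] + b[i + l];
  for (; i < n; ++i)                          // scalar tail
    out[i] = a[i] + b[i];
}

// With tail-folding: one loop only. Each iteration builds a lane mask
// (lane l is active while i + l < n) and the final iteration simply runs
// with some lanes switched off, so no scalar tail is needed.
void add_tail_folded(const std::vector<int>& a, const std::vector<int>& b,
                     std::vector<int>& out) {
  const std::size_t n = out.size();
  assert(a.size() == n && b.size() == n);
  for (std::size_t i = 0; i < n; i += kLanes) {
    std::array<bool, kLanes> active{};
    for (std::size_t l = 0; l < kLanes; ++l)
      active[l] = (i + l) < n;                // models predicate generation
    for (std::size_t l = 0; l < kLanes; ++l)  // models a predicated vector op
      if (active[l])
        out[i + l] = a[i + l] + b[i + l];
  }
}
```

Both functions compute the same result. The difference is that the folded form has no separate remainder loop, at the cost of generating a mask on every iteration.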

For SPEC2017 built with "-Ofast -mcpu=neoverse-v2 -flto", the patch gives some small wins:

549.fotonik3d_r: ~3.2%
     525.x264_r: ~2.7%
     554.roms_r: ~1.2%

---
Full diff: https://github.com/llvm/llvm-project/pull/135357.diff


2 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64Subtarget.cpp (+2) 
- (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll (+2) 


``````````diff
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
index 7b4ded6322098..adee9899f7fd8 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
@@ -268,6 +268,8 @@ void AArch64Subtarget::initializeProperties(bool HasMinSize) {
     MaxBytesForLoopAlignment = 16;
     break;
   case NeoverseV2:
+    DefaultSVETFOpts = TailFoldingOpts::Simple;
+    LLVM_FALLTHROUGH;
   case NeoverseV3:
     EpilogueVectorizationMinVF = 8;
     MaxInterleaveFactor = 4;
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll b/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
index 7dd0f0c0ad8e0..d2b8dd9c2be48 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
@@ -11,6 +11,8 @@
 ; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -sve-tail-folding=default -mcpu=neoverse-v1 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
 ; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v1 -sve-tail-folding=default | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
 ; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v1 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
+; Simple tail-folding is also enabled by default on neoverse-v2. Use same check prefix.
+; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v2 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
 
 target triple = "aarch64-unknown-linux-gnu"
 

``````````
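
The effect of the `LLVM_FALLTHROUGH` in the first hunk can be sketched in isolation. The types below are simplified stand-ins that only mirror the names in the diff, not LLVM's real definitions. `initTuning`, `Cpu` and `Tuning` are hypothetical, and the `Disabled` default and zero placeholders are assumptions made for the example.

```cpp
// Simplified stand-ins; the real types live in the AArch64 backend.
enum class TailFoldingOpts { Disabled, Simple };
enum class Cpu { NeoverseV2, NeoverseV3 };

struct Tuning {
  // Assumed default for this sketch: tail-folding off unless a CPU opts in.
  TailFoldingOpts DefaultSVETFOpts = TailFoldingOpts::Disabled;
  unsigned EpilogueVectorizationMinVF = 0;  // placeholder initial value
  unsigned MaxInterleaveFactor = 0;         // placeholder initial value
};

// Hypothetical helper mirroring the shape of the switch in the diff.
Tuning initTuning(Cpu cpu) {
  Tuning t;
  switch (cpu) {
  case Cpu::NeoverseV2:
    // Only neoverse-v2 opts in to simple tail-folding here...
    t.DefaultSVETFOpts = TailFoldingOpts::Simple;
    [[fallthrough]];  // ...and then shares the remaining tuning with V3.
  case Cpu::NeoverseV3:
    t.EpilogueVectorizationMinVF = 8;
    t.MaxInterleaveFactor = 4;
    break;
  }
  return t;
}
```

In this sketch `initTuning(Cpu::NeoverseV2)` ends up with `Simple` tail-folding plus the shared values, while `initTuning(Cpu::NeoverseV3)` gets only the shared values.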

</details>


https://github.com/llvm/llvm-project/pull/135357

