[llvm] [AArch64][LoopVectorize] Enable tail-folding on neoverse-v2 (PR #135357)
via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 11 05:30:36 PDT 2025
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-llvm-transforms
Author: Cullen Rhodes (c-rhodes)
<details>
<summary>Changes</summary>
This patch enables tail-folding of simple loops by default when targeting the neoverse-v2 CPU. This was done for neoverse-v1 in c7dbe326dff81.
For SPEC2017 with "-Ofast -mcpu=neoverse-v2 -flto" this gives some small wins:
549.fotonik3d_r: ~3.2%
525.x264_r: ~2.7%
554.roms_r: ~1.2%
---
Full diff: https://github.com/llvm/llvm-project/pull/135357.diff
2 Files Affected:
- (modified) llvm/lib/Target/AArch64/AArch64Subtarget.cpp (+2)
- (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll (+2)
``````````diff
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
index 7b4ded6322098..adee9899f7fd8 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
@@ -268,6 +268,8 @@ void AArch64Subtarget::initializeProperties(bool HasMinSize) {
MaxBytesForLoopAlignment = 16;
break;
case NeoverseV2:
+ DefaultSVETFOpts = TailFoldingOpts::Simple;
+ LLVM_FALLTHROUGH;
case NeoverseV3:
EpilogueVectorizationMinVF = 8;
MaxInterleaveFactor = 4;
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll b/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
index 7dd0f0c0ad8e0..d2b8dd9c2be48 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
@@ -11,6 +11,8 @@
; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -sve-tail-folding=default -mcpu=neoverse-v1 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v1 -sve-tail-folding=default | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v1 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
+; Simple tail-folding is also enabled by default on neoverse-v2. Use same check prefix.
+; RUN: opt < %s -passes=loop-vectorize -sve-tail-folding-insn-threshold=0 -S -mcpu=neoverse-v2 | FileCheck %s -check-prefix=CHECK-NEOVERSE-V1
target triple = "aarch64-unknown-linux-gnu"
``````````
</details>
https://github.com/llvm/llvm-project/pull/135357
More information about the llvm-commits
mailing list