[llvm] [AArch64][LoopVectorize] Use upper bound trip count instead of the constant TC when choosing max VF (PR #67697)
David Sherwood via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 6 07:14:30 PDT 2023
================
@@ -0,0 +1,33 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 3
+; RUN: opt -S < %s -passes=loop-vectorize -mtriple aarch64-linux-gnu -mattr=+sve 2>&1 | FileCheck %s
+
+define void @wide_tc_8(ptr nocapture %dst, i32 %n, i64 %val){
+; CHECK-LABEL: define void @wide_tc_8(
+; CHECK: call void @llvm.masked.store.nxv8i8.p0(<vscale x 8 x i8> {{.*}}, ptr {{.*}}, i32 1, <vscale x 8 x i1> {{.*}})
+
+entry:
+ %rem = and i32 %n, 63
+ %cmp8.not = icmp eq i32 %rem, 0
----------------
david-arm wrote:
I think you can remove the icmp and br and merge this entry block into for.body.preheader.
https://github.com/llvm/llvm-project/pull/67697