[llvm] [SLP] Fix : Do not skip profitable small VFs in Vectorize Stores (PR #177100)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 21 10:49:22 PST 2026
================
@@ -0,0 +1,78 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
+; RUN: opt < %s -passes=slp-vectorizer -mtriple=riscv64 -mattr=+v -S | FileCheck %s
+
+define void @test_max_reg_vf_boundary(ptr %pl, ptr %ps) {
+; CHECK-LABEL: define void @test_max_reg_vf_boundary(
+; CHECK-SAME: ptr [[PL:%.*]], ptr [[PS:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT: [[GEP_L_UNRELATED_1:%.*]] = getelementptr inbounds i32, ptr [[PL]], i32 100
+; CHECK-NEXT: [[GEP_L_UNRELATED_2:%.*]] = getelementptr inbounds i32, ptr [[PL]], i32 200
+; CHECK-NEXT: [[GEP_L_CONTIGUOUS:%.*]] = getelementptr inbounds i32, ptr [[PL]], i32 2
+; CHECK-NEXT: [[GEP_L_OP_MISMATCH_1:%.*]] = getelementptr inbounds i32, ptr [[PL]], i32 300
+; CHECK-NEXT: [[GEP_L_OP_MISMATCH_2:%.*]] = getelementptr inbounds i32, ptr [[PL]], i32 400
+; CHECK-NEXT: [[LOAD0:%.*]] = load i32, ptr [[GEP_L_UNRELATED_1]], align 4
+; CHECK-NEXT: [[LOAD1:%.*]] = load i32, ptr [[GEP_L_UNRELATED_2]], align 4
+; CHECK-NEXT: [[LOAD6:%.*]] = load i32, ptr [[GEP_L_OP_MISMATCH_1]], align 4
+; CHECK-NEXT: [[LOAD7:%.*]] = load i32, ptr [[GEP_L_OP_MISMATCH_2]], align 4
+; CHECK-NEXT: [[ADD6:%.*]] = add i32 [[LOAD6]], 1
+; CHECK-NEXT: [[ADD7:%.*]] = add i32 [[LOAD7]], 1
+; CHECK-NEXT: [[GEP_S0:%.*]] = getelementptr inbounds i32, ptr [[PS]], i32 0
+; CHECK-NEXT: [[GEP_S1:%.*]] = getelementptr inbounds i32, ptr [[PS]], i32 1
+; CHECK-NEXT: [[GEP_S2:%.*]] = getelementptr inbounds i32, ptr [[PS]], i32 2
+; CHECK-NEXT: [[GEP_S6:%.*]] = getelementptr inbounds i32, ptr [[PS]], i32 6
+; CHECK-NEXT: [[GEP_S7:%.*]] = getelementptr inbounds i32, ptr [[PS]], i32 7
+; CHECK-NEXT: [[TMP1:%.*]] = load <4 x i32>, ptr [[GEP_L_CONTIGUOUS]], align 4
+; CHECK-NEXT: store i32 [[LOAD0]], ptr [[GEP_S0]], align 4
+; CHECK-NEXT: store i32 [[LOAD1]], ptr [[GEP_S1]], align 4
+; CHECK-NEXT: store <4 x i32> [[TMP1]], ptr [[GEP_S2]], align 4
+; CHECK-NEXT: store i32 [[ADD6]], ptr [[GEP_S6]], align 4
+; CHECK-NEXT: store i32 [[ADD7]], ptr [[GEP_S7]], align 4
+; CHECK-NEXT: ret void
+;
+; ensuring maxregVF slice is vectorized correctly even with the mixed tree sizes
+
+ ; random offsets scalar tests
+ %gep_l_unrelated_1 = getelementptr inbounds i32, ptr %pl, i32 100
+ %gep_l_unrelated_2 = getelementptr inbounds i32, ptr %pl, i32 200
+
+ ; vf = maxregvf tests
+ %gep_l_contiguous = getelementptr inbounds i32, ptr %pl, i32 2
+ %gep_l3 = getelementptr inbounds i32, ptr %pl, i32 3
+ %gep_l4 = getelementptr inbounds i32, ptr %pl, i32 4
+ %gep_l5 = getelementptr inbounds i32, ptr %pl, i32 5
+
+ ; forcing differing tree sizes
+ %gep_l_op_mismatch_1 = getelementptr inbounds i32, ptr %pl, i32 300
+ %gep_l_op_mismatch_2 = getelementptr inbounds i32, ptr %pl, i32 400
+
----------------
Soumik15630 wrote:
Agreed, offsets alone don't affect tree sizes. I've removed the comment anyway.
Thanks for pointing that out.
https://github.com/llvm/llvm-project/pull/177100