[llvm] [LV][VPlan] Implement VPlan-based cost for exit condition. (PR #125640)

Elvis Wang via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 7 22:50:54 PDT 2025


================
@@ -27,23 +27,30 @@ define i64 @test_value_in_exit_compare_chain_used_outside(ptr %src, i64 %x, i64
 ; CHECK-NEXT:    br label %[[VECTOR_BODY:.*]]
 ; CHECK:       [[VECTOR_BODY]]:
 ; CHECK-NEXT:    [[TMP10:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
-; CHECK-NEXT:    [[VEC_PHI:%.*]] = phi <8 x i8> [ zeroinitializer, %[[VECTOR_PH]] ], [ [[TMP29:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[VEC_PHI:%.*]] = phi <4 x i8> [ zeroinitializer, %[[VECTOR_PH]] ], [ [[TMP17:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[VEC_PHI3:%.*]] = phi <4 x i8> [ zeroinitializer, %[[VECTOR_PH]] ], [ [[TMP19:%.*]], %[[VECTOR_BODY]] ]
 ; CHECK-NEXT:    [[TMP18:%.*]] = and i64 [[TMP10]], 1
 ; CHECK-NEXT:    [[TMP26:%.*]] = getelementptr i8, ptr [[SRC]], i64 [[TMP18]]
 ; CHECK-NEXT:    [[TMP27:%.*]] = getelementptr i8, ptr [[TMP26]], i32 0
-; CHECK-NEXT:    [[TMP28:%.*]] = getelementptr i8, ptr [[TMP27]], i32 -7
-; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <8 x i8>, ptr [[TMP28]], align 1
-; CHECK-NEXT:    [[REVERSE:%.*]] = shufflevector <8 x i8> [[WIDE_LOAD]], <8 x i8> poison, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
-; CHECK-NEXT:    [[TMP29]] = xor <8 x i8> [[REVERSE]], [[VEC_PHI]]
+; CHECK-NEXT:    [[TMP13:%.*]] = getelementptr i8, ptr [[TMP27]], i32 -3
----------------
ElvisWang123 wrote:

This loop will choose the smaller VF (4 instead of 8) because the TC is relatively small (31) and there is no tail-folding.

Here is the cost of each VF, with the total cost for TC = 31 after `=>`:
```
LV: Scalar loop costs: 7.         => cost = 217
Cost for VF 2: 8 (Estimated cost per lane: 4.0) => cost = 127
Cost for VF 4: 9 (Estimated cost per lane: 2.2) => cost = 84
Cost for VF 8: 13 (Estimated cost per lane: 1.6) => cost = 88
Cost for VF 16: 21 (Estimated cost per lane: 1.3) => cost = 126
```
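
The `=> cost` totals above can be reproduced with a small sketch. It assumes the total is (number of full vector iterations × per-iteration vector cost) + (remaining scalar iterations × scalar cost), with no tail-folding. The helper name `total_cost` is hypothetical and only illustrates the arithmetic; it is not the LV implementation.

```python
# Sketch only: reproduces the "=> cost" totals from the table above.
# Assumption: without tail-folding, the leftover iterations (TC % VF)
# run in the scalar remainder loop at the scalar per-iteration cost.

def total_cost(trip_count: int, vf: int, vector_cost: int, scalar_cost: int) -> int:
    """Hypothetical helper: total cost of running `trip_count` iterations at `vf`."""
    vector_iters, remainder = divmod(trip_count, vf)
    return vector_iters * vector_cost + remainder * scalar_cost

TC = 31
SCALAR_COST = 7

# Per-iteration costs taken from the LV output above (VF 1 is the scalar loop).
per_iteration_cost = {1: 7, 2: 8, 4: 9, 8: 13, 16: 21}

totals = {
    vf: total_cost(TC, vf, cost, SCALAR_COST)
    for vf, cost in per_iteration_cost.items()
}
best_vf = min(totals, key=totals.get)

for vf, cost in totals.items():
    print(f"VF {vf}: total cost {cost}")
print(f"cheapest: VF {best_vf}")
```

Under this assumption VF 8 has the better per-lane cost, but its 7 scalar remainder iterations make the total slightly worse than VF 4 with 3 remainder iterations.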

Note that without this patch, LV counted the cost of 2 icmp + 1 add, but these instructions are not generated in vector.body.
```
Cost of 1 for VF 4: exit condition instruction   %cmp = icmp eq i64 %x.inc, 0
Cost of 1 for VF 4: exit condition instruction   %ec = icmp eq i64 %iv.next, %N
Cost of 2 for VF 4: exit condition instruction   %x.inc = add i64 %iv.and, %x
```

For reference, the generated vector.body at VF 8:
```
vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %vec.phi = phi <8 x i8> [ zeroinitializer, %vector.ph ], [ %14, %vector.body ]
  %10 = and i64 %index, 1
  %11 = getelementptr i8, ptr %src, i64 %10
  %12 = getelementptr i8, ptr %11, i32 0
  %13 = getelementptr i8, ptr %12, i32 -7
  %wide.load = load <8 x i8>, ptr %13, align 1
  %reverse = shufflevector <8 x i8> %wide.load, <8 x i8> poison, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
  %14 = xor <8 x i8> %reverse, %vec.phi
  %index.next = add nuw i64 %index, 8
  %15 = icmp eq i64 %index.next, %n.vec
  br i1 %15, label %middle.block, label %vector.body, !llvm.loop !0
```


https://github.com/llvm/llvm-project/pull/125640

