[llvm] [LV][VPlan] Implement VPlan-based cost for exit condition. (PR #125640)
Elvis Wang via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 7 22:50:54 PDT 2025
================
@@ -27,23 +27,30 @@ define i64 @test_value_in_exit_compare_chain_used_outside(ptr %src, i64 %x, i64
; CHECK-NEXT: br label %[[VECTOR_BODY:.*]]
; CHECK: [[VECTOR_BODY]]:
; CHECK-NEXT: [[TMP10:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
-; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <8 x i8> [ zeroinitializer, %[[VECTOR_PH]] ], [ [[TMP29:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i8> [ zeroinitializer, %[[VECTOR_PH]] ], [ [[TMP17:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT: [[VEC_PHI3:%.*]] = phi <4 x i8> [ zeroinitializer, %[[VECTOR_PH]] ], [ [[TMP19:%.*]], %[[VECTOR_BODY]] ]
; CHECK-NEXT: [[TMP18:%.*]] = and i64 [[TMP10]], 1
; CHECK-NEXT: [[TMP26:%.*]] = getelementptr i8, ptr [[SRC]], i64 [[TMP18]]
; CHECK-NEXT: [[TMP27:%.*]] = getelementptr i8, ptr [[TMP26]], i32 0
-; CHECK-NEXT: [[TMP28:%.*]] = getelementptr i8, ptr [[TMP27]], i32 -7
-; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <8 x i8>, ptr [[TMP28]], align 1
-; CHECK-NEXT: [[REVERSE:%.*]] = shufflevector <8 x i8> [[WIDE_LOAD]], <8 x i8> poison, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
-; CHECK-NEXT: [[TMP29]] = xor <8 x i8> [[REVERSE]], [[VEC_PHI]]
+; CHECK-NEXT: [[TMP13:%.*]] = getelementptr i8, ptr [[TMP27]], i32 -3
----------------
ElvisWang123 wrote:
This loop now chooses the smaller VF because the TC is relatively small (31) and there is no tail-folding.
The cost of each VF with TC = 31:
```
LV: Scalar loop costs: 7. => cost = 217
Cost for VF 2: 8 (Estimated cost per lane: 4.0) => cost = 127
Cost for VF 4: 9 (Estimated cost per lane: 2.2) => cost = 84
Cost for VF 8: 13 (Estimated cost per lane: 1.6) => cost = 88
Cost for VF 16: 21 (Estimated cost per lane: 1.3) => cost = 126
```
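The totals above are consistent with charging the vector cost for each full vector iteration and the scalar cost for each leftover iteration. A minimal sketch of that arithmetic, assuming this is how the totals were derived (the helper name `total_cost` is hypothetical and not part of LLVM):

```python
def total_cost(trip_count, vf, vector_cost, scalar_cost):
    """Cost of running `trip_count` iterations at vectorization factor `vf`.

    Assumption: full vector iterations are charged `vector_cost` each and
    the remaining iterations run in the scalar loop at `scalar_cost` each
    (no tail-folding).
    """
    vector_iters, remainder = divmod(trip_count, vf)
    return vector_iters * vector_cost + remainder * scalar_cost


TC = 31          # trip count from the comment above
SCALAR_COST = 7  # "LV: Scalar loop costs: 7"

# Per-iteration vector costs as reported in the log above.
vector_costs = {2: 8, 4: 9, 8: 13, 16: 21}

costs = {vf: total_cost(TC, vf, c, SCALAR_COST) for vf, c in vector_costs.items()}
best_vf = min(costs, key=costs.get)

for vf, cost in costs.items():
    print(f"VF {vf}: total cost = {cost}")
print(f"Scalar: total cost = {TC * SCALAR_COST}")
print(f"Cheapest VF: {best_vf}")
```

Under this assumption VF 4 comes out cheapest, because at VF 8 seven of the 31 iterations fall into the scalar remainder.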
Note that without this patch, LV counted the cost of 2 icmp + 1 add, but these instructions are not generated in vector.body.
```
Cost of 1 for VF 4: exit condition instruction %cmp = icmp eq i64 %x.inc, 0
Cost of 1 for VF 4: exit condition instruction %ec = icmp eq i64 %iv.next, %N
Cost of 2 for VF 4: exit condition instruction %x.inc = add i64 %iv.and, %x
```
The vector.body at VF 8, matching the removed CHECK lines in the diff above:
```
vector.body: ; preds = %vector.body, %vector.ph
%index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
%vec.phi = phi <8 x i8> [ zeroinitializer, %vector.ph ], [ %14, %vector.body ]
%10 = and i64 %index, 1
%11 = getelementptr i8, ptr %src, i64 %10
%12 = getelementptr i8, ptr %11, i32 0
%13 = getelementptr i8, ptr %12, i32 -7
%wide.load = load <8 x i8>, ptr %13, align 1
%reverse = shufflevector <8 x i8> %wide.load, <8 x i8> poison, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
%14 = xor <8 x i8> %reverse, %vec.phi
%index.next = add nuw i64 %index, 8
%15 = icmp eq i64 %index.next, %n.vec
br i1 %15, label %middle.block, label %vector.body, !llvm.loop !0
```
https://github.com/llvm/llvm-project/pull/125640