[llvm] [VPlan] Don't use the legacy cost model for loop conditions (PR #156864)
John Brawn via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 6 09:39:34 PST 2026
================
@@ -386,7 +386,7 @@ define i32 @diff_exit_block_needs_scev_check(i32 %end) {
; CHECK-NEXT: [[TMP0:%.*]] = trunc i32 [[END]] to i10
; CHECK-NEXT: [[TMP1:%.*]] = zext i10 [[TMP0]] to i64
; CHECK-NEXT: [[UMAX1:%.*]] = call i64 @llvm.umax.i64(i64 [[TMP1]], i64 1)
-; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[UMAX1]], 8
+; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[UMAX1]], 12
----------------
john-brawn-arm wrote:
Looking at a diff of the ``-debug`` output before and after the last merge, it looks like there are two things going on:
```
Cost of 0 for VF 4: EMIT vp<%index.next> = add nuw vp<%4>, vp<%1> Cost of 0 for VF 4: EMIT vp<%index.next> = add nuw vp<%4>, vp<%1>
Cost of 2 for VF 4: EMIT vp<%9> = any-of ir<%cmp3> Cost of 2 for VF 4: EMIT vp<%9> = any-of ir<%cmp3>
Cost of 1 for VF 4: EMIT vp<%10> = icmp eq vp<%index.next>, vp<%2> | Cost of 2 for VF 4: EMIT vp<%10> = icmp eq vp<%index.next>, vp<%2>
Cost of 0 for VF 4: EMIT vp<%11> = or vp<%9>, vp<%10> | Cost of 0 for VF 4: EMIT branch-on-two-conds vp<%9>, vp<%10>
Cost of 0 for VF 4: EMIT branch-on-cond vp<%11> <
```
The cost of the icmp is now 2 instead of 1. It looks like this is because of 524b1788c4f5a5fd4fdda5e51dd1e484b4124870: BranchOnTwoConds isn't listed in VPInstruction::usesFirstLaneOnly, which causes vputils::onlyFirstLaneUsed to return false, so VPInstruction::computeCost calculates the cost of a vector compare rather than a scalar one. I think this can be fixed by adding BranchOnTwoConds to VPInstruction::usesFirstLaneOnly.
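To make the reasoning about the icmp cost concrete, here is a toy model of it. This is only a sketch: the `Opcode` enum, the `IncludeTwoConds` flag and the function bodies are hypothetical and are not LLVM's code; only the costs 1 and 2 are taken from the debug output above. It also simplifies by treating the branch as the direct user of the compare, whereas in the "before" plan the compare reaches branch-on-cond through an `or`.

```cpp
// Toy model only: the names echo the ones used in this comment, but the
// bodies are illustrative assumptions, not LLVM's implementation.
enum class Opcode { ICmp, Or, BranchOnCond, BranchOnTwoConds };

// Hypothetical stand-in for VPInstruction::usesFirstLaneOnly: does a user
// with this opcode read only lane 0 of its operands?
// IncludeTwoConds models the proposed fix of listing BranchOnTwoConds.
constexpr bool usesFirstLaneOnly(Opcode User, bool IncludeTwoConds) {
  switch (User) {
  case Opcode::BranchOnCond:
    return true;
  case Opcode::BranchOnTwoConds:
    return IncludeTwoConds;
  default:
    return false;
  }
}

// Hypothetical cost rule: a compare whose only user reads lane 0 is costed
// as a scalar compare, otherwise as a vector compare.
constexpr unsigned icmpCost(Opcode User, bool IncludeTwoConds) {
  constexpr unsigned ScalarCmpCost = 1; // "before" column of the debug diff
  constexpr unsigned VectorCmpCost = 2; // "after" column of the debug diff
  return usesFirstLaneOnly(User, IncludeTwoConds) ? ScalarCmpCost
                                                  : VectorCmpCost;
}

// Without the fix the compare feeding branch-on-two-conds is costed as a
// vector compare; with the fix it goes back to the scalar cost.
static_assert(icmpCost(Opcode::BranchOnTwoConds, false) == 2, "regression");
static_assert(icmpCost(Opcode::BranchOnTwoConds, true) == 1, "with fix");
```

In this model, listing BranchOnTwoConds as a first-lane-only user is the only change needed to bring the icmp back to a cost of 1.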
```
Calculating cost of work in exit block vector.early.exit: Calculating cost of work in exit block vector.early.exit:
Cost of 12 for VF 4: EMIT vp<%17> = first-active-lane ir<%cmp> | Cost of 12 for VF 4: EMIT vp<%15> = first-active-lane ir<%cmp>
Cost of 0 for VF 4: EMIT vp<%18> = add vp<%4>, vp<%17> | Cost of 0 for VF 4: EMIT vp<%16> = add vp<%4>, vp<%15>
Cost of 0 for VF 4: vp<%19> = DERIVED-IV ir<3> + vp<%18> * ir<1> | Cost of 0 for VF 4: vp<%17> = DERIVED-IV ir<3> + vp<%16> * ir<1>
> Cost of 0 for VF 4: EMIT vp<%12> = extract-last-part ir<%ld2>
> Cost of 2 for VF 4: EMIT vp<%13> = extract-last-lane vp<%12>
Cost of 1 for VF 4: EMIT vp<%cmp.n> = icmp eq ir<64>, vp<%2> Cost of 1 for VF 4: EMIT vp<%cmp.n> = icmp eq ir<64>, vp<%2>
Cost of 0 for VF 4: EMIT branch-on-cond vp<%cmp.n> Cost of 0 for VF 4: EMIT branch-on-cond vp<%cmp.n>
LV: Minimum required TC for runtime checks to be profitable:20 | LV: Minimum required TC for runtime checks to be profitable:24
```
I haven't figured out what's going on here yet; I'll continue looking at it.
https://github.com/llvm/llvm-project/pull/156864