[llvm] [InstCombine] Fold @llvm.experimental.get.vector.length when cnt <= max_lanes (PR #169293)

Tue Nov 25 02:18:21 PST 2025

================
@@ -0,0 +1,80 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
+; RUN: opt < %s -passes=instcombine,verify -S | FileCheck %s
+
+define i32 @cnt_known_lt() {
+; CHECK-LABEL: define i32 @cnt_known_lt() {
+; CHECK-NEXT:    ret i32 1
+;
+  %x = call i32 @llvm.experimental.get.vector.length(i32 1, i32 2, i1 false)
+  ret i32 %x
+}
+
+define i32 @cnt_not_known_lt() {
+; CHECK-LABEL: define i32 @cnt_not_known_lt() {
+; CHECK-NEXT:    [[X:%.*]] = call i32 @llvm.experimental.get.vector.length.i32(i32 2, i32 1, i1 false)
+; CHECK-NEXT:    ret i32 [[X]]
+;
+  %x = call i32 @llvm.experimental.get.vector.length(i32 2, i32 1, i1 false)
+  ret i32 %x
+}
+
+define i32 @cnt_known_lt_scalable() vscale_range(2, 4) {
+; CHECK-LABEL: define i32 @cnt_known_lt_scalable(
+; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:    ret i32 2
+;
+  %x = call i32 @llvm.experimental.get.vector.length(i32 2, i32 1, i1 true)
+  ret i32 %x
+}
+
+define i32 @cnt_not_known_lt_scalable() {
+; CHECK-LABEL: define i32 @cnt_not_known_lt_scalable() {
+; CHECK-NEXT:    [[X:%.*]] = call i32 @llvm.experimental.get.vector.length.i32(i32 2, i32 1, i1 true)
+; CHECK-NEXT:    ret i32 [[X]]
+;
+  %x = call i32 @llvm.experimental.get.vector.length(i32 2, i32 1, i1 true)
+  ret i32 %x
+}
+
+define i32 @cnt_known_lt_runtime(i32 %x) {
+; CHECK-LABEL: define i32 @cnt_known_lt_runtime(
+; CHECK-SAME: i32 [[X:%.*]]) {
+; CHECK-NEXT:    [[ICMP:%.*]] = icmp ult i32 [[X]], 4
+; CHECK-NEXT:    call void @llvm.assume(i1 [[ICMP]])
+; CHECK-NEXT:    ret i32 [[X]]
+;
+  %icmp = icmp ule i32 %x, 3
+  call void @llvm.assume(i1 %icmp)
+  %y = call i32 @llvm.experimental.get.vector.length(i32 %x, i32 3, i1 false)
+  ret i32 %y
+}
+
+define i32 @cnt_known_lt_runtime_trunc(i64 %x) {
+; CHECK-LABEL: define i32 @cnt_known_lt_runtime_trunc(
+; CHECK-SAME: i64 [[X:%.*]]) {
+; CHECK-NEXT:    [[ICMP:%.*]] = icmp ult i64 [[X]], 4
+; CHECK-NEXT:    call void @llvm.assume(i1 [[ICMP]])
+; CHECK-NEXT:    [[Y:%.*]] = trunc nuw nsw i64 [[X]] to i32
+; CHECK-NEXT:    ret i32 [[Y]]
+;
+  %icmp = icmp ule i64 %x, 3
+  call void @llvm.assume(i1 %icmp)
+  %y = call i32 @llvm.experimental.get.vector.length(i64 %x, i32 3, i1 false)
+  ret i32 %y
+}
+
+; FIXME: We should be able to deduce the constant range from AssumptionCache
----------------
lukel97 wrote:

> The get.vector.length being in the recurrence makes it no longer "simple", doesn't it?

matchSimpleRecurrence doesn't care what the step is so it can still handle get.vector.length in the recurrence. The only issue is that we currently don't deduce anything apart from the low zero bits in computeKnownBitsFromOperator. 

> From what I can tell that is correct, I didn't see anywhere in computeKnownBits where get.vector.length is handled currently

Yes, we should separately add support for that too. But for the reduced example in https://github.com/llvm/llvm-project/issues/164762#issuecomment-3567460074 I think we only need to teach computeKnownBitsFromOperator that sub recurrences with nuw should only decrease, so the number of leading zeros should increase. I gave this a super quick go and it seems to work, haven't checked if its generally correct yet:

```diff

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index dbceb8e55784..c2ba85d4bfac 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -1857,6 +1857,15 @@ static void computeKnownBitsFromOperator(const Operator *I,
                                        Known3.countMinTrailingZeros()));
 
         auto *OverflowOp = dyn_cast<OverflowingBinaryOperator>(BO);
+
+        if (!OverflowOp)
+          break;
+
+        if (Opcode == Instruction::Sub && Q.IIQ.hasNoUnsignedWrap(OverflowOp)) {
+          Known.Zero.setHighBits(std::max(Known2.countMinLeadingZeros(),
+                                          Known3.countMinLeadingZeros()));
+        }
+
```

That line above in combination with this patch is enough to flatten the loop. 

https://github.com/llvm/llvm-project/pull/169293