[llvm] [Analysis] Extend llvm.experimental.cttz.elts to type-based-cost (PR #184578)

Luke Lau via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 6 00:26:05 PST 2026


================
@@ -131,3 +194,65 @@ define void @foo_vscale_range_2_16() vscale_range(2,16) {
 
   ret void
 }
+
+define void @foo_fixed_len_vectors() {
+; CHECK-LABEL: 'foo_fixed_len_vectors'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %res.i32.v2i1.false = call i32 @llvm.experimental.cttz.elts.i32.v2i1(<2 x i1> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %res.i32.v4i1.false = call i32 @llvm.experimental.cttz.elts.i32.v4i1(<4 x i1> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %res.i32.v8i1.false = call i32 @llvm.experimental.cttz.elts.i32.v8i1(<8 x i1> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %res.i32.v64i1.false = call i32 @llvm.experimental.cttz.elts.i32.v64i1(<64 x i1> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %res.i32.v128i1.false = call i32 @llvm.experimental.cttz.elts.i32.v128i1(<128 x i1> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 663 for instruction: %res.i32.v1024i1.false = call i32 @llvm.experimental.cttz.elts.i32.v1024i1(<1024 x i1> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1319 for instruction: %res.i32.v2048i1.false = call i32 @llvm.experimental.cttz.elts.i32.v2048i1(<2048 x i1> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %res.i32.v2i1.true = call i32 @llvm.experimental.cttz.elts.i32.v2i1(<2 x i1> undef, i1 true)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 1319 for instruction: %res.i32.v2048i1.true = call i32 @llvm.experimental.cttz.elts.i32.v2048i1(<2048 x i1> undef, i1 true)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %res.i32.v2i32 = call i32 @llvm.experimental.cttz.elts.i32.v2i32(<2 x i32> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 25 for instruction: %res.i32.v4i32 = call i32 @llvm.experimental.cttz.elts.i32.v4i32(<4 x i32> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 171 for instruction: %res.i32.v32i32 = call i32 @llvm.experimental.cttz.elts.i32.v32i32(<32 x i32> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %res.i32.v2i33 = call i32 @llvm.experimental.cttz.elts.i32.v2i33(<2 x i33> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 25 for instruction: %res.i32.v4i33 = call i32 @llvm.experimental.cttz.elts.i32.v4i33(<4 x i33> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 170 for instruction: %res.i32.v32i33 = call i32 @llvm.experimental.cttz.elts.i32.v32i33(<32 x i33> undef, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPE-LABEL: 'foo_fixed_len_vectors'
+; TYPE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %res.i32.v2i1.false = call i32 @llvm.experimental.cttz.elts.i32.v2i1(<2 x i1> undef, i1 false)
+; TYPE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %res.i32.v4i1.false = call i32 @llvm.experimental.cttz.elts.i32.v4i1(<4 x i1> undef, i1 false)
+; TYPE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %res.i32.v8i1.false = call i32 @llvm.experimental.cttz.elts.i32.v8i1(<8 x i1> undef, i1 false)
+; TYPE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %res.i32.v64i1.false = call i32 @llvm.experimental.cttz.elts.i32.v64i1(<64 x i1> undef, i1 false)
+; TYPE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %res.i32.v128i1.false = call i32 @llvm.experimental.cttz.elts.i32.v128i1(<128 x i1> undef, i1 false)
+; TYPE-NEXT:  Cost Model: Found an estimated cost of 663 for instruction: %res.i32.v1024i1.false = call i32 @llvm.experimental.cttz.elts.i32.v1024i1(<1024 x i1> undef, i1 false)
+; TYPE-NEXT:  Cost Model: Found an estimated cost of 1319 for instruction: %res.i32.v2048i1.false = call i32 @llvm.experimental.cttz.elts.i32.v2048i1(<2048 x i1> undef, i1 false)
+; TYPE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %res.i32.v2i1.true = call i32 @llvm.experimental.cttz.elts.i32.v2i1(<2 x i1> undef, i1 true)
+; TYPE-NEXT:  Cost Model: Found an estimated cost of 1319 for instruction: %res.i32.v2048i1.true = call i32 @llvm.experimental.cttz.elts.i32.v2048i1(<2048 x i1> undef, i1 true)
+; TYPE-NEXT:  Cost Model: Found an estimated cost of 14 for instruction: %res.i32.v2i32 = call i32 @llvm.experimental.cttz.elts.i32.v2i32(<2 x i32> undef, i1 false)
+; TYPE-NEXT:  Cost Model: Found an estimated cost of 25 for instruction: %res.i32.v4i32 = call i32 @llvm.experimental.cttz.elts.i32.v4i32(<4 x i32> undef, i1 false)
+; TYPE-NEXT:  Cost Model: Found an estimated cost of 171 for instruction: %res.i32.v32i32 = call i32 @llvm.experimental.cttz.elts.i32.v32i32(<32 x i32> undef, i1 false)
----------------
lukel97 wrote:

> Yes, for fixed-length vectors it tries to get the cost from the target again, but with scalar arguments. This crashes on RISC-V. For scalable vectors it just returns an InvalidCost.

Why are we scalarizing fixed-length vectors to begin with? We can codegen fixed-length llvm.experimental.cttz.elts intrinsics directly. I think we should fix that first rather than work around it, because these fixed-length costs are inaccurate.
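
For readers less familiar with the intrinsic being costed here, the following is a minimal reference model of what llvm.experimental.cttz.elts computes on a fixed-length mask, written as a sketch rather than as anything taken from the PR. The function name `cttzElts` and the use of `std::nullopt` to stand in for a poison result are assumptions made purely for illustration; non-i1 element types would be compared against zero first.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Reference model (illustrative only, not LLVM code): counts the zero
// elements of Mask starting from element 0, stopping at the first set one.
// Assumption: std::nullopt models the poison result that the intrinsic
// yields when the input is all zero and the second argument is true.
std::optional<std::size_t> cttzElts(const std::vector<bool> &Mask,
                                    bool IsZeroPoison) {
  for (std::size_t I = 0; I < Mask.size(); ++I)
    if (Mask[I])
      return I; // Index of the first set element == trailing zero count.

  // No element was set.
  if (IsZeroPoison)
    return std::nullopt; // Modeled poison.
  return Mask.size();    // Defined result: the number of elements.
}
```

The loop is the scalarized view of the operation, which is roughly why a per-element cost grows with the vector length, whereas a target that can lower the fixed-length intrinsic natively would be expected to report a much smaller cost.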

https://github.com/llvm/llvm-project/pull/184578

