[llvm] [RISCV][TTI] Update cost and prevent exceed m8 for vector.extract.last.active (PR #188160)

Tue Mar 31 00:47:51 PDT 2026

================
@@ -1731,14 +1731,20 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
     //   ...
 
     // Find a suitable type for a stepvector.
-    ConstantRange VScaleRange(APInt(64, 1), APInt::getZero(64));
-    unsigned EltWidth = getTLI()->getBitWidthForCttzElements(
-        MaskTy->getScalarType(), MaskTy->getElementCount(),
-        /*ZeroIsPoison=*/true, &VScaleRange);
+    unsigned EltWidth = EVT(getTLI()->getVectorIdxTy(getDataLayout()))
+                            .getTypeForEVT(MaskTy->getContext())
+                            ->getScalarSizeInBits();
----------------
lukel97 wrote:

How come we're no longer calling getBitWidthForCttzElements? Doesn't that mean we now always use i64 on rv64 and i32 on rv32? In the codegen tests we use e.g. e8 for this small vector:

```llvm
define i8 @extract_last_i8(<16 x i8> %data, <16 x i8> %mask, i8 %passthru) {
; CHECK-LABEL: extract_last_i8:
; CHECK:       # %bb.0:
; CHECK-NEXT:    vsetivli zero, 16, e8, m1, ta, ma
; CHECK-NEXT:    vmsne.vi v0, v9, 0
; CHECK-NEXT:    vcpop.m a1, v0
; CHECK-NEXT:    beqz a1, .LBB0_2
; CHECK-NEXT:  # %bb.1:
; CHECK-NEXT:    vmv.v.i v9, 0
; CHECK-NEXT:    vsetvli zero, zero, e8, m1, ta, mu
; CHECK-NEXT:    vid.v v9, v0.t
; CHECK-NEXT:    vredmaxu.vs v9, v9, v9
; CHECK-NEXT:    vmv.x.s a0, v9
; CHECK-NEXT:    zext.b a0, a0
; CHECK-NEXT:    vslidedown.vx v8, v8, a0
; CHECK-NEXT:    vmv.x.s a0, v8
; CHECK-NEXT:  .LBB0_2:
; CHECK-NEXT:    ret
  %notzero = icmp ne <16 x i8> %mask, zeroinitializer
  %res = call i8 @llvm.experimental.vector.extract.last.active.v16i8(<16 x i8> %data, <16 x i1> %notzero, i8 %passthru)
  ret i8 %res
}
```
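
To make the sizing argument concrete, here is a rough standalone sketch of how a narrow index element width could be picked from the element count. The helper `minIndexEltWidth` is hypothetical and not an LLVM API. It only approximates the idea behind getBitWidthForCttzElements for a fixed-length mask, and it assumes widths are rounded up to a power of two of at least 8 bits:

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): smallest power-of-two element
// width, at least 8 bits, that can hold every index in [0, NumElts - 1].
unsigned minIndexEltWidth(uint64_t NumElts) {
  // Largest index that has to be representable.
  uint64_t MaxIdx = NumElts ? NumElts - 1 : 0;

  // Count the bits needed to represent MaxIdx.
  unsigned Bits = 0;
  while (MaxIdx) {
    ++Bits;
    MaxIdx >>= 1;
  }

  // Round up to a power-of-two width, starting from e8.
  unsigned Width = 8;
  while (Width < Bits)
    Width *= 2;
  return Width;
}
```

Under these assumptions, `minIndexEltWidth(16)` gives 8, which lines up with the e8 `vid.v` in the test above, whereas taking the width from the index type would give 64 on rv64 regardless of the element count.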

https://github.com/llvm/llvm-project/pull/188160