[llvm] [RISCV][TTI] Implement cost of llvm.experimental.vector.extract.last.active (PR #184067)

Thu Mar 5 18:54:29 PST 2026

================
@@ -1702,6 +1702,45 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
                                CmpInst::FCMP_UNO, CostKind);
     return Cost;
   }
+  case Intrinsic::experimental_vector_extract_last_active: {
+    Type *ValTy = ICA.getArgTypes()[0];
+    Type *MaskTy = ICA.getArgTypes()[1];
+
+    auto ValLT = getTypeLegalizationCost(ValTy);
+    auto MaskLT = getTypeLegalizationCost(MaskTy);
+
+    // TODO: Return cheaper cost when the entire lane is inactive.
+    // The expected asm sequence is:
+    // vcpop.m a0, v0
+    // beqz a0, exit # Return passthru when the entire lane is inactive.
+    // vid v10, v0.t
+    // vredmaxu.vs v10, v10, v10
+    // vmv.x.s a0, v10
+    // zext.b a0, a0
+    // vslidedown.vx v8, v8, a0
+    // vmv.x.s a0, v8
+    // exit:
+    //   ...
+    auto *Int8Ty = Type::getInt8Ty(ValTy->getContext());
----------------
lukel97 wrote:

I'm not sure the type is always guaranteed to be i8. It looks like the expansion can use a wider type if the vector is large enough. This is what it uses:

```
  uint64_t EltWidth = TLI.getBitWidthForCttzElements(
      BoolVT.getTypeForEVT(*DAG.getContext()), MaskVT.getVectorElementCount(),
      /*ZeroIsPoison=*/true, &VScaleRange);
  // If the step vector element type is smaller than the mask element type,
  // use the mask type directly to avoid widening issues.
  EltWidth = std::max(EltWidth, BoolVT.getFixedSizeInBits());
  EVT StepVT = MVT::getIntegerVT(EltWidth);
```

We can probably use TLI.getBitWidthForCttzElements here too?
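
To illustrate the concern, here is a rough standalone sketch of why a fixed i8 index can be too narrow. It is a simplified model and not the actual implementation of getBitWidthForCttzElements (which also takes the vscale range and ZeroIsPoison into account). The helper name `minIndexBits` is hypothetical and exists only for this example:

```cpp
#include <cstdint>

// Hypothetical helper, not an LLVM API: returns the smallest power-of-two
// integer width (at least 8 bits) that can represent every lane index
// 0..NumElts-1 of a mask with NumElts lanes.
unsigned minIndexBits(uint64_t NumElts) {
  if (NumElts <= 1)
    return 8;
  unsigned Bits = 8;
  // Widen while the largest lane index (NumElts - 1) does not fit in Bits.
  while (Bits < 64 && ((NumElts - 1) >> Bits) != 0)
    Bits *= 2;
  return Bits;
}
```

Under this model, a mask with up to 256 lanes fits in i8, while anything larger needs i16 or wider, so the step/reduction type in the cost model should probably be derived from the mask's element count rather than hard-coded.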

https://github.com/llvm/llvm-project/pull/184067