[llvm] [VPlan] Enable vectorization of early-exit loops with unit-stride fault-only-first loads (PR #151300)

Fri Dec 12 00:53:28 PST 2025

================
@@ -1781,6 +1807,14 @@ static InstructionCost getCostForIntrinsics(Intrinsic::ID ID,
 InstructionCost VPWidenIntrinsicRecipe::computeCost(ElementCount VF,
                                                     VPCostContext &Ctx) const {
   SmallVector<const VPValue *> ArgOps(operands());
+  if (VectorIntrinsicID == Intrinsic::vp_load_ff) {
+    auto *StructTy = cast<StructType>(ResultTy);
+    Type *DataTy = toVectorizedTy(StructTy->getStructElementType(0), VF);
+    // TODO: Infer alignment from pointer.
+    Align Alignment;
+    return Ctx.TTI.getMemIntrinsicInstrCost(
+        {VectorIntrinsicID, DataTy, Alignment}, Ctx.CostKind);
+  }
   return getCostForIntrinsics(VectorIntrinsicID, ArgOps, *this, VF, Ctx);
----------------
lukel97 wrote:

I think the generic `Ctx.TTI.getIntrinsicInstrCost(CostAttrs, Ctx.CostKind);` API should call into `getMemIntrinsicInstrCost`, so I didn't think we would need to call it directly. Is it possible to just reuse `getCostForIntrinsics` for this? Maybe we need to teach it to handle struct types?
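
As a rough illustration of the shape being suggested, here is a self-contained toy model, not LLVM code. Every name in it (`ToyTTI`, `ToyResultTy`, `intrinsicCost`, `memIntrinsicCost`) is hypothetical, and the cost formula is made up. It only shows a generic cost entry point that accepts a struct result type, takes field 0 as the loaded data (as the hunk above does with `getStructElementType(0)`), and routes the memory intrinsic to a memory-specific hook so the caller needs no special case.

```cpp
#include <vector>

// Toy stand-ins for illustration only; none of these are LLVM types.
enum class ToyIntrinsic { VPLoadFF, Sqrt };

// One entry models a scalar result; several entries model a struct result.
struct ToyResultTy {
  std::vector<unsigned> FieldBits;
};

struct ToyTTI {
  // Hypothetical memory-specific hook: one cost unit per 128-bit chunk.
  unsigned memIntrinsicCost(ToyIntrinsic, unsigned ElemBits,
                            unsigned VF) const {
    return (ElemBits * VF + 127) / 128;
  }

  // Hypothetical generic entry point. It understands struct results by
  // reading field 0 as the data element, and dispatches memory intrinsics
  // to the hook above.
  unsigned intrinsicCost(ToyIntrinsic ID, const ToyResultTy &Ty,
                         unsigned VF) const {
    unsigned ElemBits = Ty.FieldBits.front();
    if (ID == ToyIntrinsic::VPLoadFF)
      return memIntrinsicCost(ID, ElemBits, VF);
    return VF; // Placeholder cost for everything else.
  }
};
```

With this shape, a recipe's `computeCost` would call only the generic entry point for every intrinsic ID, including the fault-only-first load.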

https://github.com/llvm/llvm-project/pull/151300