[all-commits] [llvm/llvm-project] fa3ec0: [VPlan] Materialize constant vector trip counts be...

Florian Hahn via All-commits all-commits at lists.llvm.org
Sat Jul 26 09:16:59 PDT 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: fa3ec0c17c48349e6027710d234c83e7bfeaf854
      https://github.com/llvm/llvm-project/commit/fa3ec0c17c48349e6027710d234c83e7bfeaf854
  Author: Florian Hahn <flo at fhahn.com>
  Date:   2025-07-26 (Sat, 26 Jul 2025)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    M llvm/lib/Transforms/Vectorize/VPlan.cpp
    M llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
    M llvm/lib/Transforms/Vectorize/VPlanTransforms.h
    M llvm/test/Transforms/LoopVectorize/AArch64/blend-costs.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/call-costs.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/check-prof-info.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/deterministic-type-shrinkage.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/drop-poison-generating-flags.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/fminimumnum.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/force-target-instruction-cost.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/induction-costs.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/invariant-replicate-region.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/licm-calls.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/low_trip_count_predicates.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/mul-simplification.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/optsize_minsize.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-dot-product-epilogue.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-dot-product-mixed.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-dot-product-neon.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-dot-product.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-no-dotprod.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/synthesize-mask-for-call.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-constant-ops.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-derived-ivs.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-metadata.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-remove-loop-region.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-unroll.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-with-wide-ops.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/type-shrinkage-insertelt.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/wider-VF-for-callinst.ll
    M llvm/test/Transforms/LoopVectorize/AMDGPU/packed-math.ll
    M llvm/test/Transforms/LoopVectorize/ARM/mve-reduction-predselect.ll
    M llvm/test/Transforms/LoopVectorize/ARM/optsize_minsize.ll
    M llvm/test/Transforms/LoopVectorize/ARM/pointer_iv.ll
    M llvm/test/Transforms/LoopVectorize/ARM/tail-folding-not-allowed.ll
    M llvm/test/Transforms/LoopVectorize/LoongArch/defaults.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/divrem.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/interleaved-accesses.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/low-trip-count.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/partial-reduce-dot-product.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/safe-dep-distance.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/short-trip-count.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/uniform-load-store.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/vectorize-force-tail-with-evl-safe-dep-distance.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/vf-will-not-generate-any-vector-insts.ll
    M llvm/test/Transforms/LoopVectorize/SystemZ/addressing.ll
    M llvm/test/Transforms/LoopVectorize/SystemZ/scalar-steps-with-users-demanding-all-lanes-and-first-lane-only.ll
    M llvm/test/Transforms/LoopVectorize/X86/constant-fold.ll
    M llvm/test/Transforms/LoopVectorize/X86/cost-constant-known-via-scev.ll
    M llvm/test/Transforms/LoopVectorize/X86/cost-model.ll
    M llvm/test/Transforms/LoopVectorize/X86/fixed-order-recurrence.ll
    M llvm/test/Transforms/LoopVectorize/X86/fminimumnum.ll
    M llvm/test/Transforms/LoopVectorize/X86/gep-use-outside-loop.ll
    M llvm/test/Transforms/LoopVectorize/X86/imprecise-through-phis.ll
    M llvm/test/Transforms/LoopVectorize/X86/induction-costs.ll
    M llvm/test/Transforms/LoopVectorize/X86/interleave-cost.ll
    M llvm/test/Transforms/LoopVectorize/X86/interleave-ptradd-with-replicated-operand.ll
    M llvm/test/Transforms/LoopVectorize/X86/interleaving.ll
    M llvm/test/Transforms/LoopVectorize/X86/limit-vf-by-tripcount.ll
    M llvm/test/Transforms/LoopVectorize/X86/load-deref-pred.ll
    M llvm/test/Transforms/LoopVectorize/X86/masked-store-cost.ll
    M llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll
    M llvm/test/Transforms/LoopVectorize/X86/metadata-enable.ll
    M llvm/test/Transforms/LoopVectorize/X86/optsize.ll
    M llvm/test/Transforms/LoopVectorize/X86/parallel-loops.ll
    M llvm/test/Transforms/LoopVectorize/X86/pr109581-unused-blend.ll
    M llvm/test/Transforms/LoopVectorize/X86/pr131359-dead-for-splice.ll
    M llvm/test/Transforms/LoopVectorize/X86/pr141968-instsimplifyfolder.ll
    M llvm/test/Transforms/LoopVectorize/X86/pr34438.ll
    M llvm/test/Transforms/LoopVectorize/X86/pr36524.ll
    M llvm/test/Transforms/LoopVectorize/X86/pr51366-sunk-instruction-used-outside-of-loop.ll
    M llvm/test/Transforms/LoopVectorize/X86/reduction-fastmath.ll
    M llvm/test/Transforms/LoopVectorize/X86/replicate-recipe-with-only-first-lane-used.ll
    M llvm/test/Transforms/LoopVectorize/X86/replicate-uniform-call.ll
    M llvm/test/Transforms/LoopVectorize/X86/small-size.ll
    M llvm/test/Transforms/LoopVectorize/X86/strided_load_cost.ll
    M llvm/test/Transforms/LoopVectorize/X86/uniform_mem_op.ll
    M llvm/test/Transforms/LoopVectorize/X86/vect.omp.force.small-tc.ll
    M llvm/test/Transforms/LoopVectorize/X86/vectorize-interleaved-accesses-gap.ll
    M llvm/test/Transforms/LoopVectorize/X86/widened-value-used-as-scalar-and-first-lane.ll
    M llvm/test/Transforms/LoopVectorize/X86/x86-predication.ll
    M llvm/test/Transforms/LoopVectorize/blend-in-header.ll
    M llvm/test/Transforms/LoopVectorize/bsd_regex.ll
    M llvm/test/Transforms/LoopVectorize/check-prof-info.ll
    M llvm/test/Transforms/LoopVectorize/constantfolder-infer-correct-gepty.ll
    M llvm/test/Transforms/LoopVectorize/constantfolder.ll
    M llvm/test/Transforms/LoopVectorize/create-induction-resume.ll
    M llvm/test/Transforms/LoopVectorize/dead_instructions.ll
    M llvm/test/Transforms/LoopVectorize/debugloc-optimize-vfuf-term.ll
    M llvm/test/Transforms/LoopVectorize/dereferenceable-info-from-assumption-constant-size.ll
    M llvm/test/Transforms/LoopVectorize/dont-fold-tail-for-const-TC.ll
    M llvm/test/Transforms/LoopVectorize/expand-scev-after-invoke.ll
    M llvm/test/Transforms/LoopVectorize/extract-from-end-vector-constant.ll
    M llvm/test/Transforms/LoopVectorize/first-order-recurrence-chains.ll
    M llvm/test/Transforms/LoopVectorize/first-order-recurrence-complex.ll
    M llvm/test/Transforms/LoopVectorize/first-order-recurrence-dead-instructions.ll
    M llvm/test/Transforms/LoopVectorize/first-order-recurrence-interleave-only.ll
    M llvm/test/Transforms/LoopVectorize/first-order-recurrence-multiply-recurrences.ll
    M llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll
    M llvm/test/Transforms/LoopVectorize/float-induction.ll
    M llvm/test/Transforms/LoopVectorize/float-minmax-instruction-flag.ll
    M llvm/test/Transforms/LoopVectorize/forked-pointers.ll
    M llvm/test/Transforms/LoopVectorize/if-pred-non-void.ll
    M llvm/test/Transforms/LoopVectorize/if-pred-stores.ll
    M llvm/test/Transforms/LoopVectorize/induction-multiple-uses-in-same-instruction.ll
    M llvm/test/Transforms/LoopVectorize/induction-step.ll
    M llvm/test/Transforms/LoopVectorize/induction.ll
    M llvm/test/Transforms/LoopVectorize/instruction-only-used-outside-of-loop.ll
    M llvm/test/Transforms/LoopVectorize/interleave-and-scalarize-only.ll
    M llvm/test/Transforms/LoopVectorize/interleave-with-i65-induction.ll
    M llvm/test/Transforms/LoopVectorize/interleaved-accesses-different-insert-position.ll
    M llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll
    M llvm/test/Transforms/LoopVectorize/invalidate-scev-at-scope-after-vectorization.ll
    M llvm/test/Transforms/LoopVectorize/is_fpclass.ll
    M llvm/test/Transforms/LoopVectorize/iv-select-cmp-decreasing.ll
    M llvm/test/Transforms/LoopVectorize/iv-select-cmp-trunc.ll
    M llvm/test/Transforms/LoopVectorize/iv-select-cmp.ll
    M llvm/test/Transforms/LoopVectorize/iv_outside_user.ll
    M llvm/test/Transforms/LoopVectorize/lcssa-crashes.ll
    M llvm/test/Transforms/LoopVectorize/load-deref-pred-align.ll
    M llvm/test/Transforms/LoopVectorize/load-deref-pred-neg-off.ll
    M llvm/test/Transforms/LoopVectorize/load-deref-pred-poison-ub-ops-feeding-pointer.ll
    M llvm/test/Transforms/LoopVectorize/load-of-struct-deref-pred.ll
    M llvm/test/Transforms/LoopVectorize/make-followup-loop-id.ll
    M llvm/test/Transforms/LoopVectorize/metadata.ll
    M llvm/test/Transforms/LoopVectorize/minimumnum-maximumnum-reductions.ll
    M llvm/test/Transforms/LoopVectorize/multiple-address-spaces.ll
    M llvm/test/Transforms/LoopVectorize/narrow-to-single-scalar.ll
    M llvm/test/Transforms/LoopVectorize/no_outside_user.ll
    M llvm/test/Transforms/LoopVectorize/optsize.ll
    M llvm/test/Transforms/LoopVectorize/phi-cost.ll
    M llvm/test/Transforms/LoopVectorize/pointer-induction-index-width-smaller-than-iv-width.ll
    M llvm/test/Transforms/LoopVectorize/pointer-induction-unroll.ll
    M llvm/test/Transforms/LoopVectorize/pointer-induction.ll
    M llvm/test/Transforms/LoopVectorize/pr39417-optsize-scevchecks.ll
    M llvm/test/Transforms/LoopVectorize/pr44488-predication.ll
    M llvm/test/Transforms/LoopVectorize/pr45679-fold-tail-by-masking.ll
    M llvm/test/Transforms/LoopVectorize/pr47343-expander-lcssa-after-cfg-update.ll
    M llvm/test/Transforms/LoopVectorize/pr50686.ll
    M llvm/test/Transforms/LoopVectorize/pr55167-fold-tail-live-out.ll
    M llvm/test/Transforms/LoopVectorize/pr58811-scev-expansion.ll
    M llvm/test/Transforms/LoopVectorize/pr66616.ll
    M llvm/test/Transforms/LoopVectorize/predicate-switch.ll
    M llvm/test/Transforms/LoopVectorize/reduction-inloop-min-max.ll
    M llvm/test/Transforms/LoopVectorize/reduction-inloop-pred.ll
    M llvm/test/Transforms/LoopVectorize/reduction-inloop-uf4.ll
    M llvm/test/Transforms/LoopVectorize/reduction-inloop.ll
    M llvm/test/Transforms/LoopVectorize/reduction-with-invariant-store.ll
    M llvm/test/Transforms/LoopVectorize/reduction.ll
    M llvm/test/Transforms/LoopVectorize/remarks-reduction-inloop.ll
    M llvm/test/Transforms/LoopVectorize/reuse-lcssa-phi-scev-expansion.ll
    M llvm/test/Transforms/LoopVectorize/reverse_induction.ll
    M llvm/test/Transforms/LoopVectorize/runtime-check.ll
    M llvm/test/Transforms/LoopVectorize/runtime-checks-difference-simplifications.ll
    M llvm/test/Transforms/LoopVectorize/runtime-checks-hoist.ll
    M llvm/test/Transforms/LoopVectorize/scev-exit-phi-invalidation.ll
    M llvm/test/Transforms/LoopVectorize/scev-predicate-reasoning.ll
    M llvm/test/Transforms/LoopVectorize/select-neg-cond.ll
    M llvm/test/Transforms/LoopVectorize/select-reduction-start-value-may-be-undef-or-poison.ll
    M llvm/test/Transforms/LoopVectorize/single-early-exit-interleave-hint.ll
    M llvm/test/Transforms/LoopVectorize/single-early-exit-interleave.ll
    M llvm/test/Transforms/LoopVectorize/single-value-blend-phis.ll
    M llvm/test/Transforms/LoopVectorize/single_early_exit.ll
    M llvm/test/Transforms/LoopVectorize/single_early_exit_live_outs.ll
    M llvm/test/Transforms/LoopVectorize/single_early_exit_with_outer_loop.ll
    M llvm/test/Transforms/LoopVectorize/strided-accesses-interleave-only.ll
    M llvm/test/Transforms/LoopVectorize/tail-folding-optimize-vector-induction-width.ll
    M llvm/test/Transforms/LoopVectorize/trunc-extended-icmps.ll
    M llvm/test/Transforms/LoopVectorize/trunc-loads-p16.ll
    M llvm/test/Transforms/LoopVectorize/trunc-reductions.ll
    M llvm/test/Transforms/LoopVectorize/trunc-shifts.ll
    M llvm/test/Transforms/LoopVectorize/uitofp-preserve-nneg.ll
    M llvm/test/Transforms/LoopVectorize/uniform-blend.ll
    M llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction1.ll
    M llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction1_and.ll
    M llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction1_div_urem.ll
    M llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction1_lshr.ll
    M llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction2.ll
    M llvm/test/Transforms/LoopVectorize/unused-blend-mask-for-first-operand.ll
    M llvm/test/Transforms/LoopVectorize/vector-loop-backedge-elimination-early-exit.ll
    M llvm/test/Transforms/LoopVectorize/vector-loop-backedge-elimination-outside-iv-users.ll
    M llvm/test/Transforms/LoopVectorize/version-stride-with-integer-casts.ll
    M llvm/test/Transforms/LoopVectorize/widen-gep-all-indices-invariant.ll
    M llvm/test/Transforms/LoopVectorize/widen-intrinsic.ll

  Log Message:
  -----------
  [VPlan] Materialize constant vector trip counts before final opts. (#142309)

Materialize constant vector trip counts before ::execute, if the trip
count can be computed as (Original TC / (VF * UF)) * (VF * UF). For now
this excludes cases where the tail is folded or a scalar epilogue is
required.

This enables removing a number of redundant branches from the middle
block.
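
As a rough illustration of the arithmetic (not the code from this commit),
the sketch below folds the vector trip count at compile time for a constant
original trip count. The helper names vectorTripCount and
scalarRemainderNeeded are hypothetical, and the sketch assumes that the
middle block decides whether to run the scalar remainder loop by comparing
the original trip count against the vector trip count. Under that
assumption, once both values are constants the comparison is a constant
too, which is why the branch can be removed.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): the vector trip count is the
// original trip count TC rounded down to a multiple of VF * UF.
constexpr std::uint64_t vectorTripCount(std::uint64_t TC, std::uint64_t VF,
                                        std::uint64_t UF) {
  return (TC / (VF * UF)) * (VF * UF);
}

// Hypothetical helper: models the middle-block check as "original trip
// count != vector trip count". This is an assumption about the check, made
// only for illustration.
constexpr bool scalarRemainderNeeded(std::uint64_t TC, std::uint64_t VF,
                                     std::uint64_t UF) {
  return TC != vectorTripCount(TC, VF, UF);
}

// TC = 100, VF = 4, UF = 2: the vector loop covers 96 iterations and the
// remaining 4 run in the scalar loop, so the check is known to be true.
static_assert(vectorTripCount(100, 4, 2) == 96, "100 rounds down to 96");
static_assert(scalarRemainderNeeded(100, 4, 2), "remainder of 4 iterations");

// TC = 1024, VF = 4, UF = 2: the trip count divides evenly, so the check is
// known to be false and the branch to the scalar loop is dead.
static_assert(vectorTripCount(1024, 4, 2) == 1024, "1024 divides evenly");
static_assert(!scalarRemainderNeeded(1024, 4, 2), "no remainder");
```

In both cases the outcome is decided before any code runs, which mirrors
how a materialized constant trip count lets later simplifications drop the
corresponding branch.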

For now this is also only done when not vectorizing the epilogue, as the
simplification complicates stitching the two plans together.

PR: https://github.com/llvm/llvm-project/pull/142309


