[llvm] [LV] Add a flag to conservatively choose a larger vector factor when maximizing bandwidth (PR #156012)

Fri Sep 5 05:46:07 PDT 2025

ytmukai wrote:

While throughput is a key factor, I believe the fundamental approach should be to model per-pipeline load and account for bottlenecks, much like the instruction scheduler does. I initially considered enhancing the cost model this way, but the added complexity may not be justified, as the pipeline load balance often remains stable during vectorization. This led me to propose the more pragmatic, lightweight solution in this patch.
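
To make the per-pipeline idea concrete, here is a rough sketch of what a bottleneck-based comparison could look like. It is not part of this patch and does not use LLVM's cost model or scheduling model; the pipeline classes, `PipeLoad`, and all function names are hypothetical, and the per-pipeline cycle counts are assumed to be supplied by the caller.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical pipeline classes, for illustration only.
enum Pipe : std::size_t { Load = 0, Store, FpSimd, Integer, NumPipes };

// Cycles of work each pipeline class receives per vector-loop iteration.
using PipeLoad = std::array<double, NumPipes>;

// The busiest pipeline bounds the throughput of one iteration.
double bottleneckCycles(const PipeLoad &load) {
  return *std::max_element(load.begin(), load.end());
}

// Bottleneck cycles per scalar element at vectorization factor `vf`.
double costPerElement(const PipeLoad &load, unsigned vf) {
  assert(vf > 0 && "vectorization factor must be positive");
  return bottleneckCycles(load) / vf;
}

// True if the wider candidate has the lower per-element bottleneck cost.
bool prefersWider(const PipeLoad &narrow, unsigned narrowVF,
                  const PipeLoad &wide, unsigned wideVF) {
  return costPerElement(wide, wideVF) < costPerElement(narrow, narrowVF);
}
```

If the same pipeline stays the bottleneck for every candidate factor, this comparison mostly tracks that one pipeline's throughput, which is why a lighter-weight heuristic may be enough in practice.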

> This doesn't solve the problem for interleaving though, where the loop vectoriser may aggressively choose an interleave factor of 4 for neoverse-v1 without taking throughput into account.

For some goals of interleaving, like reducing loop overhead and parallelizing reductions, considering per-pipeline throughput would indeed be beneficial. If per-pipeline modeling can also solve such problems, perhaps the added complexity would be justified.

https://github.com/llvm/llvm-project/pull/156012