[llvm] [LV] Add a flag to conservatively choose a larger vector factor when maximizing bandwidth (PR #156012)
David Sherwood via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 4 06:58:15 PDT 2025
david-arm wrote:
On neoverse-v2 it looks like the throughput is 3 for the loads, 4 for the fadd, and 2 for the fcvtzs and scvtf. I wonder if the legalisation cost should take throughput into account? This is just a thought, but perhaps you could achieve the same result by setting the cost a lot higher when legalisation requires more instructions than the throughput allows. Perhaps the current model of just multiplying the required number of instructions by a single-instruction cost is too simple?
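To make the suggestion concrete, here is a rough sketch of the two models side by side. It is not LLVM code: `simpleLegalisationCost`, `throughputAwareCost`, and the `Penalty` factor are hypothetical names and values chosen purely for illustration, and the throughput is assumed to be given as a whole number of instructions per cycle.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these are LLVM APIs.

// Current-style model: cost scales linearly with the number of
// instructions that legalisation produces.
inline uint64_t simpleLegalisationCost(uint64_t NumLegalInsts,
                                       uint64_t PerInstCost) {
  return NumLegalInsts * PerInstCost;
}

// Throughput-aware variant: instructions that fit within the per-cycle
// throughput keep the linear cost, while every instruction beyond it is
// charged an extra penalty. Penalty = 4 is an arbitrary assumed value.
inline uint64_t throughputAwareCost(uint64_t NumLegalInsts,
                                    uint64_t PerInstCost,
                                    uint64_t Throughput,
                                    uint64_t Penalty = 4) {
  uint64_t Base = NumLegalInsts * PerInstCost;
  // Treat an unknown (zero) throughput as "no extra penalty".
  if (Throughput == 0 || NumLegalInsts <= Throughput)
    return Base;
  uint64_t Excess = NumLegalInsts - Throughput;
  return Base + Excess * PerInstCost * Penalty;
}
```

With a throughput of 2 (as for the fcvtzs above), a conversion that legalises to 2 instructions keeps its linear cost, whereas one that legalises to 4 instructions is charged noticeably more than 4 times the single-instruction cost, which would steer the vectoriser away from that factor.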
This doesn't solve the problem for interleaving though, where the loop vectoriser may aggressively choose an interleave factor of 4 for neoverse-v1 without taking throughput into account.
https://github.com/llvm/llvm-project/pull/156012