[llvm] [AArch64][SVE] Enable max vector bandwidth for SVE (PR #109671)

Wed Aug 6 05:08:09 PDT 2025

huntergr-arm wrote:

Hi @ytmukai,

Thanks for investigating and writing it up. We came to much the same conclusions about unpacking, and have been slowly improving the cost modeling over the last year.

Much of that work has been focused on operations like sdot and udot, where we can do the extensions as part of the work and avoid the unpacks entirely. In other cases we've just had to raise the cost where we think unpacks are likely to be generated.
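For readers following along, a loop of roughly this shape is the kind of pattern meant here: narrow inputs are extended, multiplied, and accumulated into a wider type. This is only an illustrative sketch; the function name and signature are hypothetical, and whether it actually lowers to sdot depends on the target, flags, and cost model.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical example, not taken from the PR: an i8 x i8 -> i32 dot product.
std::int32_t dot_i8(const std::int8_t *a, const std::int8_t *b, std::size_t n) {
  std::int32_t acc = 0;
  for (std::size_t i = 0; i < n; ++i) {
    // Both operands are sign-extended from i8 to i32 before the multiply.
    // A dot-product instruction can fold these extends into the
    // multiply-accumulate, so no separate unpack step is required.
    acc += static_cast<std::int32_t>(a[i]) * static_cast<std::int32_t>(b[i]);
  }
  return acc;
}
```

Using unsigned 8-bit inputs instead gives the corresponding udot-style pattern.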

I think we've found ways to address all the regressions seen on neoverse-v2, at least for the limited set of benchmarks we run. The last of those improvements use VPExpressionRecipes to help cost VPPartialReductions together with the associated extends and any intermediate binary operation (e.g. #147255).

Hopefully we'll be in a position to try re-enabling max bandwidth for SVE soon, at least on some cores. I'd like to have it on by default for several months to find remaining edge cases in the cost model before the release of LLVM 22 next year.

https://github.com/llvm/llvm-project/pull/109671