[llvm] [LV] Enable considering higher VFs when data extend ops are present i… (PR #137593)

Sushant Gokhale via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 29 04:24:36 PDT 2025


sushgokh wrote:

> Could you explain what the performance difference was and why it led to improvements? What did the two versions of the assembly look like? Thanks

For the benchmark, the cost model currently selects `VF=1`. However, we know that the code is more profitable with `VF=vscale x 4`; the difference is ~14%.

Code that is somewhat similar to the benchmark can be found here: https://godbolt.org/z/Wfjhb8PPT
This also gets scalarized, but with `vscale x 16` it is more profitable, as measured with llvm-mca.
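
For readers without the link open, here is a rough sketch of the kind of loop being discussed. It is not the benchmark and not the Godbolt snippet; the function name `sum_extended` and the element types are assumptions chosen only for illustration. Each narrow element is extended to a wider type before use, so a VF derived from the widest type in the loop (i32 here) would be smaller than one derived from the narrowest type (i8 here).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical example, not taken from the benchmark or the Godbolt link.
// Each 8-bit element is zero-extended to 32 bits before being accumulated,
// so the loop mixes a narrow load type (i8) with a wider arithmetic type (i32).
uint32_t sum_extended(const uint8_t *data, size_t n) {
  uint32_t sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum += static_cast<uint32_t>(data[i]); // data extend: i8 -> i32
  return sum;
}
```

Whether a loop like this is actually vectorized, and with which VF, depends on the target and on the cost model, so treat it only as a shape to experiment with.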

https://github.com/llvm/llvm-project/pull/137593

