[llvm] [RISCV][TTI] Reduce cost of a build_vector pattern (PR #108419)

Wed Sep 18 05:21:01 PDT 2024

https://github.com/lukel97 commented:

I'm seeing a 0.88% regression on 511.povray_r on the BPI F3 after applying this, and it's fairly reproducible (< 0.1% stddev). Looking through the codegen changes, we now seem to avoid partial vectorization in some places where, e.g., we previously exploded a vector to do a bunch of exp intrinsic calls, spilling the vector registers exactly as you described.

I would have thought that avoiding these vector spills would be the right thing to do, so I'm not sure why the scalar form is turning out to be slower. Do we need to discount build_vectors a bit more to get it to partially vectorize these parts again?

To clarify, I think the changes in this PR are the right thing to do; I just want to point out the interaction with SLP. I'm running the other benchmarks now to see if they're also affected.

https://github.com/llvm/llvm-project/pull/108419