[llvm] [RISCV] Fold vector shift of sext/zext to widening multiply (PR #121563)

Fri Jan 3 03:49:22 PST 2025

https://github.com/pfusik commented:

In the absence of Zvbb `vwsll.vi`, it can be profitable to use a widening multiply instead of sign/zero extension followed by a left shift.
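The arithmetic behind this fold can be checked with a small scalar model. The sketch below is not code from the patch: the helper names are hypothetical, it models one i8 lane widened to i16, and it assumes the shift amount is small enough that `1 << k` still fits in the narrow (SEW-bit) unsigned multiplier operand.

```cpp
#include <cstdint>

// Hypothetical one-lane scalar models (i8 -> i16); not taken from the patch.

// Sign-extend, then shift left: models vsext.vf2 followed by vsll.vi.
constexpr uint16_t sextThenShl(int8_t x, unsigned k) {
  // Shift in 32-bit unsigned arithmetic, then truncate to 16 bits.
  return static_cast<uint16_t>(
      static_cast<uint32_t>(static_cast<uint16_t>(static_cast<int16_t>(x))) << k);
}

// Zero-extend, then shift left: models vzext.vf2 followed by vsll.vi.
constexpr uint16_t zextThenShl(uint8_t x, unsigned k) {
  return static_cast<uint16_t>(static_cast<uint32_t>(x) << k);
}

// Signed x unsigned widening multiply: models vwmulsu.vx.
constexpr uint16_t widenMulSU(int8_t x, uint8_t m) {
  return static_cast<uint16_t>(static_cast<int32_t>(x) * static_cast<int32_t>(m));
}

// Unsigned x unsigned widening multiply: models vwmulu.vx.
constexpr uint16_t widenMulU(uint8_t x, uint8_t m) {
  return static_cast<uint16_t>(static_cast<uint32_t>(x) * static_cast<uint32_t>(m));
}

// Exhaustively compare both forms for every i8 value and every shift
// amount k in [0, 7] (assumption: 1 << k must fit in 8 unsigned bits).
constexpr bool identityHolds() {
  for (int v = -128; v <= 127; ++v) {
    for (unsigned k = 0; k < 8; ++k) {
      const uint8_t m = static_cast<uint8_t>(1u << k);
      const int8_t s = static_cast<int8_t>(v);
      const uint8_t u = static_cast<uint8_t>(v & 0xFF);
      if (sextThenShl(s, k) != widenMulSU(s, m)) return false;
      if (zextThenShl(u, k) != widenMulU(u, m)) return false;
    }
  }
  return true;
}
```

The multiplier is treated as unsigned in both cases, which is why the sign-extended form pairs with the signed-by-unsigned multiply: a power of two such as 128 would be negative if read as a signed 8-bit value.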

This is the case on BPI-F3. Each of `vsext.vf2`, `vsll.vi` and `vwmul[s]u.vx` has a throughput of 2*LMUL cycles: https://camel-cdr.github.io/rvv-bench-results/bpi_f3/
I confirmed this transform improves some benchmarks on a BPI-F3 board.

Judging by https://camel-cdr.github.io/rvv-bench-results/canmv_k230/, it should also apply there.
I don't know whether this is profitable on other RVV CPUs. Please advise whether it should be restricted to certain CPUs and, if so, which ones.

https://github.com/llvm/llvm-project/pull/121563