[llvm] [RISCV] Use vsetvli instead of vlenb in Prologue/Epilogue (PR #113756)

Tue Oct 29 17:48:53 PDT 2024

kito-cheng wrote:

I considered adding an option or target feature, but in the end I decided not to include one in the patch. I know that `csrr vlenb` and `vsetvli` perform differently across microarchitectures. However, since the replacement is limited to the prologue and epilogue, I don't expect a significant performance difference. There are generally still a few instructions after the `vsetvli`, such as stack pointer adjustments, so it shouldn't interfere much with, or block, other `vsetvli` instructions. Additionally, as the test diffs show, this approach gives a net code size reduction.

So why do we still replace `VLEN * 1` with `vsetvli`? Two reasons from me: 1) consistency, and 2) it might be further optimized with `TII->mulImm`.
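For anyone following along, here is a minimal sketch of the arithmetic behind the replacement. It assumes the standard RVV relation `VLMAX = LMUL * VLEN / SEW`, so with `SEW = 8` a `vsetvli` that requests VLMAX returns `vlenb * LMUL` in one instruction. The function names are hypothetical and this is only a Python model of the values, not code from the patch; the assembly in the comments is illustrative rather than the exact sequences the patch emits.

```python
# Model of the values involved; helper names are hypothetical.

def vlenb(vlen_bits: int) -> int:
    """Value read by `csrr a0, vlenb`: the vector register length in bytes."""
    return vlen_bits // 8

def vsetvli_vlmax(vlen_bits: int, sew_bits: int, lmul: int) -> int:
    """Value written to rd by `vsetvli rd, zero, e<SEW>, m<LMUL>, ta, ma`,
    i.e. VLMAX = LMUL * VLEN / SEW (integer LMUL only in this sketch)."""
    return lmul * vlen_bits // sew_bits

# Illustrative comparison for a scaled size of 2 * vlenb:
#   csrr a0, vlenb
#   slli a0, a0, 1
# versus
#   vsetvli a0, zero, e8, m2, ta, ma
for vlen in (128, 256, 512):
    for lmul in (1, 2, 4, 8):
        assert vsetvli_vlmax(vlen, 8, lmul) == vlenb(vlen) * lmul
```

With `LMUL = 1` the two forms are the same length, which is why that case comes down to consistency rather than size. Note that `vsetvli` also writes `vl` and `vtype`, which `csrr` does not.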

XiangShan optimized `csrr vlenb`, while Banana Pi and some SiFive cores did not. However, I haven't seen any concrete case where this transformation causes a regression; the only effect I've observed is the code size reduction.

https://github.com/llvm/llvm-project/pull/113756