[llvm] [RISCV] Allow hoisting VXRM writes out of loops speculatively (PR #110044)

Craig Topper via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 25 15:06:38 PDT 2024


topperc wrote:

> So just to chime in since I suggested Felipe tackle this problem.
> 
> As Felipe noted some designs flush their pipelines on a VXRM write which can be a significant performance issue. I measured a 2-3% improvement with a patch tackling this problem for GCC running spec2017's x264 workloads on the BPI-F3 board. That all comes from speculative hoisting of VXRM assignments needed to use vaaddu to implement the ceiling average found in pixel_avg.

Do you have data for not using vaaddu in that loop and using a vwadd.vv, a vadd.vi, and a vnsrl instead, so that no VXRM write is needed?
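To make the comparison concrete, here is a rough scalar sketch of the two ways to compute the ceiling average. It assumes unsigned 8-bit pixels and that the averaging add runs with VXRM set to round-to-nearest-up (rnu). The function names are hypothetical and only model the arithmetic per element; they are not the actual x264 or LLVM code.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of an averaging add under round-to-nearest-up (rnu):
   form the 9-bit sum, shift right by one, then add back the bit
   that was shifted out. Assumes unsigned 8-bit inputs. */
static uint8_t avg_fixed_point_rnu(uint8_t a, uint8_t b) {
    uint16_t sum = (uint16_t)(a + b);
    return (uint8_t)((sum >> 1) + (sum & 1u));
}

/* Scalar model of the alternative that needs no rounding-mode state:
   widening add, add immediate 1, narrowing shift right by 1. */
static uint8_t avg_widen_add_shift(uint8_t a, uint8_t b) {
    uint16_t wide = (uint16_t)(a + b);   /* widening add */
    wide = (uint16_t)(wide + 1u);        /* add immediate 1 */
    return (uint8_t)(wide >> 1);         /* narrowing shift by 1 */
}

/* Exhaustively confirm both models agree for every 8-bit input pair. */
static void check_all_pairs(void) {
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            assert(avg_fixed_point_rnu((uint8_t)a, (uint8_t)b) ==
                   avg_widen_add_shift((uint8_t)a, (uint8_t)b));
        }
    }
}
```

Both models give the same result for every input pair, so the choice between them is about cost only: one instruction plus a VXRM write versus three instructions on a widened element type.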

SiFive's p470 and p670 don't optimize VXRM writes well either. Even with the writes hoisted out of the loop nest, I couldn't get the vaaddu to be profitable.

https://github.com/llvm/llvm-project/pull/110044

