[llvm] [RISCV] Allow hoisting VXRM writes out of loops speculatively (PR #110044)
Craig Topper via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 25 15:06:38 PDT 2024
topperc wrote:
> So just to chime in since I suggested Felipe tackle this problem.
>
> As Felipe noted, some designs flush their pipelines on a VXRM write, which can be a significant performance issue. I measured a 2-3% improvement with a patch tackling this problem for GCC running spec2017's x264 workloads on the BPI-F3 board. That all comes from speculative hoisting of the VXRM assignments needed to use vaaddu to implement the ceiling average found in pixel_avg.
Do you have data for that loop without vaaddu, using a vwadd.vv, a vadd.vi, and a vnsrl instead, so that no VXRM write is needed?
SiFive's p470 and p670 don't optimize VXRM writes well either. Even with the writes hoisted out of the loop nest, I couldn't get the vaaddu to be profitable.
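
To make the two alternatives concrete, here is a rough scalar C model of the per-element arithmetic, not the actual x264 or LLVM code. The function names are made up for illustration, and it assumes the averaging add runs in round-to-nearest-up mode (the rounding mode a ceiling average needs, which is why VXRM has to be written at all). The second function mirrors the widening add, add of 1, and narrowing shift by 1, which never consults VXRM.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one element of an averaging add with
   round-to-nearest-up: shift the 9-bit sum right by one and add
   the bit that was shifted out as the rounding increment. */
static inline uint8_t avg_ceil_rnu(uint8_t a, uint8_t b) {
    uint16_t sum = (uint16_t)((uint16_t)a + (uint16_t)b);
    return (uint8_t)((sum >> 1) + (sum & 1));
}

/* Hypothetical model of the alternative: widen to 16 bits, add 1,
   then shift right by one and narrow. No rounding mode involved. */
static inline uint8_t avg_ceil_widen(uint8_t a, uint8_t b) {
    uint16_t wide = (uint16_t)((uint16_t)a + (uint16_t)b); /* widening add */
    wide = (uint16_t)(wide + 1);                           /* add immediate 1 */
    return (uint8_t)(wide >> 1);                           /* narrowing shift */
}

/* Exhaustively check that both formulations agree on every 8-bit pair. */
static inline void check_all_pairs(void) {
    for (int a = 0; a < 256; ++a) {
        for (int b = 0; b < 256; ++b) {
            assert(avg_ceil_rnu((uint8_t)a, (uint8_t)b) ==
                   avg_ceil_widen((uint8_t)a, (uint8_t)b));
        }
    }
}
```

Both compute (a + b + 1) >> 1 without overflow, so the choice between them is purely about cost: one instruction plus a VXRM write versus three instructions at a wider element width.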
https://github.com/llvm/llvm-project/pull/110044