[llvm] [RISCV] Allow hoisting VXRM writes out of loops speculatively (PR #110044)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 25 15:22:01 PDT 2024
JeffreyALaw wrote:
> > So just to chime in since I suggested Felipe tackle this problem.
> > As Felipe noted some designs flush their pipelines on a VXRM write which can be a significant performance issue. I measured a 2-3% improvement with a patch tackling this problem for GCC running spec2017's x264 workloads on the BPI-F3 board. That all comes from speculative hoisting of VXRM assignments needed to use vaaddu to implement the ceiling average found in pixel_avg.
>
> Do you have data for not using vaaddu in that loop and using a vwadd.vv, a vadd.vi, and a vnsrl instead? So that there's no VXRM write needed?
>
> SiFive's p470 and p670 don't optimize VXRM writes well either. Even with the writes hoisted out of the loop nest, I couldn't get the vaaddu to be profitable.
No. We largely set this aside as we adjusted the Ventana design to make VXRM assignments cheap. But it seemed a shame to leave things performing so badly on the BPI given we had bits to make it "not so bad".
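
For readers comparing the two sequences discussed above, here is a minimal scalar sketch of one 8-bit lane. It assumes the pixels are unsigned, that vaaddu runs with VXRM set to rnu (round-to-nearest-up), and that the widening alternative is an unsigned widening add, an add of 1, and a narrowing right shift by 1. The function names are hypothetical and are not taken from the patch or from x264. Under those assumptions both forms compute the ceiling average (a + b + 1) >> 1, so the only difference is whether a VXRM write is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one lane of vaaddu with VXRM = rnu (assumed mode).
 * The sum is formed without overflow, shifted right by 1, and the
 * shifted-out bit is added back as the rounding increment. */
uint8_t avg_vaaddu_rnu(uint8_t a, uint8_t b) {
    uint16_t v = (uint16_t)a + (uint16_t)b; /* full-precision sum */
    uint16_t r = v & 1u;                    /* rnu increment: bit 0 of v */
    return (uint8_t)((v >> 1) + r);
}

/* Scalar model of the alternative with no VXRM dependence:
 * widening add, add 1, then narrowing logical shift right by 1. */
uint8_t avg_widen_narrow(uint8_t a, uint8_t b) {
    uint16_t w = (uint16_t)a + (uint16_t)b; /* widening add to 16 bits */
    w = (uint16_t)(w + 1u);                 /* add the rounding constant */
    return (uint8_t)(w >> 1);               /* narrow back to 8 bits */
}

/* Exhaustively compare the two models over all 8-bit input pairs. */
int count_mismatches(void) {
    int mismatches = 0;
    for (int a = 0; a < 256; a++) {
        for (int b = 0; b < 256; b++) {
            if (avg_vaaddu_rnu((uint8_t)a, (uint8_t)b) !=
                avg_widen_narrow((uint8_t)a, (uint8_t)b)) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

This only models the arithmetic. It says nothing about which sequence is faster on a given core, which is the open question in this thread.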
https://github.com/llvm/llvm-project/pull/110044