[PATCH] D39415: [ARMISelLowering] Better handling of NEON load/store for sequential memory regions
Renato Golin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 3 08:26:50 PDT 2017
rengolin added a comment.
In https://reviews.llvm.org/D39415#915151, @eastig wrote:
> Actually it had but not many.
> There are some other regressions/improvements. I have not listed them because the hottest code is not changed.
Right, and the fact that TSVC appears among both the improvements and the regressions is a hint that there are other factors at play.
> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt has 1.86% execution time regression and 41.88% code size improvement.
> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt has 1.48% execution time improvement and 39.61% code size improvement.
> MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt has 1.26% execution time improvements and 41.61% code size improvement.
That's pretty consistent; again, it's probably the same code. But at least you didn't have regressions in non-affected code, which means the early exits are working as expected.
> What about running on Cortex-A53? Its pipeline is in-order.
I don't think out-of-order vs. in-order will make a big difference (maybe just different noise). Because this is an ALU vs. load/vector trade-off, the pipelining will have a much bigger impact than the dispatcher.
https://reviews.llvm.org/D39415