[PATCH] D39415: [ARMISelLowering] Better handling of NEON load/store for sequential memory regions
Renato Golin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 3 05:18:37 PDT 2017
rengolin added a comment.
In https://reviews.llvm.org/D39415#914992, @eastig wrote:
> I've got first results of benchmark runs: the LNT test suite + a private benchmark 01. I used the latest patch.
> The configuration is a Juno board Cortex-A57/A53, v8-a, AArch32, Thumb2.
> Options: -O3 -mcpu=cortex-a57 -mthumb -fomit-frame-pointer
> The runs passed without errors.
Thanks Evgeny.
> Improvements:
>
> | MultiSource/Applications/JM/lencod/lencod | 2.57% |
> | Private benchmark 01 | 9.5% |
>
> Regressions:
>
> | SingleSource/Benchmarks/Misc/salsa20 | 3.41% |
These numbers are not very convincing. I guess efficient pipelining makes most of the difference wash away.
Maybe older ARM cores, like the Cortex-A8, would see a bigger improvement?
What about compile time? And code size?
https://reviews.llvm.org/D39415