[PATCH] D39415: [ARMISelLowering] Better handling of NEON load/store for sequential memory regions
Renato Golin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 31 02:00:58 PDT 2017
rengolin added a comment.
In https://reviews.llvm.org/D39415#911554, @evgeny777 wrote:
> With patch:
>
> real 1m40.086s
>
>
> W/o patch:
>
> real 1m40.619s
>
@evgeny777, while this is encouraging, you're still only running the one case where you know this is good.
> Those measurements were done with this bash script:
>
> #!/bin/bash
> LLC=/data/llvm/build_ninja_Release/bin/llc
> for ((i=1;i<=10000;i++)); do
> $LLC -mtriple=arm-eabi -float-abi=soft -mattr=+neon mat_mul_4x4.ll
> done
This is very far from ideal, as kernel caching, bash and whatever running on your current system could be affecting your timing more than the difference itself.
> An interesting fact is that execution of patched llc is stably slightly less than
> that of non-patched version (both were run 3 times in a row). Not sure what the reason
> is (may be less number of SD nodes after DAGCombine). My machine specs are:
> Core-i5 2500K, 8GB RAM Ubuntu 16.04
Honestly, I don't think your results carry any statistical significance, the variability of the runs could be due to too many factors. I wouldn't worry about those fluctuations right now.
> With patch:
>
> MI scheduler: 2549066 usec
> SD scheduler: 2647092 usec
>
>
> W/o patch:
>
> MI scheduler: 3039261 usec
> SD scheduler: 2843175 usec
>
>
> We're using MI scheduler model added in https://reviews.llvm.org/D28152. With SD scheduler improvement is smaller, but still
> noticable.
Again, this is the case that was tailored and reduced, so of course, the numbers will look very inflated.
What ARM machine have you ran these? This makes a huge difference. From your comments, I'm guessing this is ARMv8, which is by far *not* the common case for AArch32.
Until you can show results on real benchmarks on at least one core for both compile and run times, there isn't much else we can do.
Once that's available, I'd encourage other people to do the same on their cores, as well as have a deeper look into the code.
>From @javed.absar's comment, I'm not the only one finding this patch non-intuitive, even with the good amount of comments.
cheers,
--renato
Repository:
rL LLVM
https://reviews.llvm.org/D39415
More information about the llvm-commits
mailing list