[PATCH] D39415: [ARMISelLowering] Better handling of NEON load/store for sequential memory regions
Eugene Leviant via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 31 01:45:22 PDT 2017
evgeny777 added a comment.
@rengolin
1. Compilation times
With patch:
real 1m40.086s
user 0m57.040s
sys 0m27.444s
Without patch:
real 1m40.619s
user 0m58.120s
sys 0m25.944s
Those measurements were done with this bash script:
#!/bin/bash
# Run llc 10000 times on the same input to get a measurable total time.
LLC=/data/llvm/build_ninja_Release/bin/llc
for ((i=1;i<=10000;i++)); do
  "$LLC" -mtriple=arm-eabi -float-abi=soft -mattr=+neon mat_mul_4x4.ll
done
Interestingly, the execution time of the patched llc is consistently slightly lower than
that of the unpatched version (both were run 3 times in a row). I'm not sure what the reason
is (maybe there are fewer SD nodes after DAGCombine). My machine specs are:
Core i5-2500K, 8GB RAM, Ubuntu 16.04
2. Execution times of matrix multiplication example (ARMv8, 32-bit) on ARM Cortex A57, 2GHz:
With patch:
MI scheduler: 2549066 usec
SD scheduler: 2647092 usec
Without patch:
MI scheduler: 3039261 usec
SD scheduler: 2843175 usec
We're using the MI scheduler model added in https://reviews.llvm.org/D28152. With the SD scheduler
the improvement is smaller, but still noticeable.
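For reference, the relative improvement implied by the execution times above can be worked out with a small sketch like the one below. The function name pct_faster is made up for illustration, and the script assumes awk is available; the four inputs are the usec figures quoted in this comment.

```shell
#!/bin/bash
# pct_faster BASELINE PATCHED
# Prints by how many percent PATCHED is lower than BASELINE.
# (Hypothetical helper, not part of the patch or of LLVM.)
pct_faster() {
  awk -v base="$1" -v patched="$2" \
    'BEGIN { printf "%.2f\n", (base - patched) / base * 100 }'
}

# Execution times in usec, taken from the measurements above.
echo "MI scheduler: $(pct_faster 3039261 2549066)%"   # prints 16.13%
echo "SD scheduler: $(pct_faster 2843175 2647092)%"   # prints 6.90%
```

So the patched build is roughly 16% faster with the MI scheduler and roughly 7% faster with the SD scheduler on this one example.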
Repository:
rL LLVM
https://reviews.llvm.org/D39415