[PATCH] D39415: [ARMISelLowering] Better handling of NEON load/store for sequential memory regions
Eugene Leviant via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 31 05:34:47 PDT 2017
evgeny777 added a comment.
@rengolin
> What ARM machine have you ran these?
This is NDA-covered and I'm not going to share specs. My testing abilities are limited, but here are a few more
devices you should be familiar with:
**Samsung Galaxy Nexus (TI OMAP 4460, dual-core Cortex-A9, 1.2 GHz):**
With patch:
4x3 Matrix multiplication: 7436829 usec
4x4 Matrix multiplication: 8205384 usec
W/o patch:
4x3 Matrix multiplication: 7797547 usec
4x4 Matrix multiplication: 8580291 usec
**Huawei Honor 8 (Cortex-A72, 2.3 GHz):**
With patch:
4x3 Matrix multiplication: 1020747 usec
4x4 Matrix multiplication: 1130021 usec
W/o patch:
4x3 Matrix multiplication: 1193994 usec
4x4 Matrix multiplication: 1290402 usec
> From @javed.absar's comment, I'm not the only one finding this patch non-intuitive, even with the good amount of comments.
I'm preparing an algorithm description, so I suggest we return to this discussion later.
Repository:
rL LLVM
https://reviews.llvm.org/D39415