[PATCH] D39415: [ARMISelLowering] Better handling of NEON load/store for sequential memory regions

Eugene Leviant via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 31 05:34:47 PDT 2017


evgeny777 added a comment.

@rengolin

> What ARM machine have you ran these?

This is NDA-covered and I'm not going to share specs. My testing abilities are limited, but here are a few more
devices you should be familiar with:

**Samsung Galaxy Nexus (TI OMAP 4460, dual-core Cortex-A9, 1.2 GHz):**
With patch:

  4x3 Matrix multiplication: 7436829 usec
  4x4 Matrix multiplication: 8205384 usec

W/o patch:

  4x3 Matrix multiplication: 7797547 usec
  4x4 Matrix multiplication: 8580291 usec

**Huawei Honor 8 (Cortex-A72, 2.3 GHz):**
With patch:

  4x3 Matrix multiplication: 1020747 usec
  4x4 Matrix multiplication: 1130021 usec

W/o patch:

  4x3 Matrix multiplication: 1193994 usec
  4x4 Matrix multiplication: 1290402 usec
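
For reference, the relative improvement implied by these timings can be worked out with a few lines of arithmetic. This is only a sketch: the helper name `reduction_percent` and the `timings` table layout are made up for illustration, and the numbers are copied from the runs listed above.

```python
def reduction_percent(without_patch_usec, with_patch_usec):
    """Share of run time saved by the patch, relative to the unpatched run."""
    return 100.0 * (without_patch_usec - with_patch_usec) / without_patch_usec


# (device, benchmark) -> (without patch, with patch), both in microseconds.
timings = {
    ("Cortex-A9", "4x3"): (7797547, 7436829),
    ("Cortex-A9", "4x4"): (8580291, 8205384),
    ("Cortex-A72", "4x3"): (1193994, 1020747),
    ("Cortex-A72", "4x4"): (1290402, 1130021),
}

for (device, bench), (without, with_patch) in timings.items():
    saved = reduction_percent(without, with_patch)
    print(f"{device}, {bench} matrix multiplication: {saved:.1f}% less time")
```

With these inputs the saving comes out at roughly 4 to 5% on the Cortex-A9 and 12 to 15% on the Cortex-A72.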

> From @javed.absar's comment, I'm not the only one finding this patch non-intuitive, even with the good amount of comments.

I'm preparing an algorithm description, so I suggest we return to this discussion later.


Repository:
  rL LLVM

https://reviews.llvm.org/D39415




