[PATCH] [AArch64] Inline memcpy() as a sequence of ldp-stp with 64-bit registers

Sergey Dmitrouk sdmitrouk at accesssoftek.com
Fri Oct 31 07:25:14 PDT 2014


Hi jmolloy, t.p.northover, Jiangning,

Here is the result of my attempts to inline `memcpy()` in an "optimal" way, meaning as interleaved load/store pair instructions that use 64-bit registers.
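To make the goal concrete, here is a minimal sketch of the kind of call this is about. `Block` and `copyBlock` are hypothetical names used only for illustration, and the instruction sequence in the comment shows the desired shape, not actual compiler output; register numbers are arbitrary.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 32-byte payload, used only for illustration.
struct Block {
    std::uint64_t words[4];
};

void copyBlock(Block *dst, const Block *src) {
    // The size is a compile-time constant, so the backend may expand the
    // call inline. The interleaved shape aimed at would look roughly like
    // this (register choice is illustrative):
    //   ldp x8,  x9,  [x1]
    //   stp x8,  x9,  [x0]
    //   ldp x10, x11, [x1, #16]
    //   stp x10, x11, [x0, #16]
    std::memcpy(dst, src, sizeof(Block));
}
```

Each `ldp` is immediately followed by the `stp` that stores the same pair of 64-bit registers, instead of all loads being grouped ahead of all stores.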

It was suggested to do this in the `AArch64LoadStoreOptimizer` pass, which worked until the PostRA Machine Instruction Scheduler was enabled for the AArch64 target. The transformation therefore became a separate pass that runs after PostRA MISched. The pass is disabled by default, and the test changes make the tests pass both with and without it.

When an ldr/str pair sits in the middle of the sequence, it is reordered as well, except for cases like:
```
ldr
ldp
stp
str
```
which occur only when copying small amounts of data, and I'm not sure if it's worth reordering them to
```
ldr
str
ldp
stp
```
but that can be done.
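The reordering discussed above can be sketched with a toy model. This is not the code from the patch: `Op`, `isLoad` and `interleave` are hypothetical names, instructions are reduced to a mnemonic plus an id for the data register, and the sketch assumes that every store has a matching load in the same sequence and that the copied regions do not overlap.

```cpp
#include <string>
#include <vector>

// Toy model of one memory instruction; this is not LLVM's MachineInstr.
struct Op {
    std::string mnemonic; // "ldr", "ldp", "str" or "stp"
    int reg;              // hypothetical id of the (first) data register
};

static bool isLoad(const Op &op) {
    return op.mnemonic == "ldr" || op.mnemonic == "ldp";
}

// Emit each load followed directly by the store(s) that use its register.
// Assumption: source and destination do not overlap, so moving a store
// ahead of later loads is safe. A real pass would have to prove that.
// Assumption: every store's register is defined by a load in `ops`;
// stores without a matching load are not emitted by this sketch.
std::vector<Op> interleave(const std::vector<Op> &ops) {
    std::vector<Op> out;
    for (const Op &load : ops) {
        if (!isLoad(load))
            continue;
        out.push_back(load);
        for (const Op &store : ops)
            if (!isLoad(store) && store.reg == load.reg)
                out.push_back(store);
    }
    return out;
}
```

Fed the `ldr, ldp, stp, str` sequence, with the `ldr`/`str` sharing one register id and the `ldp`/`stp` sharing another, this model returns `ldr, str, ldp, stp`.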

Unfortunately, I don't have AArch64 hardware to run performance tests yet, so I can't back this up with numbers, but such a sequence was claimed to be preferred. At least this gives a way to test it. Or it can just sit here for now.

http://reviews.llvm.org/D6054

Files:
  lib/Target/AArch64/AArch64.h
  lib/Target/AArch64/AArch64ISelLowering.cpp
  lib/Target/AArch64/AArch64LoadStoreInterleave.cpp
  lib/Target/AArch64/AArch64TargetMachine.cpp
  lib/Target/AArch64/CMakeLists.txt
  test/CodeGen/AArch64/arm64-variadic-aapcs.ll
  test/CodeGen/AArch64/arm64-virtual_base.ll
  test/CodeGen/AArch64/func-calls.ll
  test/CodeGen/AArch64/memcpy-f128.ll
  test/CodeGen/AArch64/optimal-load-store-pairs.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D6054.15614.patch
Type: text/x-patch
Size: 20658 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141031/0b789ab3/attachment.bin>

