[PATCH] D39415: [ARMISelLowering] Better handling of NEON load/store for sequential memory regions

Eugene Leviant via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 30 05:42:03 PDT 2017


evgeny777 created this revision.
Herald added subscribers: kristof.beyls, eraman, rengolin, aemerson.

Consider sample code that copies a 4x4 matrix row by row (see cascade-vld-vst.ll). The current revision generates the following code (AArch32):

  mov r2, #48
  mov r3, r0
  vld1.32 {d16, d17}, [r3], r2
  vld1.64 {d18, d19}, [r3]
  add r3, r0, #32
  add r0, r0, #16
  vld1.64 {d22, d23}, [r0]
  add r0, r1, #16
  vld1.64 {d20, d21}, [r3]
  vst1.64 {d22, d23}, [r0]
  add r0, r1, #32
  vst1.64 {d20, d21}, [r0]
  vst1.32 {d16, d17}, [r1], r2
  vst1.64 {d18, d19}, [r1]
  mov pc, lr

After this patch is applied:

  vld1.32 {d16, d17}, [r0]!
  vld1.32 {d18, d19}, [r0]!
  vld1.32 {d20, d21}, [r0]!
  vld1.64 {d22, d23}, [r0]
  vst1.32 {d16, d17}, [r1]!
  vst1.32 {d18, d19}, [r1]!
  vst1.32 {d20, d21}, [r1]!
  vst1.64 {d22, d23}, [r1]
  mov pc, lr
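
For orientation, the operation behind both listings can be sketched at source level. This is only an illustration: the actual test case (cascade-vld-vst.ll) is LLVM IR, and the element type (32-bit float), the row-major layout and the name copyMatrix4x4 are assumptions, not taken from the patch.

```cpp
#include <cstddef>

// Hypothetical source-level equivalent of the row-by-row copy.
// Assumption: a 4x4 matrix of 32-bit floats in row-major order, so each
// row occupies 16 bytes, i.e. one 128-bit NEON register pair such as
// {d16, d17}.
void copyMatrix4x4(const float *Src, float *Dst) {
  for (std::size_t Row = 0; Row < 4; ++Row)
    for (std::size_t Col = 0; Col < 4; ++Col)
      Dst[Row * 4 + Col] = Src[Row * 4 + Col];
}
```

Each row sits 16 bytes after the previous one, which is why a post-increment by the access size ("[r0]!") is enough to step from one row to the next.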

The patch also speeds up our matrix multiplication function by 15%. Some of the existing LLVM test cases now have approximately 25% fewer instructions than before.

The improvement is based on two major changes to CombineBaseUpdate:

1. When selecting an address increment instruction to fold, we prefer one whose increment equals the access size of the load/store.
2. If no such address increment is bound to the address operand of the current load/store instruction, we navigate up the SelectionDAG chain and try to borrow an address increment instruction bound to the address operand of a parent VST{X}_UPD or VLD{X}_UPD that we processed earlier.
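
The two rules can be illustrated with a small toy model. This is a sketch only and not the code from the patch: it uses no LLVM types, and the Access struct, the function names and the fallback to the first available increment are assumptions made for illustration. Every address is modelled as "base + constant offset".

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Toy model (hypothetical, not LLVM code) of one NEON load/store.
struct Access {
  int64_t Offset;            // byte offset of the accessed address from the base
  int64_t Size;              // number of bytes loaded or stored
  std::vector<int64_t> Incs; // constants of ADD nodes using this access's address operand
  int Parent;                // index of the earlier access on the chain, -1 if none
};

// Rule 2: walk up the chain and look for an increment bound to a parent's
// address operand that, rebased to the current address, equals the
// current access size.
std::optional<int64_t> borrowFromParents(const std::vector<Access> &Chain,
                                         int Idx) {
  const Access &Cur = Chain[Idx];
  for (int P = Cur.Parent; P >= 0; P = Chain[P].Parent) {
    const Access &Par = Chain[P];
    int64_t Delta = Cur.Offset - Par.Offset; // distance between the two addresses
    for (int64_t Inc : Par.Incs)
      if (Inc - Delta == Cur.Size)
        return Cur.Size;
  }
  return std::nullopt;
}

// Returns the post-increment to fold into access Idx, if any.
std::optional<int64_t> pickIncrement(const std::vector<Access> &Chain,
                                     int Idx) {
  const Access &Cur = Chain[Idx];
  // Rule 1: prefer an increment equal to the access size.
  for (int64_t Inc : Cur.Incs)
    if (Inc == Cur.Size)
      return Inc;
  // Rule 2: otherwise try to borrow one from a parent access.
  if (std::optional<int64_t> Borrowed = borrowFromParents(Chain, Idx))
    return Borrowed;
  // Assumed fallback: take the first increment found, if there is one.
  if (!Cur.Incs.empty())
    return Cur.Incs.front();
  return std::nullopt;
}

// The four 16-byte row loads from the example: only the first address (r0)
// has ADD users (r0+48, r0+32, r0+16); the others are derived from it.
std::vector<Access> makeRowLoads() {
  return {
      {0, 16, {48, 32, 16}, -1},
      {16, 16, {}, 0},
      {32, 16, {}, 1},
      {48, 16, {}, 2},
  };
}
```

In this model, pickIncrement yields 16 for the first three loads and nothing for the last one, which corresponds to three "[r0]!" writeback forms followed by a plain "[r0]" in the listing above. Without rule 1 the first load would fold the first increment it sees (48 here), as in the "[r3], r2" form of the old output.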


Repository:
  rL LLVM

https://reviews.llvm.org/D39415

Files:
  lib/Target/ARM/ARMISelLowering.cpp
  test/CodeGen/ARM/alloc-no-stack-realign.ll
  test/CodeGen/ARM/cascade-vld-vst.ll
  test/CodeGen/ARM/memcpy-inline.ll
  test/CodeGen/ARM/misched-fusion-aes.ll
  test/CodeGen/ARM/vector-load.ll
  test/CodeGen/ARM/vext.ll
  test/Transforms/LoopStrengthReduce/ARM/ivchain-ARM.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D39415.120805.patch
Type: text/x-patch
Size: 35190 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171030/1b7c423a/attachment.bin>

