[PATCH] D39415: [ARMISelLowering] Better handling of NEON load/store for sequential memory regions
Eugene Leviant via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 30 05:42:03 PDT 2017
evgeny777 created this revision.
Herald added subscribers: kristof.beyls, eraman, rengolin, aemerson.
Consider sample code that copies a 4x4 matrix row by row (see cascade-vld-vst.ll). The current revision generates the following code (AArch32):
mov r2, #48
mov r3, r0
vld1.32 {d16, d17}, [r3], r2
vld1.64 {d18, d19}, [r3]
add r3, r0, #32
add r0, r0, #16
vld1.64 {d22, d23}, [r0]
add r0, r1, #16
vld1.64 {d20, d21}, [r3]
vst1.64 {d22, d23}, [r0]
add r0, r1, #32
vst1.64 {d20, d21}, [r0]
vst1.32 {d16, d17}, [r1], r2
vst1.64 {d18, d19}, [r1]
mov pc, lr
After this patch is applied:
vld1.32 {d16, d17}, [r0]!
vld1.32 {d18, d19}, [r0]!
vld1.32 {d20, d21}, [r0]!
vld1.64 {d22, d23}, [r0]
vst1.32 {d16, d17}, [r1]!
vst1.32 {d18, d19}, [r1]!
vst1.32 {d20, d21}, [r1]!
vst1.64 {d22, d23}, [r1]
mov pc, lr
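For reference, the kind of source-level routine that leads to code like the listings above might look roughly like the sketch below. The actual test input is LLVM IR in cascade-vld-vst.ll and is not reproduced here; the function name copyMatrix4x4 and the float element type are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not taken from the patch): copies a 4x4 matrix
// one row at a time. With float elements each row is 16 bytes, which
// matches one 128-bit NEON load/store (a d-register pair) per row.
void copyMatrix4x4(const float *src, float *dst) {
    for (std::size_t row = 0; row < 4; ++row) {
        // Consecutive rows are 16 bytes apart, so the accesses form
        // one sequential memory region.
        std::memcpy(dst + row * 4, src + row * 4, 4 * sizeof(float));
    }
}
```

Because every access advances the address by exactly the number of bytes it moves, each one is a natural candidate for a post-incremented (writeback) load or store.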
It also speeds up our matrix multiplication function by 15%. Some of the existing LLVM test cases now have approx. 25% fewer instructions than before.
The improvement is based on two major changes to CombineBaseUpdate:
1. When selecting an address increment instruction to fold, we prefer one whose increment equals the access size of the load/store.
2. If no such address increment is bound to the address operand of the current load/store instruction, we walk up the SelectionDAG chain and try to borrow an address increment bound to the address operand of a parent VST{X}_UPD or VLD{X}_UPD that we processed earlier.
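The two rules above can be illustrated with a small standalone model. This is only a sketch: MemAccess and pickIncrement are hypothetical names, the struct is a toy stand-in and not LLVM's SDNode, and the final fallback (folding whatever increment is available) is an assumption about the behaviour when neither rule applies. The real logic lives in CombineBaseUpdate in ARMISelLowering.cpp.

```cpp
#include <cstdint>
#include <vector>

// Toy stand-in for a NEON load/store node; NOT an LLVM data structure.
struct MemAccess {
    int64_t accessSize;                  // bytes moved by this load/store
    std::vector<int64_t> addrIncrements; // constant increments applied to this node's address operand
    const MemAccess *chainParent;        // previous access on the chain, or nullptr
    bool isUpdating;                     // true if already turned into a *_UPD (writeback) form
};

// Returns the increment to fold into `node`, or 0 if there is none.
int64_t pickIncrement(const MemAccess &node) {
    // Rule 1: prefer an increment equal to the access size.
    for (int64_t inc : node.addrIncrements)
        if (inc == node.accessSize)
            return inc;

    // Rule 2: walk up the chain through accesses that were already
    // converted to writeback form and try to borrow a matching increment.
    for (const MemAccess *p = node.chainParent; p && p->isUpdating;
         p = p->chainParent)
        for (int64_t inc : p->addrIncrements)
            if (inc == node.accessSize)
                return inc;

    // Assumed fallback: fold any increment bound to this node's address.
    return node.addrIncrements.empty() ? 0 : node.addrIncrements.front();
}
```

In this model, a 16-byte access whose address has increments {48, 16} folds 16 rather than 48. That corresponds to the difference between `vld1.32 {d16, d17}, [r3], r2` (with r2 = 48) in the first listing and `vld1.32 {d16, d17}, [r0]!` in the second.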
Repository:
rL LLVM
https://reviews.llvm.org/D39415
Files:
lib/Target/ARM/ARMISelLowering.cpp
test/CodeGen/ARM/alloc-no-stack-realign.ll
test/CodeGen/ARM/cascade-vld-vst.ll
test/CodeGen/ARM/memcpy-inline.ll
test/CodeGen/ARM/misched-fusion-aes.ll
test/CodeGen/ARM/vector-load.ll
test/CodeGen/ARM/vext.ll
test/Transforms/LoopStrengthReduce/ARM/ivchain-ARM.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D39415.120805.patch
Type: text/x-patch
Size: 35190 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171030/1b7c423a/attachment.bin>