[Patch] [SLPVectorizer] Vectorize Horizontal Reductions from Consecutive Loads
Suyog Kamal Sarda
suyog.sarda at samsung.com
Thu Dec 11 04:38:02 PST 2014
Hi All,
This patch recognizes horizontal reductions of the form (+ (+ v0, v1) (+ v2, v3)), reorders the operands
so that they can be bundled into a vector of loads, and vectorizes the expression. As discussed earlier
in LLVM mail threads, we previously did not vectorize such horizontal reductions.
Test case :
float hadd(float* a) {
  return (a[0] + a[1]) + (a[2] + a[3]);
}
AArch64 assembly before patch :
ldp s0, s1, [x0]
ldp s2, s3, [x0, #8]
fadd s0, s0, s1
fadd s1, s2, s3
fadd s0, s0, s1
ret
AArch64 assembly after patch :
ldp d0, d1, [x0]
fadd v0.2s, v0.2s, v1.2s
faddp s0, v0.2s
ret
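As a rough illustration of what the two instruction sequences compute, here is a minimal C sketch. The function names hadd_scalar, hadd_vector_model and hadd_self_check are hypothetical and are not part of the patch; the second function only models the lane-wise fadd followed by the pairwise faddp, assuming d0 holds {a[0], a[1]} and d1 holds {a[2], a[3]} after the ldp.

```c
#include <assert.h>

/* Scalar form from the test case: adds adjacent pairs, then the two partial sums. */
float hadd_scalar(const float *a) {
    return (a[0] + a[1]) + (a[2] + a[3]);
}

/* Hypothetical model of the post-patch sequence (an assumption, not compiler output):
 *   ldp d0, d1  -> lo = {a[0], a[1]}, hi = {a[2], a[3]}
 *   fadd v0.2s  -> lane-wise sum {a[0] + a[2], a[1] + a[3]}
 *   faddp s0    -> adds the two lanes of that sum
 */
float hadd_vector_model(const float *a) {
    float lo[2] = {a[0], a[1]};
    float hi[2] = {a[2], a[3]};
    float sum[2] = {lo[0] + hi[0], lo[1] + hi[1]};
    return sum[0] + sum[1];
}

/* Self-check on exactly representable inputs, where both orders give the same result. */
void hadd_self_check(void) {
    const float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    assert(hadd_scalar(a) == 10.0f);
    assert(hadd_vector_model(a) == 10.0f);
}
```

Note that the model pairs a[0] with a[2] and a[1] with a[3], which is the reordering mentioned above. Floating-point addition is not associative in general, so the two orders can round differently for some inputs.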
Recognizing the pattern (+(+(+ v0, v1) v2) v3) still remains to be done. I will address it in a separate patch.
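For reference, the two expression shapes can be written in C as below. This is only a sketch; the names hadd_tree and hadd_chain are hypothetical. The first is the balanced form from the test case, the second is the linear chain form (+(+(+ v0, v1) v2) v3).

```c
/* Balanced tree form: (+ (+ v0, v1) (+ v2, v3)), the shape of the test case. */
float hadd_tree(const float *a) {
    return (a[0] + a[1]) + (a[2] + a[3]);
}

/* Linear chain form: (+(+(+ v0, v1) v2) v3), each add feeds the next one. */
float hadd_chain(const float *a) {
    return ((a[0] + a[1]) + a[2]) + a[3];
}
```

Both read the same four consecutive loads; they differ only in how the additions are nested.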
Please help review the patch. No 'make-check' failures were observed with this patch.
(I would have preferred Phabricator, but it's not working, hence I am sending this via e-mail.)
Regards,
Suyog
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SLP1.patch
Type: application/octet-stream
Size: 3705 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141211/1109c941/attachment.obj>