[Patch] [SLPVectorizer] Vectorize Horizontal Reductions from Consecutive Loads

Thu Dec 11 04:38:02 PST 2014

Hi All,

This patch recognizes horizontal reductions of the form (+ (+ v0, v1) (+ v2, v3)), reorders the operands
so that the loads can be bundled into a vector of loads, and vectorizes the reduction. As discussed in
earlier LLVM mail threads, we did not previously vectorize such horizontal reductions.

Test case :

    float hadd(float* a) {
        return (a[0] + a[1]) + (a[2] + a[3]);
    }

AArch64 assembly before patch :

	ldp	s0, s1, [x0]
	ldp	s2, s3, [x0, #8]
	fadd	s0, s0, s1
	fadd	s1, s2, s3
	fadd	s0, s0, s1
	ret

AArch64 assembly after patch :

	ldp	d0, d1, [x0]
	fadd	v0.2s, v0.2s, v1.2s
	faddp	s0, v0.2s
	ret
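
As a rough sketch (not part of the patch), the after-patch sequence can be modeled in plain C. The names hadd_scalar and hadd_vector_model are hypothetical, and the lane mapping is only a reading of the assembly above: d0 holds a[0], a[1] and d1 holds a[2], a[3]; fadd adds them lane by lane, and faddp adds the two resulting lanes. Under that reading the sum is grouped as (a[0] + a[2]) + (a[1] + a[3]), which for floats may differ in the last bits from the source grouping unless reassociation is allowed.

```c
/* Scalar form, exactly as written in the test case. */
float hadd_scalar(const float *a) {
    return (a[0] + a[1]) + (a[2] + a[3]);
}

/* Hypothetical C model of the vectorized sequence (an assumption based on
 * reading the assembly, not code taken from the patch). */
float hadd_vector_model(const float *a) {
    float lo[2] = { a[0], a[1] };   /* models d0 after ldp */
    float hi[2] = { a[2], a[3] };   /* models d1 after ldp */
    float sum[2];

    /* models: fadd v0.2s, v0.2s, v1.2s (lane-wise add) */
    for (int i = 0; i < 2; ++i)
        sum[i] = lo[i] + hi[i];

    /* models: faddp s0, v0.2s (add the two lanes) */
    return sum[0] + sum[1];
}
```

For inputs whose partial sums are all exactly representable, for example {1, 2, 3, 4}, both functions return the same value.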

Recognizing the linear form (+(+(+ v0, v1) v2) v3) still remains to be done. I will address it in a separate patch.
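
For illustration, a source-level example of that remaining linear form might look like the function below. The name hadd_chain is hypothetical; this is only a sketch of the shape of the expression, not a test case from the patch.

```c
/* Linear (chained) reduction: each add feeds the next one,
 * i.e. (+(+(+ a0, a1) a2) a3) rather than the balanced tree form. */
float hadd_chain(const float *a) {
    return ((a[0] + a[1]) + a[2]) + a[3];
}
```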

Please help review the patch. No 'make check' failures were observed with this patch.

(I would have preferred Phabricator, but it's not working, hence sending via e-mail.)

Regards,
Suyog 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SLP1.patch
Type: application/octet-stream
Size: 3705 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141211/1109c941/attachment.obj>