[PATCH] [SLPVectorization] Vectorize flat addition in a single tree (+(+(+ v1 v2) v3) v4)
suyog
suyog.sarda at samsung.com
Wed Dec 31 02:53:46 PST 2014
Hi nadav, aschwaighofer, mzolotukhin, jmolloy,
This is one more patch based on previous discussions.
This patch vectorizes a flat addition of integer values loaded from a single array,
where the expression tree has the form (+(+(+ v1 v2) v3) v4).
e.g.
int hadd (int *a) {
return a[0] + a[1] + a[2] + a[3];
}
The IR for the above code is:
define i32 @hadd(i32* %a) {
entry:
%0 = load i32* %a, align 4
%arrayidx1 = getelementptr inbounds i32* %a, i32 1
%1 = load i32* %arrayidx1, align 4
%add = add nsw i32 %0, %1
%arrayidx2 = getelementptr inbounds i32* %a, i32 2
%2 = load i32* %arrayidx2, align 4
%add3 = add nsw i32 %add, %2
%arrayidx4 = getelementptr inbounds i32* %a, i32 3
%3 = load i32* %arrayidx4, align 4
%add5 = add nsw i32 %add3, %3
ret i32 %add5
}
The above addition can be modeled as a combination of two shufflevector instructions, two vector adds and one extractelement instruction.
IR after vectorization with this patch:
define i32 @hadd(i32* %a) {
entry:
%0 = bitcast i32* %a to <4 x i32>*
%1 = load <4 x i32>* %0, align 4
%rdx.shuf = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
%bin.rdx = add <4 x i32> %1, %rdx.shuf
%rdx.shuf1 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
%bin.rdx2 = add <4 x i32> %bin.rdx, %rdx.shuf1
%2 = extractelement <4 x i32> %bin.rdx2, i32 0
ret i32 %2
}
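The vectorized IR above can be checked lane by lane with a small scalar model. The following is a minimal sketch in C and is not part of the patch: the names hadd_scalar and hadd_shuffle are made up for illustration, undef lanes are modeled as 0, and uint32_t is used so that the reassociated sum wraps exactly like the original chain.

```c
#include <stdint.h>

/* Scalar reference: the flat chain ((a[0] + a[1]) + a[2]) + a[3]. */
uint32_t hadd_scalar(const uint32_t *a) {
    return ((a[0] + a[1]) + a[2]) + a[3];
}

/* Lane-by-lane model of the vectorized IR (hypothetical helper). */
uint32_t hadd_shuffle(const uint32_t *a) {
    uint32_t v[4], shuf[4], rdx[4], shuf1[4], rdx2[4];
    int i;

    /* load <4 x i32> */
    for (i = 0; i < 4; i++)
        v[i] = a[i];

    /* %rdx.shuf: mask <2, 3, undef, undef>; undef lanes modeled as 0 */
    shuf[0] = v[2]; shuf[1] = v[3]; shuf[2] = 0; shuf[3] = 0;

    /* %bin.rdx = add %1, %rdx.shuf */
    for (i = 0; i < 4; i++)
        rdx[i] = v[i] + shuf[i];

    /* %rdx.shuf1: mask <1, undef, undef, undef> */
    shuf1[0] = rdx[1]; shuf1[1] = 0; shuf1[2] = 0; shuf1[3] = 0;

    /* %bin.rdx2 = add %bin.rdx, %rdx.shuf1 */
    for (i = 0; i < 4; i++)
        rdx2[i] = rdx[i] + shuf1[i];

    /* extractelement: only lane 0 carries the full sum */
    return rdx2[0];
}
```

Only lane 0 of the final vector is meaningful; the other lanes hold partial sums that the extractelement discards.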
AArch64 assembly before this patch:
ldp w8, w9, [x0]
ldp w10, w11, [x0, #8]
add w8, w8, w9
add w8, w8, w10
add w0, w8, w11
ret
AArch64 assembly after this patch:
ldr q0, [x0]
ext v1.16b, v0.16b, v0.16b, #8
add v0.4s, v0.4s, v1.4s
dup v1.4s, v0.s[1]
add v0.4s, v0.4s, v1.4s
fmov w0, s0
ret
This patch handles any number of such additions, e.g. a[0] through a[7]. I have added a test case for this.
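For longer chains the same halving pattern repeats, one shuffle-and-add step per halving. The sketch below is an illustration in C only, not code from the patch: the name reduce_add_pow2 is hypothetical, and it assumes the element count is a power of two, which is an assumption of this sketch rather than a statement about what the patch accepts.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sum v[0..n-1] in log2(n) halving steps, overwriting the low lanes of v.
 * Assumes n is a power of two (assumption of this sketch).
 * Each outer iteration corresponds to one shuffle + vector add pair.
 */
uint32_t reduce_add_pow2(uint32_t *v, size_t n) {
    size_t half, i;

    for (half = n / 2; half >= 1; half /= 2) {
        /* Add the upper half of the live lanes onto the lower half. */
        for (i = 0; i < half; i++)
            v[i] += v[i + half];
    }
    return v[0]; /* lane 0 now holds the full sum */
}
```

With eight elements the outer loop runs three times (half = 4, 2, 1), compared with seven dependent scalar adds in the flat chain.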
I have written a new function "matchFlatReduction" to identify this type of tree, as I didn't want to disturb the original "matchAssociativeReduction".
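To make the tree shape concrete, here is a rough sketch in C of how a left-leaning chain of adds over consecutive array loads could be recognized. It is not the implementation of matchFlatReduction and does not use any LLVM API: the Node type and match_flat_add_chain are hypothetical and exist only for illustration.

```c
#include <stddef.h>

enum NodeKind { NODE_LOAD, NODE_ADD };

/* Toy expression node (hypothetical, not an LLVM type). */
struct Node {
    enum NodeKind kind;
    int index;               /* array index, used when kind == NODE_LOAD */
    const struct Node *lhs;  /* operands, used when kind == NODE_ADD */
    const struct Node *rhs;
};

/*
 * Returns the number of leaves n if root has the flat shape
 * (+ ... (+ (+ a[0] a[1]) a[2]) ... a[n-1]), and 0 otherwise.
 */
int match_flat_add_chain(const struct Node *root) {
    const struct Node *n = root;
    const struct Node *p;
    int count = 0;
    int expected;

    if (n == NULL || n->kind != NODE_ADD)
        return 0;

    /* Walk down the left spine; every right operand must be a load. */
    while (n->kind == NODE_ADD) {
        if (n->lhs == NULL || n->rhs == NULL || n->rhs->kind != NODE_LOAD)
            return 0;
        count++;
        n = n->lhs;
    }
    count++; /* the innermost left operand, which is a load */

    /* The innermost load must be a[0]. */
    if (n->index != 0)
        return 0;

    /* From the root down, right operands must be a[n-1], a[n-2], ..., a[1]. */
    expected = count - 1;
    for (p = root; p->kind == NODE_ADD; p = p->lhs) {
        if (p->rhs->index != expected)
            return 0;
        expected--;
    }
    return count;
}
```

A balanced tree such as (+ (+ a[0] a[1]) (+ a[2] a[3])) is rejected by this sketch because the right operand of the root is an add rather than a load.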
Please help in reviewing this patch. No make-check regressions observed.
Regards,
Suyog
REPOSITORY
rL LLVM
http://reviews.llvm.org/D6818
Files:
lib/Transforms/Vectorize/SLPVectorizer.cpp
test/Transforms/SLPVectorizer/AArch64/flatadd.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D6818.17744.patch
Type: text/x-patch
Size: 9446 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141231/41ae8693/attachment.bin>