[LLVMdev] Horizontal ADD across single vector not profitable in SLP Vectorization
suyog sarda
sardask01 at gmail.com
Fri Nov 28 10:56:24 PST 2014
Hi all,
The following analysis concerns horizontal add across a single vector.
Test case for AArch64:
#include <arm_neon.h>
unsigned hadd(uint32x4_t a) {
return a[0] + a[1] + a[2] + a[3];
}
Currently, we emit scalar instructions for the above code.
The IR for the above code involves:
4 'extractelement' - to extract the elements from vector 'a'.
3 'add' - to perform the additions.
1 return statement.
Let's say we somehow vectorize this kind of code.
The IR will probably look something like:
1. Extract a[0] and put it in vec1 <2 x i32>, lane 0
2. Extract a[1] and put it in vec1 <2 x i32>, lane 1
3. Extract a[2] and put it in vec2 <2 x i32>, lane 0
4. Extract a[3] and put it in vec2 <2 x i32>, lane 1
5. Add vec1 and vec2, sum in vec3 <2 x i32>
6. Extract vec3[0] into sum1
7. Extract vec3[1] into sum2
8. Add sum1 and sum2 into sum3
9. Return sum3
So overall we have 6 'extractelement', 4 'insertelement', 1 vector add,
1 scalar add and 1 return statement. We have vectorized the add operation,
but this indicates the code getting worse than its scalar form (if I am
not missing something).
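The two forms compared above can be modeled in plain C, as a rough sketch
only. Here uint32_t[2] stands in for <2 x i32>, and the function names
(hadd_scalar, hadd_two_wide, check_hadd_forms) are made up for illustration.
The comments map each statement back to the IR operation it represents; this
is not what any compiler actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar form: 4 lane extracts and 3 scalar adds. */
static uint32_t hadd_scalar(const uint32_t a[4]) {
    return a[0] + a[1] + a[2] + a[3];
}

/* Hypothetical "vectorized" form; uint32_t[2] models <2 x i32>. */
static uint32_t hadd_two_wide(const uint32_t a[4]) {
    uint32_t vec1[2], vec2[2], vec3[2];

    /* 4 x (extractelement from a + insertelement into vec1/vec2) */
    vec1[0] = a[0];
    vec1[1] = a[1];
    vec2[0] = a[2];
    vec2[1] = a[3];

    /* 1 vector add on <2 x i32>, written out lane by lane */
    vec3[0] = vec1[0] + vec2[0];
    vec3[1] = vec1[1] + vec2[1];

    /* 2 more extractelement, then 1 scalar add */
    uint32_t sum1 = vec3[0];
    uint32_t sum2 = vec3[1];
    return sum1 + sum2;
}

/* Small self-check: both forms compute the same sum. */
static void check_hadd_forms(void) {
    const uint32_t a[4] = {1, 2, 3, 4};
    assert(hadd_scalar(a) == 10u);
    assert(hadd_two_wide(a) == hadd_scalar(a));
}
```

Counting the commented operations gives the same totals as in the text:
6 extracts, 4 inserts, 1 vector add and 1 scalar add, against 4 extracts and
3 adds for the scalar form.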
This is related to PR 20035, where it was advised to handle add across a
single vector in the SLP vectorizer.
If my analysis is correct, a horizontal add across a single vector can
never be more profitable in vectorized form (unless I am missing something;
perhaps 'insertelement' and 'extractelement' can be bundled together into a
single instruction, but I am not sure about this).
As there is an ARM vector instruction available, ADDV.4S, for addition
across a single vector, and if such code cannot be made profitable by
vectorizing it in SLP, isn't it better to handle it in the SelectionDAG phase?
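For reference, the across-vector add can also be requested directly in
source through the ACLE intrinsic vaddvq_u32 from arm_neon.h. The sketch
below only illustrates the target operation. It does not show how the
vectorizer or SelectionDAG would get there. The helper name hadd_lanes is
made up, and the non-AArch64 branch is a plain scalar fallback so the
snippet builds elsewhere.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* hadd_lanes is a hypothetical helper name used only for this sketch. */
static uint32_t hadd_lanes(const uint32_t lanes[4]) {
#if defined(__aarch64__)
    uint32x4_t v = vld1q_u32(lanes); /* load four 32-bit lanes */
    return vaddvq_u32(v);            /* add across vector (ADDV on .4S) */
#else
    /* Portable scalar fallback for non-AArch64 builds. */
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
}

/* Small self-check of the helper. */
static void check_hadd_lanes(void) {
    const uint32_t a[4] = {1, 2, 3, 4};
    assert(hadd_lanes(a) == 10u);
}
```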
Please correct me if I am wrong, and suggest a better form of vectorized IR.
Suggestions/comments/corrections are most awaited!
--
With regards,
Suyog Sarda