[LLVMdev] [BBVectorizer] Obvious vectorization benefit, but req-chain is too short
Tobias Grosser
tobias at grosser.es
Fri Feb 3 01:28:26 PST 2012
Hi Hal,
this is one of the first test cases for which I would love to see
improved vectorizer support. I sent it out earlier, but now that the
vectorizer has been committed, I think it is a good time to look into
it again.
The basic example is a set of scalar loads that load four consecutive
elements and store them back right away. For me this is an obvious case
where vectorization is beneficial (scalar.ll):
define i32 @main() nounwind {
  %V1 = load float* getelementptr ([1024 x float]* @A, i64 0, i64 0), align 16
  %V2 = load float* getelementptr ([1024 x float]* @A, i64 0, i64 1), align 4
  %V3 = load float* getelementptr ([1024 x float]* @A, i64 0, i64 2), align 8
  %V4 = load float* getelementptr ([1024 x float]* @A, i64 0, i64 3), align 4
  store float %V1, float* getelementptr ([1024 x float]* @B, i64 0, i64 0), align 16
  store float %V2, float* getelementptr ([1024 x float]* @B, i64 0, i64 1), align 4
  store float %V3, float* getelementptr ([1024 x float]* @B, i64 0, i64 2), align 8
  store float %V4, float* getelementptr ([1024 x float]* @B, i64 0, i64 3), align 4
  ret i32 0
}
opt -O3 -vectorize cannot vectorize this directly, because the
req-chain is too short.
Adding -bb-vectorize-req-chain-depth=2 allows us to vectorize the code:
define i32 @main() nounwind {
  %V1 = load <4 x float>* bitcast ([1024 x float]* @A to <4 x float>*), align 16
  store <4 x float> %V1, <4 x float>* bitcast ([1024 x float]* @B to <4 x float>*), align 16
  ret i32 0
}
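To illustrate why the chain is "too short" here, the following is a
toy Python model of the scalar example. It is only a sketch, not
BBVectorizer's actual data structure or algorithm: the names USES,
chain_depth, longest_chain and passes_threshold are made up for
illustration, and "depth" is simply taken to be the number of
instructions on the longest def-use chain. Each load feeds exactly one
store, so the longest chain has two instructions, which is why a
required depth of 2 accepts the code while any larger requirement
rejects it.

```python
# Toy def-use model of scalar.ll: each instruction maps to the values it uses.
# Illustration only; this is not how BBVectorizer represents the code.
USES = {
    "V1": [], "V2": [], "V3": [], "V4": [],                  # the four loads
    "S1": ["V1"], "S2": ["V2"], "S3": ["V3"], "S4": ["V4"],  # the four stores
}


def chain_depth(inst, uses=USES):
    """Number of instructions on the longest def-use chain ending at inst."""
    return 1 + max((chain_depth(op, uses) for op in uses[inst]), default=0)


def longest_chain(uses=USES):
    """Longest def-use chain anywhere in the block."""
    return max(chain_depth(inst, uses) for inst in uses)


def passes_threshold(required_depth, uses=USES):
    """True if the block contains a chain at least required_depth long."""
    return longest_chain(uses) >= required_depth


if __name__ == "__main__":
    print("longest chain:", longest_chain())
    for depth in (2, 3):
        print("required depth", depth, "->", passes_threshold(depth))
```

In this model a load-then-store pattern can never exceed depth 2,
regardless of how many consecutive elements are copied.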
Is there any way we can make this case work by default? Maybe we can
decrease the required chain depth to 2 and increase the cost for
non-stride-one loads or stores?
Another, probably unrelated, point: I also tried a run with
-bb-vectorize-req-chain-depth=1. The generated code is full of
shufflevector instructions and eight-element vectors. For me this is
entirely unexpected. Do you have any idea what is going on here?
Tobi
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: scalar.ll
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120203/b797055f/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: vector.ll
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120203/b797055f/attachment-0001.ksh>