[PATCH] D56082: [X86][SLP] Enable SLP vectorization for 128-bit horizontal X86 instructions (add, sub)
Anton Afanasyev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 27 13:42:29 PST 2018
anton-afanasyev added a comment.
In D56082#1341374 <https://reviews.llvm.org/D56082#1341374>, @craig.topper wrote:
> I'm concerned about integer types. Without -x86-experimental-vector-widening-legalization we end up promoting v2i32 to v2i64 during type legalization. An X86 specific DAG combine turns some v2i64 operations back to v4i32 based on the result being truncated, but it isn't always able to rearrange the shuffles well.
>
> Changing semi-vec-reg-128bit.ll to use i32 instead of float results in this code instead of phaddd. Even with -mcpu=btver2 which is needed to generate haddps for the float type for this test.
>
> vpshufd $245, %xmm0, %xmm1 # xmm1 = xmm0[1,1,3,3]
> vpaddd %xmm1, %xmm0, %xmm0
> vpshufd $232, %xmm0, %xmm0 # xmm0 = xmm0[0,2,2,3]
> vmovq %xmm0, (%rdi)
>
That is not related to this patch, since the patch does the same thing for both `float` and `i32`:
> ~/llvm/build_rel_exp/bin/opt -S -mcpu=btver2 -slp-vectorizer -instcombine semi-vec-reg-128bit-i32.ll
...
define void @add_pairs_128(<4 x i32>, i32* nocapture) #0 {
%3 = shufflevector <4 x i32> %0, <4 x i32> undef, <2 x i32> <i32 0, i32 2>
%4 = shufflevector <4 x i32> %0, <4 x i32> undef, <2 x i32> <i32 1, i32 3>
%5 = add <2 x i32> %3, %4
%6 = bitcast i32* %1 to <2 x i32>*
store <2 x i32> %5, <2 x i32>* %6, align 4
ret void
}
attributes #0 = { nounwind "target-cpu"="btver2" }
The issue here is in x86 ISel, and I believe it should be fixed there (does -x86-experimental-vector-widening-legalization fix it?). Another candidate is InstCombiner, which could do a specific combine of the extracts and inserts produced by the vectorizer.
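
To make the comparison concrete, here is a small C sketch (not part of the patch; all helper names are hypothetical) that models the vectorized IR lane by lane and also models the quoted vpshufd/vpaddd/vpshufd/vmovq sequence. It assumes the usual pshufd encoding, where each 2-bit field of the immediate selects a source lane, and uses unsigned arithmetic to mirror the wrapping `add`. Both models should give the same two sums, which is consistent with this being a codegen-quality issue rather than a correctness one.

```c
#include <stdint.h>

/* Lane-wise model of the IR: even lanes <0, 2> plus odd lanes <1, 3>. */
static void add_pairs_ir_model(const uint32_t v[4], uint32_t out[2]) {
    out[0] = v[0] + v[1];
    out[1] = v[2] + v[3];
}

/* Model of pshufd: each 2-bit field of imm selects a source lane. */
static void pshufd_model(const uint32_t src[4], uint8_t imm, uint32_t dst[4]) {
    for (int i = 0; i < 4; ++i)
        dst[i] = src[(imm >> (2 * i)) & 3];
}

/* Model of the quoted sequence: vpshufd $245, vpaddd, vpshufd $232, vmovq. */
static void add_pairs_asm_model(const uint32_t v[4], uint32_t out[2]) {
    uint32_t t[4], s[4], r[4];
    pshufd_model(v, 245, t);      /* t = v[1,1,3,3] */
    for (int i = 0; i < 4; ++i)
        s[i] = v[i] + t[i];       /* vpaddd */
    pshufd_model(s, 232, r);      /* r = s[0,2,2,3] */
    out[0] = r[0];                /* vmovq stores the low 64 bits */
    out[1] = r[1];
}
```

A single phaddd with both operands equal to the input would leave the same two sums in the low 64 bits, which is why the three-instruction sequence is the suboptimal form here.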
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D56082/new/
https://reviews.llvm.org/D56082