[PATCH] D60852: Fix for bug 41512: lower INSERT_VECTOR_ELT(ZeroVec, 0, Elt) to SCALAR_TO_VECTOR(Elt) for all SSE flavors
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 18 07:45:26 PDT 2019
spatel added a comment.
In D60852#1471661 <https://reviews.llvm.org/D60852#1471661>, @Serge_Preis wrote:
> > If we are getting this right sometimes, then we might already have the transform that we want, but it is limited in some way that prevents getting the larger case.
> > I doubt that the loop itself is needed to demonstrate the problem because I see 'movd' codegen even with a loop as long as it is not unrolled.
>
> After more experimentation I tend to agree. Also the most basic case produces pinsrd even in a small kernel (https://gcc.godbolt.org/z/HAmNha), so will just create test out of it.
Not sure if this is minimal, but this seems to show the problem:
define <2 x i64> @pinsr(i32 %x, i32 %y) {
  %ins1 = insertelement <4 x i32> <i32 undef, i32 0, i32 undef, i32 undef>, i32 %x, i32 0
  %ins2 = insertelement <4 x i32> <i32 undef, i32 0, i32 undef, i32 undef>, i32 %y, i32 0
  %b1 = bitcast <4 x i32> %ins1 to <2 x i64>
  %b2 = bitcast <4 x i32> %ins2 to <2 x i64>
  %r = shufflevector <2 x i64> %b1, <2 x i64> %b2, <2 x i32> <i32 0, i32 2>
  ret <2 x i64> %r
}
$ llc -o - pinsr.ll -mattr=sse4.2
  pxor %xmm1, %xmm1
  pxor %xmm0, %xmm0
  pinsrd $0, %edi, %xmm0
  pinsrd $0, %esi, %xmm1
  punpcklqdq %xmm1, %xmm0 ## xmm0 = xmm0[0],xmm1[0]
  retq
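
For illustration only (this is not part of the patch), the following Python sketch models the 32-bit lanes of an XMM register to show why each pxor + pinsrd pair above is redundant: a plain movd from a GPR already zeroes the upper lanes, so it should yield the same register contents. The helper names are hypothetical and simply mirror the instruction mnemonics; the movd sequence is an assumption about the codegen one would want here, not output taken from llc.

```python
MASK32 = 0xFFFFFFFF

def pxor_self():
    # pxor %xmmN, %xmmN: all four 32-bit lanes become zero.
    return [0, 0, 0, 0]

def pinsrd(vec, value, lane):
    # pinsrd $lane, r32, xmm: replace one 32-bit lane, keep the others.
    out = list(vec)
    out[lane] = value & MASK32
    return out

def movd(value):
    # movd r32, xmm: lane 0 gets the value, upper lanes are zeroed.
    return [value & MASK32, 0, 0, 0]

def punpcklqdq(dst, src):
    # Result is the low 64 bits of dst followed by the low 64 bits of src.
    return dst[0:2] + src[0:2]

def pinsrd_sequence(x, y):
    # Mirrors the llc output shown above.
    xmm1 = pxor_self()
    xmm0 = pxor_self()
    xmm0 = pinsrd(xmm0, x, 0)
    xmm1 = pinsrd(xmm1, y, 0)
    return punpcklqdq(xmm0, xmm1)

def movd_sequence(x, y):
    # Hypothetical shorter sequence: two movd instructions and one unpack.
    return punpcklqdq(movd(x), movd(y))
```

In this model, pinsrd_sequence(x, y) and movd_sequence(x, y) return the same four lanes for any 32-bit inputs, which is the equivalence the INSERT_VECTOR_ELT(ZeroVec, 0, Elt) to SCALAR_TO_VECTOR(Elt) lowering relies on.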
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D60852/new/
https://reviews.llvm.org/D60852