[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

Chandler Carruth chandlerc at gmail.com
Sat Sep 6 03:47:20 PDT 2014


FYI, this is all fixed. =] Sorry for the trouble, was a silly goof that
should have been caught sooner.

On Fri, Sep 5, 2014 at 11:09 AM, Robert Lougher <rob.lougher at gmail.com> wrote:

> Hi Chandler,
>
> On 5 September 2014 17:38, Chandler Carruth <chandlerc at gmail.com> wrote:
> >
> > On Fri, Sep 5, 2014 at 9:32 AM, Robert Lougher <rob.lougher at gmail.com> wrote:
> >>
> >> Unfortunately, another team, while doing internal testing has seen the
> >> new path generating illegal insertps masks.  A sample here:
> >>
> >>     vinsertps    $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
> >>     vinsertps    $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
> >>     vinsertps    $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
> >>     vinsertps    $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
> >>     vinsertps    $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
> >>     vinsertps    $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
> >>
> >> We'll continue to look into this and do additional testing.
> >
> >
> > Interesting. Let me know if you get a test case. The insertps code path was
> > added recently though and has been much less well tested. I'll start fuzz
> > testing it and should hopefully uncover the bug.
>
> Here are two small test cases.  Hope they are of use.
>
> Thanks,
> Rob.
>
> ------
> define <4 x float> @test(<4 x float> %xyzw, <4 x float> %abcd) {
>   %1 = extractelement <4 x float> %xyzw, i32 0
>   %2 = insertelement <4 x float> undef, float %1, i32 0
>   %3 = insertelement <4 x float> %2, float 0.000000e+00, i32 1
>   %4 = shufflevector <4 x float> %3, <4 x float> %xyzw, <4 x i32> <i32 0, i32 1, i32 6, i32 undef>
>   %5 = shufflevector <4 x float> %4, <4 x float> %abcd, <4 x i32> <i32 0, i32 1, i32 2, i32 4>
>   ret <4 x float> %5
> }
>
> define <4 x float> @test2(<4 x float> %xyzw, <4 x float> %abcd) {
>   %1 = shufflevector <4 x float> %xyzw, <4 x float> %abcd, <4 x i32> <i32 0, i32 undef, i32 2, i32 4>
>   %2 = shufflevector <4 x float> <float undef, float 0.000000e+00, float undef, float undef>, <4 x float> %1, <4 x i32> <i32 4, i32 1, i32 6, i32 7>
>   ret <4 x float> %2
> }
>
>
> llc -march=x86-64 -mattr=+avx test.ll -o -
>
> test:                                   # @test
>     vxorps    %xmm2, %xmm2, %xmm2
>     vmovss    %xmm0, %xmm2, %xmm2
>     vblendps    $4, %xmm0, %xmm2, %xmm0 # xmm0 = xmm2[0,1],xmm0[2],xmm2[3]
>     vinsertps    $48, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0,1,2],xmm1[0]
>     retl
>
> test2:                                  # @test2
>     vinsertps    $48, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0,1,2],xmm1[0]
>     vxorps    %xmm1, %xmm1, %xmm1
>     vblendps    $13, %xmm0, %xmm1, %xmm0 # xmm0 = xmm0[0],xmm1[1],xmm0[2,3]
>     retl
>
> llc -march=x86-64 -mattr=+avx -x86-experimental-vector-shuffle-lowering test.ll -o -
>
> test:                                   # @test
>     vinsertps    $270, %xmm0, %xmm0, %xmm2 # xmm2 = xmm0[0],zero,zero,zero
>     vinsertps    $416, %xmm0, %xmm2, %xmm0 # xmm0 = xmm2[0,1],xmm0[2],xmm2[3]
>     vinsertps    $304, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0,1,2],xmm1[0]
>     retl
>
> test2:                                  # @test2
>     vinsertps    $304, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0,1,2],xmm1[0]
>     vxorps    %xmm1, %xmm1, %xmm1
>     vinsertps    $336, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0],xmm1[1],xmm0[2,3]
>     retl
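
For anyone reading the immediates quoted above: the INSERTPS control operand
is a single byte, with bits 7:6 selecting the source element, bits 5:4 the
destination element, and bits 3:0 the zero mask. The values emitted by the
experimental path ($256, $270, $304, $336, $416) do not fit in a byte, which
is why they are illegal. The sketch below is only an illustration and not code
from LLVM; the helper name decode_insertps is made up for this note. It splits
an immediate into those fields and reports whether it fits in 8 bits.

```python
def decode_insertps(imm):
    """Return (legal, src, dst, zmask) for an INSERTPS immediate.

    Hypothetical helper for illustration only. 'legal' is True when the
    value fits in the instruction's 8-bit immediate; the fields are
    decoded from the low byte either way.
    """
    legal = 0 <= imm <= 0xFF
    low = imm & 0xFF
    src = (low >> 6) & 0x3    # bits 7:6: element taken from the source
    dst = (low >> 4) & 0x3    # bits 5:4: element written in the destination
    zmask = low & 0xF         # bits 3:0: destination elements forced to zero
    return legal, src, dst, zmask


if __name__ == "__main__":
    # $48 comes from the old lowering; the rest from the experimental path.
    for imm in (48, 256, 270, 304, 336, 416):
        legal, src, dst, zmask = decode_insertps(imm)
        print(f"${imm}: legal={legal} low_byte={imm & 0xFF:#04x} "
              f"src={src} dst={dst} zmask={zmask:04b}")
```

Each illegal value here is a legal byte plus 256: the low byte of $304 is 48,
the same immediate the old path emits for xmm0[0,1,2],xmm1[0], and the decoded
fields of the other values agree with the shuffle comments printed next to
them.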