[llvm-commits] [llvm] r147308 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/avx-shuffle.ll

Wed Jan 4 00:50:10 PST 2012

I fixed two bugs together because after I fixed VECTOR_SHUFFLE with MOVL mask the shuffle node was not lowered and it came to BUILD_VECTOR and failed there.

The MOVL / MOVLP instructions work for 128-bit vectors. I'm running conformance tests on AVX and see that many shuffle cases generate wrong instruction set. I'm trying to fix one-by-one and I have several fixes if one fix causes fail in another place.

- Elena

-----Original Message-----
From: Chad Rosier [mailto:mcrosier at apple.com] 
Sent: Wednesday, January 04, 2012 03:30
To: Demikhovsky, Elena
Cc: llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] [llvm] r147308 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/avx-shuffle.ll

Hi Elena,
Just a few comments below.

On Dec 28, 2011, at 12:14 AM, Elena Demikhovsky wrote:

> Author: delena
> Date: Wed Dec 28 02:14:01 2011
> New Revision: 147308
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=147308&view=rev
> Log:
> Fixed a bug in LowerVECTOR_SHUFFLE and LowerBUILD_VECTOR.
> Matching MOVLP mask for AVX (265-bit vectors) was wrong.
> The failure was detected by conformance tests.

As far as I can tell this patch is addressing two bugs, one in LowerBUILD_VECTOR and another with LowerVECTOR_SHUFFLE/MOVLP.  These should really be fixed in two separate commits.

> 
> Modified:
>    llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
>    llvm/trunk/test/CodeGen/X86/avx-shuffle.ll
> 
> Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=147308&r1=147307&r2=147308&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Dec 28 02:14:01 2011
> @@ -3448,6 +3448,11 @@
> /// isMOVLPMask - Return true if the specified VECTOR_SHUFFLE operand
> /// specifies a shuffle of elements that is suitable for input to MOVLP{S|D}.
> bool X86::isMOVLPMask(ShuffleVectorSDNode *N) {
> +  EVT VT = N->getValueType(0);
> +
> +  if (VT.getSizeInBits() != 128)
> +    return false;
> +

I'm not sure I agree with the approach taken here...

>   unsigned NumElems = N->getValueType(0).getVectorNumElements();
> 
>   if (NumElems != 2 && NumElems != 4)
> @@ -3666,6 +3671,8 @@
> static bool isMOVLMask(const SmallVectorImpl<int> &Mask, EVT VT) {
>   if (VT.getVectorElementType().getSizeInBits() < 32)
>     return false;
> +  if (VT.getSizeInBits() == 256)
> +    return false;

..and here.  Aren't these functions behaving correctly?  I believe the real problem is the code that is predicated upon these functions; they don't fully support codegen when AVX is enabled.

And one moot point..  The conditional statements that checks the size of the vectors could be more consistent (i.e., both (VT.getSizeInBits != 128) or (VT.getSizeInBits != 128), but not a combination of the two).

>   int NumElts = VT.getVectorNumElements();
> 
> @@ -5158,16 +5165,30 @@
>         return DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT, Item);
>       } else if (ExtVT == MVT::i32 || ExtVT == MVT::f32 || ExtVT == MVT::f64 ||
>           (ExtVT == MVT::i64 && Subtarget->is64Bit())) {
> +        if (VT.getSizeInBits() == 256) {
> +          
> +          EVT VT128 = EVT::getVectorVT(*DAG.getContext(), ExtVT, NumElems / 2);
> +          Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT128, Item);
> +          SDValue ZeroVec = getZeroVector(VT, true, DAG, dl);              
> +          return Insert128BitVector(ZeroVec, Item, DAG.getConstant(0, MVT::i32),
> +                              DAG, dl);
> +        }
>         Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT, Item);
>         // Turn it into a MOVL (i.e. movss, movsd, or movd) to a zero vector.
>         return getShuffleVectorZeroOrUndef(Item, 0, true,Subtarget->hasXMMInt(),
>                                            DAG);
>       } else if (ExtVT == MVT::i16 || ExtVT == MVT::i8) {
>         Item = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i32, Item);
> -        unsigned NumBits = VT.getSizeInBits();
> -        assert((NumBits == 128 || NumBits == 256) && 
> -               "Expected an SSE or AVX value type!");
> -        EVT MiddleVT = NumBits == 128 ? MVT::v4i32 : MVT::v8i32;
> +        if (VT.getSizeInBits() == 256) {
> +          
> +          EVT VT128 = EVT::getVectorVT(*DAG.getContext(), ExtVT, NumElems / 2);
> +          Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT128, Item);
> +          SDValue ZeroVec = getZeroVector(VT, true, DAG, dl);              
> +          return Insert128BitVector(ZeroVec, Item, DAG.getConstant(0, MVT::i32),
> +                              DAG, dl);
> +        }
> +        assert (VT.getSizeInBits() == 128 || "Expected an SSE value type!");
> +        EVT MiddleVT = MVT::v4i32;
>         Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, MiddleVT, Item);
>         Item = getShuffleVectorZeroOrUndef(Item, 0, true,
>                                            Subtarget->hasXMMInt(), DAG);
> 
> Modified: llvm/trunk/test/CodeGen/X86/avx-shuffle.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-shuffle.ll?rev=147308&r1=147307&r2=147308&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/avx-shuffle.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/avx-shuffle.ll Wed Dec 28 02:14:01 2011
> @@ -13,8 +13,22 @@
> define <3 x i64> @test2(<2 x i64> %v) nounwind readnone {
> ; CHECK: test2:
> ; CHECK: vxorpd
> -; CHECK: vmovsd
> +; CHECK: vperm2f128
>   %1 = shufflevector <2 x i64> %v, <2 x i64> %v, <3 x i32> <i32 0, i32 1, i32 undef>
>   %2 = shufflevector <3 x i64> zeroinitializer, <3 x i64> %1, <3 x i32> <i32 3, i32 4, i32 2>
>   ret <3 x i64> %2
> }
> +
> +define <4 x i64> @test3(<4 x i64> %a, <4 x i64> %b) nounwind {
> +  %c = shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 4, i32 5, i32 2, i32 undef>
> +  ret <4 x i64> %c
> +; CHECK: test3:
> +; CHECK: vperm2f128
> +}
> +
> +define <8 x float> @test4(float %a) nounwind {
> +  %b = insertelement <8 x float> zeroinitializer, float %a, i32 0
> +  ret <8 x float> %b
> +; CHECK: test4:
> +; CHECK: vinsertf128
> +}
> \ No newline at end of file
> 

Please make sure your test case ends with a newline.  This was fixed in r147481.

 Chad

> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.