[llvm-commits] [llvm] r137296 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/avx-splat.ll

Thu Aug 11 10:00:54 PDT 2011

Hi Bruno, 

I made two AVX-related commits. One of the optimizations will come in handy once you implement the splitting and merging of unsupported integer operations. 

Nadav

-----Original Message-----
From: Bruno Cardoso Lopes [mailto:bruno.cardoso at gmail.com] 
Sent: Thursday, August 11, 2011 19:50
To: Rotem, Nadav
Cc: llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] [llvm] r137296 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/avx-splat.ll

On Wed, Aug 10, 2011 at 11:54 PM, Rotem, Nadav <nadav.rotem at intel.com> wrote:
> Bruno,
>
> Your implementation is great for scalars, but I think splats of vector loads should be handled by vbroadcast.

They will, coming soon! Remember it's a WIP. Also we want to handle
cases where the scalar doesn't come from memory.
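
For readers following the quoted commit below, here is a small sketch of how the vpermilps immediates in the new avx-splat.ll CHECK lines can be derived. It is not code from X86ISelLowering.cpp: SplatPlan and planSplat8x32 are hypothetical names used only for illustration. It assumes the splat is built as the tests suggest: take the 128-bit lane holding the element (vextractf128 $1 if it is the upper lane), place that lane in both halves (vinsertf128 $1), then let vpermilps repeat one 2-bit selector four times.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of LLVM: describes one way to splat element
// Elt of a v8i32/v8f32 value, mirroring the CHECK lines in avx-splat.ll.
struct SplatPlan {
  bool NeedsHighLaneExtract; // true when Elt lives in the upper 128 bits
  uint8_t PermilImm;         // 8-bit immediate for vpermilps
};

SplatPlan planSplat8x32(unsigned Elt) {
  assert(Elt < 8 && "v8i32/v8f32 has eight elements");
  // Position of the element inside its own 128-bit lane (0..3).
  unsigned InLane = Elt % 4;
  // vpermilps uses four 2-bit selectors; a splat repeats the same selector,
  // so the immediate is InLane * 0b01010101.
  uint8_t Imm = static_cast<uint8_t>(InLane * 0x55);
  return {Elt >= 4, Imm};
}
```

With this sketch, element 0 (funcG) gives immediate 0 with no extract, and element 5 (funcH) gives immediate 85 with an extract of the upper lane, which matches the CHECK lines. In-lane index 3 gives 0xFF, which reads as -1 when the byte is printed as signed.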

> Thanks,
> Nadav
>
> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Bruno Cardoso Lopes
> Sent: Thursday, August 11, 2011 05:50
> To: llvm-commits at cs.uiuc.edu
> Subject: [llvm-commits] [llvm] r137296 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/avx-splat.ll
>
> Author: bruno
> Date: Wed Aug 10 21:49:44 2011
> New Revision: 137296
>
> URL: http://llvm.org/viewvc/llvm-project?rev=137296&view=rev
> Log:
> Splats for v8i32/v8f32 can be handled by VPERMILPSY. This was causing
> infinite recursive calls in legalize. Fix PR10562
>
> Modified:
>    llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
>    llvm/trunk/test/CodeGen/X86/avx-splat.ll
>
> Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=137296&r1=137295&r2=137296&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Aug 10 21:49:44 2011
> @@ -4066,11 +4066,11 @@
>   return DAG.getVectorShuffle(VT, dl, V1, V2, &Mask[0]);
>  }
>
> -// PromoteSplatv8v16 - All i16 and i8 vector types can't be used directly by
> +// PromoteSplati8i16 - All i16 and i8 vector types can't be used directly by
>  // a generic shuffle instruction because the target has no such instructions.
>  // Generate shuffles which repeat i16 and i8 several times until they can be
>  // represented by v4f32 and then be manipulated by target suported shuffles.
> -static SDValue PromoteSplatv8v16(SDValue V, SelectionDAG &DAG, int &EltNo) {
> +static SDValue PromoteSplati8i16(SDValue V, SelectionDAG &DAG, int &EltNo) {
>   EVT VT = V.getValueType();
>   int NumElems = VT.getVectorNumElements();
>   DebugLoc dl = V.getDebugLoc();
> @@ -4162,8 +4162,9 @@
>   }
>
>   // Make this 128-bit vector duplicate i8 and i16 elements
> -  if (NumElems > 4)
> -    V1 = PromoteSplatv8v16(V1, DAG, EltNo);
> +  EVT EltVT = SrcVT.getVectorElementType();
> +  if (NumElems > 4 && (EltVT == MVT::i8 || EltVT == MVT::i16))
> +    V1 = PromoteSplati8i16(V1, DAG, EltNo);
>
>   // Recreate the 256-bit vector and place the same 128-bit vector
>   // into the low and high part. This is necessary because we want
> @@ -6027,8 +6028,7 @@
>       return PromoteVectorToScalarSplat(SVOp, DAG);
>
>     // Handle splats by matching through known shuffle masks
> -    if ((VT.is128BitVector() && NumElem <= 4) ||
> -        (VT.is256BitVector() && NumElem <= 8))
> +    if (VT.is128BitVector() && NumElem <= 4)
>       return SDValue();
>
>     // All i16 and i8 vector types can't be used directly by a generic shuffle
>
> Modified: llvm/trunk/test/CodeGen/X86/avx-splat.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-splat.ll?rev=137296&r1=137295&r2=137296&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/avx-splat.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/avx-splat.ll Wed Aug 10 21:49:44 2011
> @@ -51,8 +51,9 @@
>  ; To:
>  ;   shuffle (vload ptr)), undef, <1, 1, 1, 1>
>  ; CHECK: vmovaps
> -; CHECK-NEXT: vpextrd
> -define void @funcE() nounwind {
> +; CHECK-NEXT: vinsertf128  $1
> +; CHECK-NEXT: vpermilps $-1
> +define <8 x float> @funcE() nounwind {
>  allocas:
>   %udx495 = alloca [18 x [18 x float]], align 32
>   br label %for_test505.preheader
> @@ -74,7 +75,7 @@
>
>  __load_and_broadcast_32.exit1249:                 ; preds = %load.i1247, %for_exit499
>   %load_broadcast12281250 = phi <8 x float> [ %phitmp, %load.i1247 ], [ undef, %for_exit499 ]
> -  ret void
> +  ret <8 x float> %load_broadcast12281250
>  }
>
>  ; CHECK: vpshufd  $0
> @@ -87,3 +88,20 @@
>   ret <8 x float> %tmp
>  }
>
> +; CHECK: vinsertf128  $1
> +; CHECK-NEXT: vpermilps  $0
> +define <8 x float> @funcG(<8 x float> %a) nounwind uwtable readnone ssp {
> +entry:
> +  %shuffle = shufflevector <8 x float> %a, <8 x float> undef, <8 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
> +  ret <8 x float> %shuffle
> +}
> +
> +; CHECK: vextractf128  $1
> +; CHECK-NEXT: vinsertf128  $1
> +; CHECK-NEXT: vpermilps  $85
> +define <8 x float> @funcH(<8 x float> %a) nounwind uwtable readnone ssp {
> +entry:
> +  %shuffle = shufflevector <8 x float> %a, <8 x float> undef, <8 x i32> <i32 5, i32 5, i32 5, i32 5, i32 5, i32 5, i32 5, i32 5>
> +  ret <8 x float> %shuffle
> +}
> +
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
>

-- 
Bruno Cardoso Lopes
http://www.brunocardoso.cc