[PATCH] [DAGCombine] Fix a bug in a BUILD_VECTOR combine

Tue Mar 3 13:08:48 PST 2015

LGTM. 

> On Mar 3, 2015, at 12:57 PM, Michael Kuperstein <michael.m.kuperstein at intel.com> wrote:
> 
> Hi spatel, andreadb,
> 
> When converting a BUILD_VECTOR into a shuffle, we try to split an input vector that is twice as wide as the destination.
> We cannot do this when we also need the zero vector to create a blend, since the shuffle would then need three inputs.
> 
> This fixes PR22774.
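
To spell out the operand counting behind this restriction, here is a minimal standalone sketch. It is a toy model and not the DAGCombiner code: the names BuildVectorInputs, shuffleOperandsNeeded and canBecomeShuffle are hypothetical and exist only for illustration. It assumes a shuffle takes at most two vector operands, and that splitting a double-width input yields a low and a high half that each occupy one operand.

```cpp
#include <cassert>

// Hypothetical summary of where a BUILD_VECTOR's elements come from.
struct BuildVectorInputs {
  bool firstInputIsDoubleWidth; // first source is twice as wide as the result
  bool hasSecondInput;          // a second source vector is also referenced
  bool usesZeroVector;          // some lanes are constant zero (blend with zero)
};

// Count the vector operands a shuffle would need to express the BUILD_VECTOR.
int shuffleOperandsNeeded(const BuildVectorInputs &In) {
  // Splitting a double-width source produces a low half and a high half.
  int Needed = In.firstInputIsDoubleWidth ? 2 : 1;
  if (In.hasSecondInput)
    ++Needed;
  if (In.usesZeroVector)
    ++Needed; // the zero vector is itself a shuffle operand
  return Needed;
}

// Assumption of this model: a shuffle has exactly two operand slots.
bool canBecomeShuffle(const BuildVectorInputs &In) {
  return shuffleOperandsNeeded(In) <= 2;
}
```

For the PR22774 test case (a <4 x i64> source feeding a <2 x i64> result with a zero in lane 1), this model counts three operands, so canBecomeShuffle returns false. That matches the bail-out the patch adds.
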
> 
> http://reviews.llvm.org/D8040
> 
> Files:
>  lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>  test/CodeGen/X86/pr22774.ll
> 
> Index: test/CodeGen/X86/pr22774.ll
> ===================================================================
> --- test/CodeGen/X86/pr22774.ll
> +++ test/CodeGen/X86/pr22774.ll
> @@ -0,0 +1,20 @@
> +; RUN: llc -mattr=avx %s -o - | FileCheck %s
> +
> +target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
> +target triple = "x86_64-pc-linux"
> +
> +@in = global <4 x i64> <i64 -1, i64 -1, i64 -1, i64 -1>, align 32
> +@out = global <2 x i64> zeroinitializer, align 16
> +
> +define i32 @_Z3foov() #0 {
> +entry:
> +; CHECK: vmovdqa in(%rip), %ymm0
> +; CHECK-NEXT: vmovq %xmm0, %xmm0
> +; CHECK-NEXT: vmovdqa %xmm0, out(%rip)
> +  %0 = load <4 x i64>, <4 x i64>* @in, align 32
> +  %vecext = extractelement <4 x i64> %0, i32 0
> +  %vecinit = insertelement <2 x i64> undef, i64 %vecext, i32 0
> +  %vecinit1 = insertelement <2 x i64> %vecinit, i64 0, i32 1
> +  store <2 x i64> %vecinit1, <2 x i64>* @out, align 16
> +  ret i32 0
> +}
> Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> ===================================================================
> --- lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> +++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> @@ -11361,7 +11361,9 @@
>       } else if (VecInT.getSizeInBits() == VT.getSizeInBits() * 2) {
>         // If the input vector is too large, try to split it.
>         // We don't support having two input vectors that are too large.
> -        if (VecIn2.getNode())
> +        // If the zero vector was used, we can not split the vector,
> +        // since we'd need 3 inputs.
> +        if (UsesZeroVector || VecIn2.getNode())
>           return SDValue();
> 
>         if (!TLI.isExtractSubvectorCheap(VT, VT.getVectorNumElements()))
> @@ -11373,7 +11375,6 @@
>           DAG.getConstant(VT.getVectorNumElements(), TLI.getVectorIdxTy()));
>         VecIn1 = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, VT, VecIn1,
>           DAG.getConstant(0, TLI.getVectorIdxTy()));
> -        UsesZeroVector = false;
>       } else
>         return SDValue();
>     }
> 