[PATCH] Vselect improvements part 2

Nadav Rotem nrotem at apple.com
Tue Jun 25 00:12:35 PDT 2013


Matt, 

This change looks incorrect.  The function SplitVecOp_VSETCC should split setcc nodes. Your change makes it convert the i1s return type into OutVT which may not be a legal type. The last part is especially suspicious. Why are you handling VSETCCs that return vector i1s or something else ? The VSETCCs need to be consistent. I think that at this phase they are all i1s.  

Nadav  

On Jun 24, 2013, at 11:55 PM, Matt Arsenault <Matthew.Arsenault at amd.com> wrote:

>  Update failing ARM test.
> 
>  This patch changes the result of splitting a vsetcc. Before this patch, the boolean vector is constructed and then legalized with PromoteTargetBoolean. This instead changes the type of the setccs and concats them.
> 
>  For example, in the changed X86 test this would originally transform
>  v8i16 = setcc (v8f32 = build_vector ...) (v8f32 = build_vector ...)
>  to
>  v8i16 = sign_extend (concat_vectors (v4i1 = setcc ...) (v4i1 = setcc ...))
> 
>  This patch changes the transformation to
>  v8i16 = concat_vectors (v4i16 = setcc ...) (v4i16 = setcc ...)
> 
>  which no longer has the sign_extend on the reconstructed vector. Since the only meaningful result comes out of the lower half of the vector due to the extractelement, the upper half is eventually removed this way, which didn't happen before.
> 
>  Perhaps this would be better handled somewhere in DAGCombiner? It seems like something else should be shrinking vectors when halves of them aren't used.
> 
> Hi nadav,
> 
> http://llvm-reviews.chandlerc.com/D904
> 
> CHANGE SINCE LAST DIFF
>  http://llvm-reviews.chandlerc.com/D904?vs=2218&id=2551#toc
> 
> Files:
>  lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
>  test/CodeGen/ARM/vselect_imax.ll
>  test/CodeGen/X86/2011-10-21-widen-cmp.ll
> 
> Index: lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
> ===================================================================
> --- lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
> +++ lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
> @@ -1363,16 +1363,21 @@
>   SDLoc DL(N);
>   GetSplitVector(N->getOperand(0), Lo0, Hi0);
>   GetSplitVector(N->getOperand(1), Lo1, Hi1);
> -  unsigned PartElements = Lo0.getValueType().getVectorNumElements();
> -  EVT PartResVT = EVT::getVectorVT(*DAG.getContext(), MVT::i1, PartElements);
> -  EVT WideResVT = EVT::getVectorVT(*DAG.getContext(), MVT::i1, 2*PartElements);
> 
> -  LoRes = DAG.getNode(ISD::SETCC, DL, PartResVT, Lo0, Lo1, N->getOperand(2));
> -  HiRes = DAG.getNode(ISD::SETCC, DL, PartResVT, Hi0, Hi1, N->getOperand(2));
> -  SDValue Con = DAG.getNode(ISD::CONCAT_VECTORS, DL, WideResVT, LoRes, HiRes);
> -  return PromoteTargetBoolean(Con, N->getValueType(0));
> -}
> +  EVT ResVT = N->getValueType(0);
> +  EVT InVT = Lo0.getValueType();
> 
> +  EVT OutVT = EVT::getVectorVT(*DAG.getContext(),
> +                               ResVT.getVectorElementType(),
> +                               InVT.getVectorNumElements());
> +
> +  LoRes = DAG.getNode(ISD::SETCC, DL, OutVT, Lo0, Lo1, N->getOperand(2));
> +  HiRes = DAG.getNode(ISD::SETCC, DL, OutVT, Hi0, Hi1, N->getOperand(2));
> +  SDValue Con = DAG.getNode(ISD::CONCAT_VECTORS, DL, ResVT, LoRes, HiRes);
> +  return ResVT.getVectorElementType() == MVT::i1
> +       ? PromoteTargetBoolean(Con, N->getValueType(0))
> +       : Con;
> +}
> 
> SDValue DAGTypeLegalizer::SplitVecOp_FP_ROUND(SDNode *N) {
>   // The result has a legal vector type, but the input needs splitting.
> Index: test/CodeGen/ARM/vselect_imax.ll
> ===================================================================
> --- test/CodeGen/ARM/vselect_imax.ll
> +++ test/CodeGen/ARM/vselect_imax.ll
> @@ -20,11 +20,12 @@
>   %v0 = load %T0_10* %loadaddr
>   %v1 = load %T0_10* %loadaddr2
>   %c = icmp slt %T0_10 %v0, %v1
> -; CHECK: vst1
> -; CHECK: vst1
> -; CHECK: vst1
> -; CHECK: vst1
> -; CHECK: vld
> +; CHECK: vcgt.s16
> +; CHECK: vbsl
> +; CHECK: vcgt.s16
> +; CHECK: vbsl
> +; CHECK: vst1.64
> +; CHECK: vst1.64
> ; COST: func_blend10
> ; COST: cost of 40 {{.*}} select
>   %r = select %T1_10 %c, %T0_10 %v0, %T0_10 %v1
> @@ -39,10 +40,12 @@
>   %v0 = load %T0_14* %loadaddr
>   %v1 = load %T0_14* %loadaddr2
>   %c = icmp slt %T0_14 %v0, %v1
> -; CHECK: strb
> -; CHECK: strb
> -; CHECK: strb
> -; CHECK: strb
> +; CHECK: vcgt.s32
> +; CHECK: vbsl
> +; CHECK: vcgt.s32
> +; CHECK: vbsl
> +; CHECK: vst1.64
> +; CHECK: vst1.64
> ; COST: func_blend14
> ; COST: cost of 41 {{.*}} select
>   %r = select %T1_14 %c, %T0_14 %v0, %T0_14 %v1
> @@ -57,10 +60,10 @@
>   %v0 = load %T0_15* %loadaddr
>   %v1 = load %T0_15* %loadaddr2
>   %c = icmp slt %T0_15 %v0, %v1
> -; CHECK: strb
> -; CHECK: strb
> -; CHECK: strb
> -; CHECK: strb
> +; CHECK: vst1.64
> +; CHECK: vst1.64
> +; CHECK: vst1.64
> +; CHECK: vst1.64
> ; COST: func_blend15
> ; COST: cost of 82 {{.*}} select
>   %r = select %T1_15 %c, %T0_15 %v0, %T0_15 %v1
> @@ -75,10 +78,10 @@
>   %v0 = load %T0_18* %loadaddr
>   %v1 = load %T0_18* %loadaddr2
>   %c = icmp slt %T0_18 %v0, %v1
> -; CHECK: strh
> -; CHECK: strh
> -; CHECK: strh
> -; CHECK: strh
> +; CHECK: vbsl
> +; CHECK: vst1.64
> +; CHECK: vbsl
> +; CHECK: vst1.64
> ; COST: func_blend18
> ; COST: cost of 19 {{.*}} select
>   %r = select %T1_18 %c, %T0_18 %v0, %T0_18 %v1
> @@ -93,10 +96,10 @@
>   %v0 = load %T0_19* %loadaddr
>   %v1 = load %T0_19* %loadaddr2
>   %c = icmp slt %T0_19 %v0, %v1
> -; CHECK: strb
> -; CHECK: strb
> -; CHECK: strb
> -; CHECK: strb
> +; CHECK: vbsl
> +; CHECK: vbsl
> +; CHECK: vbsl
> +; CHECK: vbsl
> ; COST: func_blend19
> ; COST: cost of 50 {{.*}} select
>   %r = select %T1_19 %c, %T0_19 %v0, %T0_19 %v1
> @@ -111,10 +114,14 @@
>   %v0 = load %T0_20* %loadaddr
>   %v1 = load %T0_20* %loadaddr2
>   %c = icmp slt %T0_20 %v0, %v1
> -; CHECK: strb
> -; CHECK: strb
> -; CHECK: strb
> -; CHECK: strb
> +; CHECK: vbsl
> +; CHECK: vbsl
> +; CHECK: vbsl
> +; CHECK: vbsl
> +; CHECK: vbsl
> +; CHECK: vbsl
> +; CHECK: vbsl
> +; CHECK: vbsl
> ; COST: func_blend20
> ; COST: cost of 100 {{.*}} select
>   %r = select %T1_20 %c, %T0_20 %v0, %T0_20 %v1
> Index: test/CodeGen/X86/2011-10-21-widen-cmp.ll
> ===================================================================
> --- test/CodeGen/X86/2011-10-21-widen-cmp.ll
> +++ test/CodeGen/X86/2011-10-21-widen-cmp.ll
> @@ -30,7 +30,11 @@
> }
> 
> ; CHECK: mp_11193
> -; CHECK: psraw   $15
> +; CHECK: cmpltps
> +; CHECK: pextrb $0
> +; CHECK: shlb $7
> +; CHECK: sarb $7
> +; CHECK: cvtsi2ssl
> ; CHECK: ret
> define void @mp_11193(<8 x float> * nocapture %aFOO, <8 x float>* nocapture %RET)
> nounwind {
> <D904.2.patch>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130625/faed4d17/attachment.html>


More information about the llvm-commits mailing list