[llvm-commits] [llvm] r162195 - /llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Mon Aug 20 01:46:22 PDT 2012

Hi Stepan,

> Author: dyatkovskiy
> Date: Mon Aug 20 02:57:06 2012
> New Revision: 162195
>
> URL: http://llvm.org/viewvc/llvm-project?rev=162195&view=rev
> Log:
> Fixed DAGCombiner bug (found and localized by James Malloy):
> The DAGCombiner tries to optimise a BUILD_VECTOR by checking if it
> consists purely of get_vector_elts from one or two source vectors. If
> so, it either makes a concat_vectors node or a shufflevector node.
>
> However, it doesn't check the element type width of the underlying
> vector, so if you have this sequence:
>
> Node0: v4i16 = ...
> Node1: i32 = extract_vector_elt Node0
> Node2: i32 = extract_vector_elt Node0
> Node3: v16i8 = BUILD_VECTOR Node1, Node2, ...
>
> It will attempt to:
>
> Node0:    v4i16 = ...
> NewNode1: v16i8 = concat_vectors Node0, ...
>
> Where this is actually invalid because the element width is completely
> different. This causes an assertion failure on DAG legalization stage.
>
> Fix:
> If output item type of BUILD_VECTOR differs from input item type.
> Make concat_vectors based on input element type and then bitcast it to the output vector type. So the case described above will transformed to:
> Node0:    v4i16 = ...
> NewNode1: v8i16 = concat_vectors Node0, ...
> NewNode2: v16i8 = bitcast NewNode1

using bitcast seems completely wrong to me, it should be a truncation in your
example.  Suppose the original v4i16 vector contains the elements <256, 256,
...>

Node0: v4i16 = ...
Node1: i32 = extract_vector_elt Node0

Here Node1 will hold the number 256 in the lower 16 bits, and possibly some junk
in the upper 16 bits.  I'm assuming you extract element 0 here.

Node2: i32 = extract_vector_elt Node0

Likewise for Node2, assuming you extract element 1 here.

Node3: v16i8 = BUILD_VECTOR Node1, Node2, ...

Since the type (i32) of Node1 and Node2 is bigger than the element type of the
vector (i8), their values are implicitly truncated to i8, so the final vector
Node3 holds <0, 0, ...>.

This is documented in llvm/CodeGen/ISDOpcodes.h.

Now consider the new code:

Node0:    v4i16 = ...

Holds <256, 256, ...>

NewNode1: v8i16 = concat_vectors Node0, ...

Holds <256, 256, ...>

NewNode2: v16i8 = bitcast NewNode1

Holds <0, 1, 0, 1, ...> which is different to in the original code.
So why do I say NewNode2 holds this value?  Bitcast doesn't change
bits, add any or remove any, it just views them in the new type.
Previously we had the following bits (I'm assuming a little-endian
machine):

   00000000 10000000 00000000 10000000
   \---------------/ \---------------/
   16 bit value 256  16 bit value 256

By doing a bitcast you are now viewing it as a series of 8 bit values:

   00000000 10000000 00000000 10000000
   \------/ \------/ \------/ \------/
    8 bits   8 bits   8 bits   8 bits

These 8 bit values are: 0, 1, 0, 1.

You need to truncate from v16i16 to v16i8, not bitcast from v8i16 to
v16i8.

Ciao, Duncan.

PS: You may disagree about whether the final values are 0, 1, 0, 1 or
0, 128, 0, 128.  That's because I didn't describe things completely
precisely.  However it doesn't matter, the point is that using bitcast
will give a wrong result.

>
>
> Modified:
>      llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=162195&r1=162194&r2=162195&view=diff
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Mon Aug 20 02:57:06 2012
> @@ -7876,9 +7876,29 @@
>         if (VecIn1.getValueType().getSizeInBits()*2 != VT.getSizeInBits())
>           return SDValue();
>
> -      // Widen the input vector by adding undef values.
> -      VecIn1 = DAG.getNode(ISD::CONCAT_VECTORS, N->getDebugLoc(), VT,
> -                           VecIn1, DAG.getUNDEF(VecIn1.getValueType()));
> +      // If the element type of the input vector is not the same as
> +      // the output element type, make concat_vectors based on input element
> +      // type and then bitcast it to the output vector type.
> +      //
> +      // In another words avoid nodes like this:
> +      //  <NODE> v16i8 = concat_vectors v4i16 v4i16
> +      // Replace it with this one:
> +      //  <NODE0> v8i16 = concat_vectors v4i16 v4i16
> +      //  <NODE1> v16i8 = bitcast NODE0
> +      EVT ItemType = VecIn1.getValueType().getVectorElementType();
> +      if (ItemType != VT.getVectorElementType()) {
> +        EVT ConcatVT = EVT::getVectorVT(*DAG.getContext(),
> +                                ItemType,
> +                                VecIn1.getValueType().getVectorNumElements()*2);
> +        // Widen the input vector by adding undef values.
> +        VecIn1 = DAG.getNode(ISD::CONCAT_VECTORS, dl, ConcatVT,
> +                             VecIn1, DAG.getUNDEF(VecIn1.getValueType()));
> +        VecIn1 = DAG.getNode(ISD::BITCAST, dl, VT, VecIn1);
> +      } else
> +        // Widen the input vector by adding undef values.
> +        VecIn1 = DAG.getNode(ISD::CONCAT_VECTORS, dl, VT,
> +                             VecIn1, DAG.getUNDEF(VecIn1.getValueType()));
> +
>       }
>
>       // If VecIn2 is unused then change it to undef.
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>