[llvm-commits] [llvm] r162195 - /llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Mon Aug 20 12:33:05 PDT 2012

Thanks! Very reasonable remarks. I'll try to make one more patch then ) 
-Stepan.

Duncan Sands wrote:
> Hi Stepan,
>
>> Author: dyatkovskiy
>> Date: Mon Aug 20 02:57:06 2012
>> New Revision: 162195
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=162195&view=rev
>> Log:
>> Fixed DAGCombiner bug (found and localized by James Malloy):
>> The DAGCombiner tries to optimise a BUILD_VECTOR by checking if it
>> consists purely of get_vector_elts from one or two source vectors. If
>> so, it either makes a concat_vectors node or a shufflevector node.
>>
>> However, it doesn't check the element type width of the underlying
>> vector, so if you have this sequence:
>>
>> Node0: v4i16 = ...
>> Node1: i32 = extract_vector_elt Node0
>> Node2: i32 = extract_vector_elt Node0
>> Node3: v16i8 = BUILD_VECTOR Node1, Node2, ...
>>
>> It will attempt to:
>>
>> Node0:    v4i16 = ...
>> NewNode1: v16i8 = concat_vectors Node0, ...
>>
>> Where this is actually invalid because the element width is completely
>> different. This causes an assertion failure on DAG legalization stage.
>>
>> Fix:
>> If output item type of BUILD_VECTOR differs from input item type.
>> Make concat_vectors based on input element type and then bitcast it to the output vector type. So the case described above will transformed to:
>> Node0:    v4i16 = ...
>> NewNode1: v8i16 = concat_vectors Node0, ...
>> NewNode2: v16i8 = bitcast NewNode1
>
> using bitcast seems completely wrong to me, it should be a truncation in your
> example.  Suppose the original v4i16 vector contains the elements <256, 256,
> ...>
>
> Node0: v4i16 = ...
> Node1: i32 = extract_vector_elt Node0
>
> Here Node1 will hold the number 256 in the lower 16 bits, and possibly some junk
> in the upper 16 bits.  I'm assuming you extract element 0 here.
>
> Node2: i32 = extract_vector_elt Node0
>
> Likewise for Node2, assuming you extract element 1 here.
>
> Node3: v16i8 = BUILD_VECTOR Node1, Node2, ...
>
> Since the type (i32) of Node1 and Node2 is bigger than the element type of the
> vector (i8), their values are implicitly truncated to i8, so the final vector
> Node3 holds <0, 0, ...>.
>
> This is documented in llvm/CodeGen/ISDOpcodes.h.
>
> Now consider the new code:
>
> Node0:    v4i16 = ...
>
> Holds <256, 256, ...>
>
> NewNode1: v8i16 = concat_vectors Node0, ...
>
> Holds <256, 256, ...>
>
> NewNode2: v16i8 = bitcast NewNode1
>
> Holds <0, 1, 0, 1, ...> which is different to in the original code.
> So why do I say NewNode2 holds this value?  Bitcast doesn't change
> bits, add any or remove any, it just views them in the new type.
> Previously we had the following bits (I'm assuming a little-endian
> machine):
>
>     00000000 10000000 00000000 10000000
>     \---------------/ \---------------/
>     16 bit value 256  16 bit value 256
>
> By doing a bitcast you are now viewing it as a series of 8 bit values:
>
>     00000000 10000000 00000000 10000000
>     \------/ \------/ \------/ \------/
>      8 bits   8 bits   8 bits   8 bits
>
> These 8 bit values are: 0, 1, 0, 1.
>
> You need to truncate from v16i16 to v16i8, not bitcast from v8i16 to
> v16i8.
>
> Ciao, Duncan.
>
> PS: You may disagree about whether the final values are 0, 1, 0, 1 or
> 0, 128, 0, 128.  That's because I didn't describe things completely
> precisely.  However it doesn't matter, the point is that using bitcast
> will give a wrong result.
>
>>
>>
>> Modified:
>>       llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>>
>> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=162195&r1=162194&r2=162195&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
>> +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Mon Aug 20 02:57:06 2012
>> @@ -7876,9 +7876,29 @@
>>          if (VecIn1.getValueType().getSizeInBits()*2 != VT.getSizeInBits())
>>            return SDValue();
>>
>> -      // Widen the input vector by adding undef values.
>> -      VecIn1 = DAG.getNode(ISD::CONCAT_VECTORS, N->getDebugLoc(), VT,
>> -                           VecIn1, DAG.getUNDEF(VecIn1.getValueType()));
>> +      // If the element type of the input vector is not the same as
>> +      // the output element type, make concat_vectors based on input element
>> +      // type and then bitcast it to the output vector type.
>> +      //
>> +      // In another words avoid nodes like this:
>> +      //  <NODE> v16i8 = concat_vectors v4i16 v4i16
>> +      // Replace it with this one:
>> +      //  <NODE0> v8i16 = concat_vectors v4i16 v4i16
>> +      //  <NODE1> v16i8 = bitcast NODE0
>> +      EVT ItemType = VecIn1.getValueType().getVectorElementType();
>> +      if (ItemType != VT.getVectorElementType()) {
>> +        EVT ConcatVT = EVT::getVectorVT(*DAG.getContext(),
>> +                                ItemType,
>> +                                VecIn1.getValueType().getVectorNumElements()*2);
>> +        // Widen the input vector by adding undef values.
>> +        VecIn1 = DAG.getNode(ISD::CONCAT_VECTORS, dl, ConcatVT,
>> +                             VecIn1, DAG.getUNDEF(VecIn1.getValueType()));
>> +        VecIn1 = DAG.getNode(ISD::BITCAST, dl, VT, VecIn1);
>> +      } else
>> +        // Widen the input vector by adding undef values.
>> +        VecIn1 = DAG.getNode(ISD::CONCAT_VECTORS, dl, VT,
>> +                             VecIn1, DAG.getUNDEF(VecIn1.getValueType()));
>> +
>>        }
>>
>>        // If VecIn2 is unused then change it to undef.
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>