[llvm-commits] [PATCH] Fix assertion failure with shufflevectors on ARM

Mon Sep 10 00:52:38 PDT 2012

ping
Stepan Dyatkovskiy wrote:
> ping.
>  >>>> Hi all!
>>>>> Due to Duncan remarks for r162195:
>>>>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120820/148595.html
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I reverted this patch and proceed debugging the problem. Now it seems
>>>>> that I found proper places for fix.
>>>>>
>>>>> I found that crashes, described by James in initial post are happened
>>>>> due to two bugs.
>>>>>
>>>>> Bug #1. We insert wrong concat_vectors operation while expanding
>>>>> build_vector in DAGCombiner::visitBUILD_VECTOR:
>>>>> v16i8 concat_vectors v4i16, v4i16
>>>>> That happens due to improperly formed condition: when we can expand
>>>>> build_vector and when we can't. We support widening of vectors which
>>>>> are
>>>>> half the size of the output registers.
>>>>> In TOT you can found the check:
>>>>> if (VecIn1.getValueType().getSizeInBits()*2 != VT.getSizeInBits())
>>>>>    return SDValue()
>>>>> I think, it is wrong, since we should check number of elements instead
>>>>> of bit width. This check doesn't take into account input and output
>>>>> vectors number. It also takes into account bit width, but it doesn't
>>>>> matter: if we produce illegal operation (input width != output
>>>>> width) it
>>>>> will be legalized later.
>>>>>
>>>>> Bug #2. Fired when LegalizeDAG tries to produce extract_vector_elt
>>>>> with
>>>>> operands that are build_vector nodes. getNode tries to return
>>>>> build_vector's operand in that case, but it also inserts the nodes,
>>>>> that
>>>>> performs high bits cleaning from garbage:
>>>>> Node0: i32 some-node-here
>>>>> Node1: v2i16 build_vector Node0, Node0
>>>>>
>>>>> // Current nodes sequence inserted when we invoke
>>>>> // getNode(EXTRACT_VECTOR_ELT, dl, i32, Node0, idx=0)
>>>>> Node1: i16 truncate Node0
>>>>> Node2: i32 any_extend Node1
>>>>> // return Node2 instead of extract_vector_elt node.
>>>>>
>>>>> I'm 99,9(9)% sure, that it is wrong way, due to next reasons:
>>>>>   1. 16 may be illegal type (even if v2i16 is legal), so, at least you
>>>>> need to check  CombineLevel, it should be BeforeLegalizeTypes. In
>>>>> another case you will crash the LegalizeDAG.
>>>>>   2. What if initially elements were i8 and then was transformed since
>>>>> target is 32bit machine?
>>>>>   Node0: i8 some-node-here
>>>>>   Node1: v2i8 build_vector Node0, NodeItem1, NodeItem2, NodeItem3
>>>>>   After type legalization we got current state:
>>>>>   Node0: i32 some-node-here
>>>>>   Node1: v2i16 build_vector Node0, NodeItem1, NodeItem2, NodeItem3
>>>>>   But you can't determine initial type (was it 8bit or may be
>>>>> 4bit?) on
>>>>> this stage. So 'i16 = truncate Node0' will not guarantee high bits
>>>>> cleanness.
>>>>>   3. build_vector operands were already extended to
>>>>>   machine word width (32 bit in our case) with some way. We needn't do
>>>>>   the same job twice.
>>>>>
>>>>> Please find the fix in attachment.
>>>>>
>>>>> -Stepan.
>>>>>
>>>>> Eli Friedman wrote:
>>>>>> On Wed, Aug 15, 2012 at 2:55 AM, Stepan Dyatkovskiy
>>>>>> <stpworld at narod.ru> wrote:
>>>>>>> Hi Eli,
>>>>>>> Please review the patch with bitcasting after concat_vectors.
>>>>>>
>>>>>> Looks fine.
>>>>>>
>>>>>> -Eli
>>>>>>
>>>>>
>>>>
>>>
>>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits