[llvm-commits] [PATCH] Fix assertion failure with shufflevectors on ARM
Stepan Dyatkovskiy
stpworld at narod.ru
Sun Aug 26 11:59:21 PDT 2012
Hi all!
Due to Duncan remarks for r162195:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120820/148595.html
I reverted this patch and proceed debugging the problem. Now it seems
that I found proper places for fix.
I found that crashes, described by James in initial post are happened
due to two bugs.
Bug #1. We insert wrong concat_vectors operation while expanding
build_vector in DAGCombiner::visitBUILD_VECTOR:
v16i8 concat_vectors v4i16, v4i16
That happens due to improperly formed condition: when we can expand
build_vector and when we can't. We support widening of vectors which are
half the size of the output registers.
In TOT you can found the check:
if (VecIn1.getValueType().getSizeInBits()*2 != VT.getSizeInBits())
return SDValue()
I think, it is wrong, since we should check number of elements instead
of bit width. This check doesn't take into account input and output
vectors number. It also takes into account bit width, but it doesn't
matter: if we produce illegal operation (input width != output width) it
will be legalized later.
Bug #2. Fired when LegalizeDAG tries to produce extract_vector_elt with
operands that are build_vector nodes. getNode tries to return
build_vector's operand in that case, but it also inserts the nodes, that
performs high bits cleaning from garbage:
Node0: i32 some-node-here
Node1: v2i16 build_vector Node0, Node0
// Current nodes sequence inserted when we invoke
// getNode(EXTRACT_VECTOR_ELT, dl, i32, Node0, idx=0)
Node1: i16 truncate Node0
Node2: i32 any_extend Node1
// return Node2 instead of extract_vector_elt node.
I'm 99,9(9)% sure, that it is wrong way, due to next reasons:
1. 16 may be illegal type (even if v2i16 is legal), so, at least you
need to check CombineLevel, it should be BeforeLegalizeTypes. In
another case you will crash the LegalizeDAG.
2. What if initially elements were i8 and then was transformed since
target is 32bit machine?
Node0: i8 some-node-here
Node1: v2i8 build_vector Node0, NodeItem1, NodeItem2, NodeItem3
After type legalization we got current state:
Node0: i32 some-node-here
Node1: v2i16 build_vector Node0, NodeItem1, NodeItem2, NodeItem3
But you can't determine initial type (was it 8bit or may be 4bit?) on
this stage. So 'i16 = truncate Node0' will not guarantee high bits
cleanness.
3. build_vector operands were already extended to
machine word width (32 bit in our case) with some way. We needn't do
the same job twice.
Please find the fix in attachment.
-Stepan.
Eli Friedman wrote:
> On Wed, Aug 15, 2012 at 2:55 AM, Stepan Dyatkovskiy <stpworld at narod.ru> wrote:
>> Hi Eli,
>> Please review the patch with bitcasting after concat_vectors.
>
> Looks fine.
>
> -Eli
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shuffle-vector-3.patch
Type: text/x-patch
Size: 3377 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120826/74250038/attachment.bin>
More information about the llvm-commits
mailing list