[llvm-commits] [PATCH] Fix assertion failure with shufflevectors on ARM

Sun Aug 26 11:59:21 PDT 2012

Hi all!
Due to Duncan remarks for r162195:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120820/148595.html
I reverted this patch and proceed debugging the problem. Now it seems 
that I found proper places for fix.

I found that crashes, described by James in initial post are happened 
due to two bugs.

Bug #1. We insert wrong concat_vectors operation while expanding 
build_vector in DAGCombiner::visitBUILD_VECTOR:
v16i8 concat_vectors v4i16, v4i16
That happens due to improperly formed condition: when we can expand 
build_vector and when we can't. We support widening of vectors which are 
half the size of the output registers.
In TOT you can found the check:
if (VecIn1.getValueType().getSizeInBits()*2 != VT.getSizeInBits())
   return SDValue()
I think, it is wrong, since we should check number of elements instead 
of bit width. This check doesn't take into account input and output 
vectors number. It also takes into account bit width, but it doesn't 
matter: if we produce illegal operation (input width != output width) it 
will be legalized later.

Bug #2. Fired when LegalizeDAG tries to produce extract_vector_elt with 
operands that are build_vector nodes. getNode tries to return 
build_vector's operand in that case, but it also inserts the nodes, that 
performs high bits cleaning from garbage:
Node0: i32 some-node-here
Node1: v2i16 build_vector Node0, Node0

// Current nodes sequence inserted when we invoke
// getNode(EXTRACT_VECTOR_ELT, dl, i32, Node0, idx=0)
Node1: i16 truncate Node0
Node2: i32 any_extend Node1
// return Node2 instead of extract_vector_elt node.

I'm 99,9(9)% sure, that it is wrong way, due to next reasons:
  1. 16 may be illegal type (even if v2i16 is legal), so, at least you 
need to check  CombineLevel, it should be BeforeLegalizeTypes. In 
another case you will crash the LegalizeDAG.
  2. What if initially elements were i8 and then was transformed since 
target is 32bit machine?
  Node0: i8 some-node-here
  Node1: v2i8 build_vector Node0, NodeItem1, NodeItem2, NodeItem3
  After type legalization we got current state:
  Node0: i32 some-node-here
  Node1: v2i16 build_vector Node0, NodeItem1, NodeItem2, NodeItem3
  But you can't determine initial type (was it 8bit or may be 4bit?) on 
this stage. So 'i16 = truncate Node0' will not guarantee high bits 
cleanness.
  3. build_vector operands were already extended to
  machine word width (32 bit in our case) with some way. We needn't do
  the same job twice.

Please find the fix in attachment.

-Stepan.

Eli Friedman wrote:
> On Wed, Aug 15, 2012 at 2:55 AM, Stepan Dyatkovskiy <stpworld at narod.ru> wrote:
>> Hi Eli,
>> Please review the patch with bitcasting after concat_vectors.
>
> Looks fine.
>
> -Eli
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: shuffle-vector-3.patch
Type: text/x-patch
Size: 3377 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120826/74250038/attachment.bin>