[PATCH] [AArch64] Fix a pattern match failure caused by creating improper CONCAT_VECTOR.

Mon Jun 16 22:44:39 PDT 2014

Hi Tim,

I have to admit this patch is a little complex to understand. I try to summarize what I did in this patch from the view of what I want to solve.

================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:4182-4183
@@ -4180,1 +4181,4 @@
       continue;
+    } else if (SourceVecs[i].getValueType().getVectorElementType() !=
+               VT.getVectorElementType()) {
+      if (SourceVecs[i].getOpcode() == ISD::AssertSext ||
----------------
This condition check is moved from below to here, because without this check, for dags like(This can be created from lowering 'tmp2 = fptosi <4 x double> %tmp1 to <4 x i16>'),

  A: v4i16 = build_vector B, C,D, E
    B: i32 = extract_vector_elt F, lane 0
      F: v2i32 = AssertSext v2i32, type i16
    C: i32 = extract_vector_elt F, lane 1
      F: v2i32 = AssertSext v2i32, type i16
    D: i32 = extract_vector_elt G, lane 0
      G: v2i32 = AssertSext v2i32, type i16
    E: i32 = extract_vector_elt G, lane 1
      G: v2i32 = AssertSext v2i32, type i16

That build_vector node will be lowering to a concat_vector combining 2 of v2i32 into v4i16, which is pattern match failed.

================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:4184-4185
@@ +4183,4 @@
+               VT.getVectorElementType()) {
+      if (SourceVecs[i].getOpcode() == ISD::AssertSext ||
+          SourceVecs[i].getOpcode() == ISD::AssertZext) {
+        // For AssertSext/AssertZext, we need to bitcast it to the vector which
----------------
Tim Northover wrote:
> Isn't the key type here VT.getVectorElementType()? I'm not sure I see the logic of caring about any AssertSext/AssertZext input: we're just going to be discarding the bits anyway in your example.
> 
> In the reverse case (say extracting from v4i16 and inserting into v4i32), as far as I can see EXTRACT_VECTOR_ELT will basically be doing an anyext, so we don't have to care there either.
> 
> Or do you have examples where this kind of check is needed?
If input cannot pass element type check, we can simply get function return and the incorrect concat_vector would not be generated. But from the result, I saw lots of MOV and INS instructions are generated for this build_vector. To get better result, I added additional codes here specically for AssertSext /AssertZext, wishing to generate single UZIP1 instead. For AssertSext /AssertZext, only low bits of each element are valid, so for a v2i32 AssertSext node asserting holds i16, it can bitcast to v4i16, which lane 0, 2 are matching prior lanes and lane 1, 3 are undefined. So all lane numbers extracting on AssertSext node should adjust by multiplying 2.  After that, this kind of build_vector can reconstruct to shuffle_vector and get single UZIP2 at last.

================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:4201-4202
@@ +4200,4 @@
+        SDValue BitCst = DAG.getNode(ISD::BITCAST, dl, AssertVT, SourceVecs[i]);
+        unsigned Pow = AssertVT.getVectorNumElements() /
+                       SourceVecs[i].getValueType().getVectorNumElements();
+        // Collect operands to create new BUILD_VECTOR node, lanes in extracting
----------------
Tim Northover wrote:
> This seems very analogous to the VExtOffsets vector, but is handled completely differently.
> 
> Instead of mangling the actual BUILD_VECTOR node, perhaps we should create a similar OffsetMultipliers (say) variable and just record what we've done for later in the function.
You mean rename Pow to OffsetMultipliers? I will do that in next version.

================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:4225
@@ +4224,3 @@
+        MinElts[i] *= Pow;
+        if (SourceVecs[i].getValueType() == VT) {
+          // No VEXT necessary
----------------
Tim Northover wrote:
> If this entire section of added code is moved to the start of the "for (unsigned i = 0; i < SourceVecs.size(); ++i)" loop, this block becomes redundant, we can just use the existing one at line 4176.
Agreed.

http://reviews.llvm.org/D4080