[PATCH] [AArch64] Fix a pattern match failure caused by creating improper CONCAT_VECTOR.
Kevin Qin
kevinqindev at gmail.com
Mon Jun 16 22:44:39 PDT 2014
Hi Tim,
I have to admit this patch is a little complex to understand. I try to summarize what I did in this patch from the view of what I want to solve.
================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:4182-4183
@@ -4180,1 +4181,4 @@
continue;
+ } else if (SourceVecs[i].getValueType().getVectorElementType() !=
+ VT.getVectorElementType()) {
+ if (SourceVecs[i].getOpcode() == ISD::AssertSext ||
----------------
This condition check is moved from below to here, because without this check, for dags like(This can be created from lowering 'tmp2 = fptosi <4 x double> %tmp1 to <4 x i16>'),
A: v4i16 = build_vector B, C,D, E
B: i32 = extract_vector_elt F, lane 0
F: v2i32 = AssertSext v2i32, type i16
C: i32 = extract_vector_elt F, lane 1
F: v2i32 = AssertSext v2i32, type i16
D: i32 = extract_vector_elt G, lane 0
G: v2i32 = AssertSext v2i32, type i16
E: i32 = extract_vector_elt G, lane 1
G: v2i32 = AssertSext v2i32, type i16
That build_vector node will be lowering to a concat_vector combining 2 of v2i32 into v4i16, which is pattern match failed.
================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:4184-4185
@@ +4183,4 @@
+ VT.getVectorElementType()) {
+ if (SourceVecs[i].getOpcode() == ISD::AssertSext ||
+ SourceVecs[i].getOpcode() == ISD::AssertZext) {
+ // For AssertSext/AssertZext, we need to bitcast it to the vector which
----------------
Tim Northover wrote:
> Isn't the key type here VT.getVectorElementType()? I'm not sure I see the logic of caring about any AssertSext/AssertZext input: we're just going to be discarding the bits anyway in your example.
>
> In the reverse case (say extracting from v4i16 and inserting into v4i32), as far as I can see EXTRACT_VECTOR_ELT will basically be doing an anyext, so we don't have to care there either.
>
> Or do you have examples where this kind of check is needed?
If input cannot pass element type check, we can simply get function return and the incorrect concat_vector would not be generated. But from the result, I saw lots of MOV and INS instructions are generated for this build_vector. To get better result, I added additional codes here specically for AssertSext /AssertZext, wishing to generate single UZIP1 instead. For AssertSext /AssertZext, only low bits of each element are valid, so for a v2i32 AssertSext node asserting holds i16, it can bitcast to v4i16, which lane 0, 2 are matching prior lanes and lane 1, 3 are undefined. So all lane numbers extracting on AssertSext node should adjust by multiplying 2. After that, this kind of build_vector can reconstruct to shuffle_vector and get single UZIP2 at last.
================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:4201-4202
@@ +4200,4 @@
+ SDValue BitCst = DAG.getNode(ISD::BITCAST, dl, AssertVT, SourceVecs[i]);
+ unsigned Pow = AssertVT.getVectorNumElements() /
+ SourceVecs[i].getValueType().getVectorNumElements();
+ // Collect operands to create new BUILD_VECTOR node, lanes in extracting
----------------
Tim Northover wrote:
> This seems very analogous to the VExtOffsets vector, but is handled completely differently.
>
> Instead of mangling the actual BUILD_VECTOR node, perhaps we should create a similar OffsetMultipliers (say) variable and just record what we've done for later in the function.
You mean rename Pow to OffsetMultipliers? I will do that in next version.
================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:4225
@@ +4224,3 @@
+ MinElts[i] *= Pow;
+ if (SourceVecs[i].getValueType() == VT) {
+ // No VEXT necessary
----------------
Tim Northover wrote:
> If this entire section of added code is moved to the start of the "for (unsigned i = 0; i < SourceVecs.size(); ++i)" loop, this block becomes redundant, we can just use the existing one at line 4176.
Agreed.
http://reviews.llvm.org/D4080
More information about the llvm-commits
mailing list