[PATCH] [AArch64] Fix a pattern match failure caused by creating improper CONCAT_VECTOR.
Tim Northover
t.p.northover at gmail.com
Mon Jun 16 03:09:26 PDT 2014
Hi Kevin,
Sorry it's taken me so long to get around to this one; it's rather a complicated patch (even in comparison to the surrounding code) and I wanted to make sure I did it justice.
I've got a few questions (at the end, as with usual Phab stuff)...
Cheers.
Tim.
================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:4184-4185
@@ +4183,4 @@
+ VT.getVectorElementType()) {
+ if (SourceVecs[i].getOpcode() == ISD::AssertSext ||
+ SourceVecs[i].getOpcode() == ISD::AssertZext) {
+ // For AssertSext/AssertZext, we need to bitcast it to the vector which
----------------
Isn't the key type here VT.getVectorElementType()? I'm not sure I see the logic of caring about any AssertSext/AssertZext input: we're just going to be discarding the bits anyway in your example.
In the reverse case (say extracting from v4i16 and inserting into v4i32), as far as I can see EXTRACT_VECTOR_ELT will basically be doing an anyext, so we don't have to care there either.
Or do you have examples where this kind of check is needed?
================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:4201-4202
@@ +4200,4 @@
+ SDValue BitCst = DAG.getNode(ISD::BITCAST, dl, AssertVT, SourceVecs[i]);
+ unsigned Pow = AssertVT.getVectorNumElements() /
+ SourceVecs[i].getValueType().getVectorNumElements();
+ // Collect operands to create new BUILD_VECTOR node, lanes in extracting
----------------
This seems very analogous to the VExtOffsets vector, but is handled completely differently.
Instead of mangling the actual BUILD_VECTOR node, perhaps we should create a similar OffsetMultipliers (say) variable and just record what we've done for later in the function.
================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:4225
@@ +4224,3 @@
+ MinElts[i] *= Pow;
+ if (SourceVecs[i].getValueType() == VT) {
+ // No VEXT necessary
----------------
If this entire section of added code is moved to the start of the "for (unsigned i = 0; i < SourceVecs.size(); ++i)" loop, this block becomes redundant, we can just use the existing one at line 4176.
http://reviews.llvm.org/D4080
More information about the llvm-commits
mailing list