[cfe-commits] [PATCH][ARM] Vext Lowering was missing opportunities

Quentin Colombet qcolombet at apple.com
Wed Oct 31 18:27:25 PDT 2012


Hi,

When working with ARM NEON intrinsics, I found a gap in how vext is lowered to assembly.
Basically, something like this:

vext a, b, imm

is perfectly lowered, whereas something like this:

vext a, a, imm

generates suboptimal code (how bad depends on whether a is 128 bits or 64 bits wide), and in no case the expected vext instruction.

The short story (the long story is at the very end of this mail, for those interested) is that the attached patch fixes this; the new test cases are included in the patch.

Cheers,

Quentin

----------
The long story

ARM doc: vext instruction extracts elements from the bottom end of the second operand vector and the top end of the first, concatenates them and places the result in the destination vector.
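
To make the quoted semantics concrete, here is a minimal C++ model of what vext computes. It is only an illustration and not part of the patch: the name vextModel and the choice of 8-bit lanes are my assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical reference model of vext (illustration only).
// Lane i of the result is lane (i + imm) of the concatenation
// first:second, where 'first' occupies the low lanes.
template <std::size_t N>
std::array<std::uint8_t, N> vextModel(const std::array<std::uint8_t, N>& first,
                                      const std::array<std::uint8_t, N>& second,
                                      std::size_t imm) {
  assert(imm < N && "the immediate must select a lane of the first operand");
  std::array<std::uint8_t, N> result{};
  for (std::size_t i = 0; i < N; ++i) {
    const std::size_t src = i + imm;
    // Top end of the first operand, then bottom end of the second.
    result[i] = src < N ? first[src] : second[src - N];
  }
  return result;
}
```

With both operands equal, this model rotates the lanes of the single input by imm positions.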

Now, the problem. When one writes something like:

vext a, a, imm

clang translates that into a shufflevector instruction with a sequence of integers (I am simplifying a bit) that gives the order in which the elements of both operands should appear in the result vector.
Assuming 'a' has 8 elements, they are numbered from 0 to 7 for the first operand and from 8 to 15 for the second operand.
For a vext, the sequence of integers has the following pattern: each element equals the previous element plus 1 (e.g. 2, 3, 4, 5).
This is the pattern that is matched in ARMISelLowering.
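
As a rough sketch of that check (this is not the code in ARMISelLowering; matchTwoOperandVext is a hypothetical name, and undef mask entries are ignored for simplicity), the two-operand pattern can be tested like this:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified matcher for the two-operand pattern.
// Returns the vext immediate if the mask is imm, imm+1, imm+2, ...
// and -1 otherwise. The real lowering code also has to cope with
// undef mask entries, which this sketch does not model.
int matchTwoOperandVext(const std::vector<int>& mask) {
  if (mask.empty())
    return -1;
  const int numElts = static_cast<int>(mask.size());
  const int imm = mask[0];
  // The first index must select a lane of the first operand.
  if (imm < 0 || imm >= numElts)
    return -1;
  for (int i = 1; i < numElts; ++i) {
    if (mask[i] != imm + i)
      return -1;
  }
  return imm;
}
```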

However, when both operands are the same, an instruction combine optimization (visitShuffleVectorInst) during the late emit pass breaks this pattern.
It transforms a, a into a, undef and updates the sequence of integers accordingly, i.e. all integers now point to the first operand (e.g. 2, 3, 4, 5 => 2, 3, 0, 1).
This pattern was not recognized as VEXT.
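
In other words, the folded mask is a rotation: the indices increase by one and wrap around at the number of elements. A minimal sketch of a matcher for that shape follows; it is not the code from the attached patch, matchSingleOperandVext is a hypothetical name, and undef mask entries are again left out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified matcher for the single-operand pattern.
// Returns the rotation amount (usable as the vext immediate with the
// same register for both operands) if the mask is
// imm, imm+1, ..., wrapping modulo the element count; -1 otherwise.
// A result of 0 means the mask is the identity shuffle.
int matchSingleOperandVext(const std::vector<int>& mask) {
  if (mask.empty())
    return -1;
  const int numElts = static_cast<int>(mask.size());
  const int imm = mask[0];
  if (imm < 0 || imm >= numElts)
    return -1;
  for (int i = 1; i < numElts; ++i) {
    // Every index must stay within the first operand.
    if (mask[i] != (imm + i) % numElts)
      return -1;
  }
  return imm;
}
```

For the example above, the mask 2, 3, 0, 1 yields 2, whereas the unfolded two-operand mask 2, 3, 4, 5 is rejected by this check.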

Note that vrev and vext (with an undef argument) are equivalent for some patterns (for a two-element vector, the mask 1, 0 is both a reversal and a rotation by one). Thus, I placed the new pattern matching after the vrev matching, instead of directly after the two-argument vext matching, so as not to change existing output, in particular in the vrev.ll tests.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ARMVextLowering.patch
Type: application/octet-stream
Size: 3554 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20121031/c120ee24/attachment.obj>