[PATCH] Improve DAG combine pass on certain IR vector patterns

Quentin Colombet qcolombet at apple.com
Fri Jan 16 16:03:44 PST 2015


On Jan 16, 2015, at 3:46 PM, Chandler Carruth <chandlerc at google.com> wrote:

> 
> On Fri, Jan 16, 2015 at 3:44 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>> On Jan 16, 2015, at 3:04 PM, Chandler Carruth <chandlerc at google.com> wrote:
>> 
>> 
>> On Fri, Jan 16, 2015 at 2:40 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>> Well, that may be the conclusion: The performance impact may be within the noise.
>> Since this kind of patterns are very specific, this is not surprising.
>> For the record, I tend to ignore the tests that run for less than 1 second (too noisy). Then, the noise level is usually around 1% on a quiet computer with fixed frequency, which is not too bad. 
>> 
>> Numbers would mostly be nice because I don't know if other targets have the thing that makes this such a huge win on x86 -- implicit concat with undef to form 2x-wide vectors.
>> 
>> This may be an x86-specific win, in which case it should just be added as a target-specific combine.
> 
> 
> We are filling half of a vector (v8), and we have to choose between concat two v4 to v8 and then shuffle, or shuffle in v4 and then concat with undef to have a v8.
> I’m not sure I necessarily see the “target specific part” here, isn’t it a sort of canonicalization in the DAG? 
> 
> I mean, I said "may". =]
> 
> It is possible that a target has very anemic support for v4 shuffles, but excellent support for v8 shuffles, and it is always better to promote to v8 early, and shuffle there. Seems unlikely, but possible.
> 
> Anyways, I'm fine if we're confident all the targets want it to work that way. Makes it easier. But we should check first, either by looking at their generated code and making sure it's good, and/or by running benchmarks.
> 
> I'll let Quentin decide if there are specific benchmarks he things need to be run before this goes in; I think the code as-is LGTM.

I’d say if the idea I previously through for the small ARM test case (v2i16, v4i16, and v8i16) does not give anything, then LGTM too!

Just one thing, I guess this is the fix for PR21943. Please mention that in the commit message as well.
Finally, I think the radar you are actually fixing is <rdar://problem/19287012>, not <rdar://problem/13326338> ;).

Thanks for fixing this by the way.

Q.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150116/d9dc13d6/attachment.html>


More information about the llvm-commits mailing list