[PATCH] Improve DAG combine pass on certain IR vector patterns
chandlerc at google.com
Fri Jan 16 15:46:51 PST 2015
On Fri, Jan 16, 2015 at 3:44 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
> On Jan 16, 2015, at 3:04 PM, Chandler Carruth <chandlerc at google.com>
> On Fri, Jan 16, 2015 at 2:40 PM, Quentin Colombet <qcolombet at apple.com>
>> Well, that may be the conclusion: The performance impact may be within
>> the noise.
>> Since this kind of patterns are very specific, this is not surprising.
>> For the record, I tend to ignore the tests that run for less than 1
>> second (too noisy). Then, the noise level is usually around 1% on a quiet
>> computer with fixed frequency, which is not too bad.
> Numbers would mostly be nice because I don't know if other targets have
> the thing that makes this such a huge win on x86 -- implicit concat with
> undef to form 2x-wide vectors.
> This may be an x86-specific win, in which case it should just be added as
> a target-specific combine.
> We are filling half of a vector (v8), and we have to choose between concat
> two v4 to v8 and then shuffle, or shuffle in v4 and then concat with undef
> to have a v8.
> I’m not sure I necessarily see the “target specific part” here, isn’t it a
> sort of canonicalization in the DAG?
I mean, I said "may". =]
It is possible that a target has very anemic support for v4 shuffles, but
excellent support for v8 shuffles, and it is always better to promote to v8
early, and shuffle there. Seems unlikely, but possible.
Anyways, I'm fine if we're confident all the targets want it to work that
way. Makes it easier. But we should check first, either by looking at their
generated code and making sure it's good, and/or by running benchmarks.
I'll let Quentin decide if there are specific benchmarks he things need to
be run before this goes in; I think the code as-is LGTM.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the llvm-commits