[PATCH] Improve DAG combine pass on certain IR vector patterns
chandlerc at google.com
Fri Jan 16 15:12:14 PST 2015
On Fri, Jan 16, 2015 at 3:11 PM, Fiona Glaser <fglaser at apple.com> wrote:
> On Jan 16, 2015, at 3:04 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Fri, Jan 16, 2015 at 2:40 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>> Well, that may be the conclusion: The performance impact may be within
>> the noise.
>> Since these kinds of patterns are very specific, this is not surprising.
>> For the record, I tend to ignore the tests that run for less than 1
>> second (too noisy). Then, the noise level is usually around 1% on a quiet
>> computer with fixed frequency, which is not too bad.
> Numbers would mostly be nice because I don't know if other targets have
> the thing that makes this such a huge win on x86 -- implicit concat with
> undef to form 2x-wide vectors.
> This may be an x86-specific win, in which case it should just be added as
> a target-specific combine.
> Isn’t that typical of SIMD architectures in general? That is, if an arch
> supports both N and 2N vector sizes, an operation on size-N vectors
> typically clears the top half, right? Or on armv7-like architectures you
> can modify d0 and then address q0, right? I’m not super familiar with any
> architectures other than ARMv7 NEON and SSE/AVX that support multiple
> native sizes though, so correct me if I’m wrong!
> I guess the worst case would be something like this:
> old pseudocode:
> concat ymm2, xmm0, xmm1
> shuffle ymm3, ymm2
> new pseudocode:
> shuffle xmm2, xmm0, xmm1
> concat ymm3, xmm2
> If the implicit concat isn’t there, and the architecture has no benefit to
> using smaller shuffles, and the architecture has no two-source shuffle for
> reasonable element sizes, I guess it could end up with an extra op?
I mean, I agree with everything you say, but I know next to nothing about
either ARM or PPC's SIMD behavior, so I try not to assume. ;]
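
For readers following the old/new pseudocode quoted above, here is a rough lane-level model of the two forms. It is only a sketch, not the patch itself: the names UNDEF, concat, shuffle, old_form and new_form are made up for illustration, vectors are plain Python lists, and it assumes the rewrite applies only when the upper half of the wide shuffle mask is undef (which is what "concat with undef" suggests).

```python
UNDEF = None  # hypothetical stand-in for an undefined lane


def concat(lo, hi):
    """Join two N-wide vectors into one 2N-wide vector."""
    return list(lo) + list(hi)


def shuffle(a, b, mask):
    """Two-source shuffle: mask index i selects from a followed by b.

    UNDEF mask entries produce UNDEF lanes.
    """
    src = list(a) + list(b)
    return [UNDEF if m is UNDEF else src[m] for m in mask]


def old_form(x, y, mask):
    # Old shape: concat the narrow inputs, then one wide single-source shuffle.
    wide = concat(x, y)
    return shuffle(wide, [UNDEF] * len(wide), mask)


def new_form(x, y, mask):
    # New shape: narrow two-source shuffle, then concat with undef.
    n = len(x)
    # Assumed precondition: the upper half of the wide mask is all undef.
    assert all(m is UNDEF for m in mask[n:])
    narrow = shuffle(x, y, mask[:n])
    return concat(narrow, [UNDEF] * n)


x = [10, 11, 12, 13]
y = [20, 21, 22, 23]
mask = [0, 5, 2, 7, UNDEF, UNDEF, UNDEF, UNDEF]

# Both shapes compute the same lanes under the assumed precondition.
assert old_form(x, y, mask) == new_form(x, y, mask)
```

On a target where writing the narrow register implicitly leaves the upper half undefined, the final concat in new_form costs nothing. Where that is not the case and there is no two-source narrow shuffle, the new shape may need the extra op mentioned in the quoted message.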
More information about the llvm-commits