[PATCH] [X86][SSE] Keep 4i32 vector insertions in integer domain on pre-SSE4.1 targets

Sun Dec 7 06:34:16 PST 2014

On Thu, Dec 4, 2014 at 7:28 PM, Simon Pilgrim <llvm-dev at redking.me.uk>
wrote:

> chandlerc wrote:
> > I think an even better pattern is: movq, pshufd 0,2,2,2?
> >
> > Also, do we correctly match to movd when the source is a foldable load?
> > I can't remember if there is a test case for that, but it's really
> > important not to do a shuffle when just loading a single i32 from memory
> > into an xmm register.
>
> Yup - that'd be a nicer pattern (single register!) - easy enough to change.
>

Looking at this today, I feel like I must be missing something... or I must
have really been missing something earlier.

Why don't we lower this as a pand with a constant mask? Loading the mask
constant isn't going to cost more in any real-world cases, right?
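
To make the comparison concrete, here is a rough sketch of the two sequences
using SSE2 intrinsics in C. It assumes the operation in question is "keep
element 0 of a 4 x i32 vector and zero the other three", which is what the
movq + pshufd 0,2,2,2 pattern above produces. The helper names are made up
for illustration and are not anything in the patch.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: movq followed by pshufd 0,2,2,2.
 * movq keeps the low 64 bits and zeroes the high 64 bits, so the
 * intermediate is [v0, v1, 0, 0]. The shuffle then picks element 0
 * for lane 0 and the (zero) element 2 for lanes 1-3. */
static __m128i keep_low_movq_pshufd(__m128i v) {
    __m128i lo64 = _mm_move_epi64(v);
    return _mm_shuffle_epi32(lo64, _MM_SHUFFLE(2, 2, 2, 0));
}

/* Hypothetical helper: a single pand with a constant mask.
 * _mm_set_epi32 takes lanes from highest to lowest, so only lane 0
 * of the mask is all ones. */
static __m128i keep_low_pand(__m128i v) {
    const __m128i mask = _mm_set_epi32(0, 0, 0, -1);
    return _mm_and_si128(v, mask);
}
```

Both helpers stay in the integer domain and return [v0, 0, 0, 0]. The pand
version is one instruction plus a constant that has to come from memory; the
movq version is two instructions and needs no constant.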

> There is an existing movd folded load pattern using VMOVDI2PDIrm - I
> haven't seen any tests for it but it does seem to work alright.

It'd be really nice to add tests for that.
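
For reference, the source-level shape of the case such a test would cover
looks roughly like the sketch below (SSE2 intrinsics in C; the function name
is hypothetical). The expectation under discussion is that the scalar load
folds into a single movd from memory with no shuffle afterwards, though what
a given compiler actually emits would need to be checked.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: load one i32 from memory into lane 0 of an xmm
 * register and zero lanes 1-3. _mm_cvtsi32_si128 corresponds to movd,
 * which already zeroes the upper lanes, so no shuffle should be needed. */
static __m128i load_low_i32(const int32_t *p) {
    return _mm_cvtsi32_si128(*p);
}
```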