[PATCH] [X86][SSE] Keep 4i32 vector insertions in integer domain on SSE4.1 targets
Simon Pilgrim
llvm-dev at redking.me.uk
Sun Nov 30 13:37:19 PST 2014
Hi chandlerc, andreadb,
4i32 shuffles that insert a single element into a zero vector lower to X86vzmovl, which was using (v)blendps and so causing domain switch stalls. This patch fixes this by using (v)pblendw instead.
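For illustration only (this is not part of the patch), the equivalence between the two blends can be sketched with a lane-level model in Python. The helper names (`blendps`, `pblendw`, `widen_mask`) are hypothetical, and the mask widening is an assumption about how a 4 x 32-bit blend immediate maps onto pblendw's 8 x 16-bit immediate: each 32-bit lane bit becomes two adjacent 16-bit lane bits.

```python
def blendps(dst, src, imm):
    """Model BLENDPS on 4 x 32-bit lanes: bit i of imm picks src lane i."""
    return [src[i] if (imm >> i) & 1 else dst[i] for i in range(4)]

def split16(v):
    """Split 4 x 32-bit lanes into 8 x 16-bit lanes (low half first)."""
    out = []
    for x in v:
        out += [x & 0xFFFF, (x >> 16) & 0xFFFF]
    return out

def join16(w):
    """Rejoin 8 x 16-bit lanes into 4 x 32-bit lanes."""
    return [w[2 * i] | (w[2 * i + 1] << 16) for i in range(4)]

def pblendw(dst, src, imm):
    """Model PBLENDW on 8 x 16-bit lanes: bit i of imm picks src lane i."""
    d, s = split16(dst), split16(src)
    return join16([s[i] if (imm >> i) & 1 else d[i] for i in range(8)])

def widen_mask(imm4):
    """Assumed mapping: one 32-bit lane bit -> two adjacent 16-bit lane bits."""
    imm8 = 0
    for i in range(4):
        if (imm4 >> i) & 1:
            imm8 |= 0b11 << (2 * i)
    return imm8

zero = [0, 0, 0, 0]
v = [0x11223344, 0x55667788, 0x99AABBCC, 0xDDEEFF00]

# Insert element 0 of v into a zero vector, i.e. the X86vzmovl result.
via_blendps = blendps(zero, v, 0b0001)
via_pblendw = pblendw(zero, v, widen_mask(0b0001))
# Both give [v[0], 0, 0, 0]; only the execution domain differs on hardware.
```

Under this model the two forms produce the same bits for every mask, which is why the swap should be safe for integer vectors.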
The updated tests in test/CodeGen/X86/sse41.ll still contain a domain stall due to the use of insertps - I'm looking at fixing this in a future patch.
Pre-SSE4.1 targets are still affected by a similar domain stall caused by movss. We could fix this by using two ( punpckldq XMM, zero ) instructions in series - if people agree I'll make a patch for this as well.
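As a rough sketch of why two punpckldq instructions against a zero register would give the same result as the zero-extending move, here is a lane-level model in Python. The function names are hypothetical, and this only models the data movement, not the eventual lowering.

```python
def punpckldq(dst, src):
    """Model PUNPCKLDQ: interleave the low two 32-bit lanes of dst and src."""
    return [dst[0], src[0], dst[1], src[1]]

def vzmovl_via_unpack(v):
    """Keep lane 0 of v and zero the rest using two unpacks with zero."""
    zero = [0, 0, 0, 0]
    step1 = punpckldq(v, zero)      # [v0, 0, v1, 0]
    return punpckldq(step1, zero)   # [v0, 0, 0, 0]

result = vzmovl_via_unpack([0x11223344, 0x55667788, 0x99AABBCC, 0xDDEEFF00])
```

The first unpack pushes v1 up to lane 2, and the second unpack discards it, so only v0 survives in lane 0 while every instruction stays in the integer domain.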
http://reviews.llvm.org/D6458
Files:
lib/Target/X86/X86InstrSSE.td
test/CodeGen/X86/combine-and.ll
test/CodeGen/X86/combine-or.ll
test/CodeGen/X86/sse41.ll
test/CodeGen/X86/vector-shuffle-128-v4.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D6458.16753.patch
Type: text/x-patch
Size: 6691 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141130/bbf6d83a/attachment.bin>