[PATCH] [X86][SSE] Keep 4i32 vector insertions in integer domain on pre-SSE4.1 targets

Thu Dec 4 15:53:40 PST 2014

Fixed typo in shuffle immediate.

Regarding the cost/benefit of domain switches and data bypass delays, Chandler has covered it quite thoroughly. I'd like to add three points. First, this is for pre-SSE4.1 targets: older hardware that tends to see greater losses in these situations. Second, these shuffles are often performed between operations that must be done in the integer domain, so a float shuffle incurs a domain switch both before and after it. Third, this particular lowering can now be done with a single register (my personal favourite, as I'm on a never-ending crusade to get rid of spills!). On top of all that, staying within a domain is what both the Intel and AMD optimization guides (old and new) recommend, and Agner goes on about it at some length in his docs too.
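
For illustration only, here is a rough C sketch (SSE2 intrinsics) of one way to keep the low i32 element of a vector and zero the other three while staying in the integer domain and using a single register. It assumes a MOVQ-style zero-extending move followed by a PSHUFD. The function name and the exact instruction choice are assumptions made for this sketch, not a quote from the patch, and a compiler is free to pick different instructions for these intrinsics.

```c
#include <emmintrin.h> /* SSE2 intrinsics */

/* Hypothetical helper: returns [v0, 0, 0, 0] without leaving the integer domain. */
static __m128i keep_low_i32_zero_rest(__m128i v)
{
    /* MOVQ-style move: keeps the low 64 bits, zeroes the high 64 bits,
       giving [v0, v1, 0, 0]. */
    __m128i t = _mm_move_epi64(v);

    /* PSHUFD with immediate 0xA8: result lane 0 takes t[0]; lanes 1..3
       take t[2], which is known to be zero. */
    return _mm_shuffle_epi32(t, _MM_SHUFFLE(2, 2, 2, 0));
}
```

The float-domain alternative on SSE2 would typically need a second, zeroed register to merge with (e.g. XORPS + MOVSS), which is the extra register pressure and domain crossing the paragraph above argues against.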

http://reviews.llvm.org/D6526

Files:
  lib/Target/X86/X86InstrSSE.td
  test/CodeGen/X86/uint_to_fp-2.ll
  test/CodeGen/X86/vector-shuffle-128-v4.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D6526.16959.patch
Type: text/x-patch
Size: 5393 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141204/9cc2732e/attachment.bin>