[PATCH] [X86][SSE] Keep 4i32 vector insertions in integer domain on pre-SSE4.1 targets

Simon Pilgrim llvm-dev at redking.me.uk
Sun Dec 7 06:45:47 PST 2014


So, I've done some simple loop-timing tests (with paddd instructions before and after the shuffle code to ensure we stay in the integer domain) on the following older CPUs:

Intel Core 2 Duo 1.83 GHz (T5600) Merom 
Intel Core 2 Duo 3.06 GHz (E7600) Wolfdale
Pentium M 1.60 GHz Deron

After all that, I'm seeing no discernible difference in performance between the two implementations - the movss version can be made faster if we don't have to generate the zero (e.g. if it's already generated and this is the last use of the register), but that is it.

With that in mind, I'm recommending that we go ahead with this patch, primarily for the lower register usage and because it matches the general rule of avoiding domain swaps - but don't expect any big improvement on old hardware!
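For anyone wanting to try a similar comparison, here is a rough sketch of the kind of loop body described above, written with SSE2 intrinsics. It is not the actual test harness. The helper names (keep_low_movss, keep_low_int, run_loop) are made up for illustration, and the integer-domain variant uses a pand against a constant mask purely as a stand-in; it is an assumption, not necessarily the sequence the patch emits.

```c
#include <stdint.h>
#include <xmmintrin.h>  /* SSE:  _mm_move_ss, _mm_setzero_ps */
#include <emmintrin.h>  /* SSE2: integer ops and int/float casts */

/* Float-domain variant: materialise a zero vector, then merge in the low
   lane with a movss-style move. Result lanes: [v0, 0, 0, 0]. */
static __m128i keep_low_movss(__m128i v)
{
    __m128 zero = _mm_setzero_ps();
    __m128 low  = _mm_move_ss(zero, _mm_castsi128_ps(v));
    return _mm_castps_si128(low);
}

/* Integer-domain stand-in (assumption, see the note above): clear the
   upper three lanes with pand. Result lanes: [v0, 0, 0, 0]. */
static __m128i keep_low_int(__m128i v)
{
    const __m128i mask = _mm_set_epi32(0, 0, 0, -1);
    return _mm_and_si128(v, mask);
}

/* Hypothetical loop: paddd, the operation under test, paddd again, so the
   surrounding work stays in the integer domain. Wrap calls to this in a
   timer of your choice to compare the two variants. */
static void run_loop(int use_int_domain, uint32_t iters, int32_t out[4])
{
    __m128i acc = _mm_set_epi32(4, 3, 2, 1);  /* lanes: [1, 2, 3, 4] */
    const __m128i inc = _mm_set1_epi32(1);

    for (uint32_t i = 0; i < iters; ++i) {
        acc = _mm_add_epi32(acc, inc);
        acc = use_int_domain ? keep_low_int(acc) : keep_low_movss(acc);
        acc = _mm_add_epi32(acc, inc);
    }
    _mm_storeu_si128((__m128i *)out, acc);
}
```

Both variants should produce identical results; only the instruction domain differs. Compilers are free to rewrite intrinsics, so check the disassembly to confirm the movss and pand forms actually survive before trusting any timings.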

http://reviews.llvm.org/D6526





