[PATCH] [X86][SSE] Avoid shuffles with zero by using pshufb to create zeros
Simon Pilgrim
llvm-dev at redking.me.uk
Fri Jan 9 14:15:49 PST 2015
Thanks Quentin.
A basic timing test of the core loop, comparing a single pshufb against 2 x pshufb + por, showed a 30% improvement on my older Core2Duo machine (I guess due to throughput limitations), but this dropped to less than 5% on SandyBridge. The main benefit, however, is the reduced register pressure, along with the obvious fact that the old code was pointlessly shuffling zero vectors.
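For anyone skimming the thread, here is a rough illustration of why the second pshufb and the por are redundant when one shuffle input is all zeros. It is only a scalar model of the instruction in plain C, not the code from the patch: pshufb_model, the two shuffle_with_zero_* helpers and the idx encoding (0-15 picks a source byte, 16-31 picks a byte of the zero vector) are names and conventions made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of 128-bit PSHUFB: a mask byte with bit 7 set yields zero,
   otherwise its low 4 bits select a source byte. out must not alias src. */
static void pshufb_model(const uint8_t src[16], const uint8_t mask[16],
                         uint8_t out[16]) {
    for (int i = 0; i < 16; ++i)
        out[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
}

/* Old sequence (hypothetical helper): shuffle src and the zero vector
   separately, then OR the two results together (2 x pshufb + por). */
static void shuffle_with_zero_two_pshufb(const uint8_t src[16],
                                         const uint8_t idx[16],
                                         uint8_t out[16]) {
    const uint8_t zero[16] = {0};
    uint8_t m_src[16], m_zero[16], a[16], b[16];
    for (int i = 0; i < 16; ++i) {
        assert(idx[i] < 32); /* 0-15: src byte, 16-31: zero-vector byte */
        m_src[i]  = (uint8_t)(idx[i] < 16 ? idx[i] : 0x80);
        m_zero[i] = (uint8_t)(idx[i] >= 16 ? idx[i] - 16 : 0x80);
    }
    pshufb_model(src, m_src, a);
    pshufb_model(zero, m_zero, b); /* always all zeros */
    for (int i = 0; i < 16; ++i)
        out[i] = a[i] | b[i];
}

/* New sequence (hypothetical helper): one pshufb on src only. Lanes that
   wanted a zero byte get 0x80 in the mask, so pshufb zeroes them itself. */
static void shuffle_with_zero_one_pshufb(const uint8_t src[16],
                                         const uint8_t idx[16],
                                         uint8_t out[16]) {
    uint8_t m_src[16];
    for (int i = 0; i < 16; ++i) {
        assert(idx[i] < 32);
        m_src[i] = (uint8_t)(idx[i] < 16 ? idx[i] : 0x80);
    }
    pshufb_model(src, m_src, out);
}
```

Since the shuffle of the zero vector can only ever produce zeros, OR-ing it in changes nothing, and the single-pshufb form needs neither the zero register nor the second mask.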
REPOSITORY
rL LLVM
http://reviews.llvm.org/D6878