[llvm-commits] Please review - Shuffle optimization for AVX2 (broadcast)
Demikhovsky, Elena
elena.demikhovsky at intel.com
Wed Jun 27 01:35:18 PDT 2012
This is one more shuffle optimization. It applies to IR such as
%b = shufflevector <8 x float> %a, <8 x float> undef, <8 x i32> zeroinitializer
which is widely used.
The generated code improves as follows:
Before:
vinsertf128 $1, %xmm0, %ymm0, %ymm0
vpermilps $0, %ymm0, %ymm0 # ymm0 = ymm0[0,0,0,0,4,4,4,4]
After:
vbroadcastss %xmm0, %ymm0
----------------
Before:
vpunpcklwd %xmm0, %xmm0, %xmm0 # xmm0 = xmm0[0,0,1,1,2,2,3,3]
vinserti128 $1, %xmm0, %ymm0, %ymm0
vpermilps $0, %ymm0, %ymm0 # ymm0 = ymm0[0,0,0,0,4,4,4,4]
After:
vpbroadcastw %xmm0, %ymm0
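
To see why the two sequences in the float case are equivalent, here is a minimal sketch that models the lanes as Python lists. It is only an illustration, not part of the patch: the helper names (shufflevector, vinsertf128_hi, vpermilps, vbroadcastss) are hypothetical stand-ins that mimic the instruction semantics for 8 x float.

```python
# Illustrative model only: a 256-bit vector of floats is a list of 8 values.
# All helper names below are hypothetical; they are not LLVM or intrinsic APIs.

def shufflevector(a, b, mask):
    """Model LLVM's shufflevector: mask indexes into the concatenation a + b."""
    both = a + b
    return [both[i] for i in mask]

def vinsertf128_hi(src, xmm):
    """Model 'vinsertf128 $1': replace the upper 128-bit half of src with xmm."""
    return src[:4] + xmm[:4]

def vpermilps(src, imm):
    """Model 'vpermilps $imm': apply the same 4-element permute in each 128-bit half."""
    sel = [(imm >> (2 * i)) & 3 for i in range(4)]
    return [src[base + s] for base in (0, 4) for s in sel]

def vbroadcastss(xmm):
    """Model the AVX2 register form of vbroadcastss: splat element 0 to all 8 lanes."""
    return [xmm[0]] * 8

a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
undef = [None] * 8

# The IR pattern: an all-zero mask selects element 0 for every result lane.
splat = shufflevector(a, undef, [0] * 8)

# "Before": copy the low half into the high half, then permute within each half.
before = vpermilps(vinsertf128_hi(a, a[:4]), 0)

# "After": a single broadcast from the low element.
after = vbroadcastss(a[:4])

print(splat == before == after)
```

In this model the zero mask, the old two-instruction sequence, and the single broadcast all produce the same vector, which is why the lowering can pick the one-instruction form when AVX2 is available.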
Thank you.
- Elena
-------------- next part --------------
A non-text attachment was scrubbed...
Name: avx_opt3.diff
Type: application/octet-stream
Size: 4612 bytes
Desc: avx_opt3.diff
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120627/be4a8d07/attachment.obj>