[llvm-commits] Please review - Shuffle optimization for AVX2 (broadcast)

Wed Jun 27 01:35:18 PDT 2012

This is one more shuffle optimization; it applies to IR like
%b = shufflevector <8 x float> %a, <8 x float> undef, <8 x i32> zeroinitializer

which is widely used.
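
For readers less familiar with the pattern, here is a minimal sketch of what the shuffle above computes. It is a plain Python model, not LLVM code; the function name shufflevector and the UNDEF placeholder are only illustrative. An all-zero mask selects element 0 of the first operand for every result lane, which is exactly a broadcast.

```python
UNDEF = None  # illustrative stand-in for an undef operand


def shufflevector(a, b, mask):
    """Model of the IR instruction: each mask index selects from a followed by b."""
    combined = list(a) + list(b)
    return [combined[i] for i in mask]


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
# zeroinitializer mask: every lane reads index 0, so the second operand is never used
b = shufflevector(a, [UNDEF] * 8, [0] * 8)
print(b)
```

Since every lane reads the same source element, a single broadcast instruction can in principle replace a longer insert-and-permute sequence.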

Here is the benefit in the generated code:
Before:
        vinsertf128     $1, %xmm0, %ymm0, %ymm0
        vpermilps       $0, %ymm0, %ymm0 # ymm0 = ymm0[0,0,0,0,4,4,4,4]
After:
        vbroadcastss    %xmm0, %ymm0

----------------
Before:
        vpunpcklwd      %xmm0, %xmm0, %xmm0 # xmm0 = xmm0[0,0,1,1,2,2,3,3]
        vinserti128     $1, %xmm0, %ymm0, %ymm0
        vpermilps       $0, %ymm0, %ymm0 # ymm0 = ymm0[0,0,0,0,4,4,4,4]
After:
        vpbroadcastw    %xmm0, %ymm0
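
As a sanity check on the two before/after pairs, the sketch below models the instructions at lane level in Python and compares the results. It is an illustration under simplifying assumptions, not part of the patch: registers are lists with element 0 as the lowest lane, the helper names are made up for this example, and only the immediates used above ($1 for the insert, $0 for the permute) are modelled.

```python
def vinsert128_high(ymm, xmm):
    """vinsertf128/vinserti128 $1: replace the upper 128 bits of ymm with xmm."""
    half = len(ymm) // 2
    return ymm[:half] + list(xmm)


def vpermilps_imm0(dwords):
    """vpermilps $0 on 8 dwords: each dword takes dword 0 of its own 128-bit lane."""
    return [dwords[0]] * 4 + [dwords[4]] * 4


def vpunpcklwd_self(xmm_words):
    """vpunpcklwd %xmm0, %xmm0: interleave the low four words with themselves."""
    return [w for w in xmm_words[:4] for _ in range(2)]


def words_to_dwords(words):
    """View consecutive word pairs as single 32-bit elements."""
    return [(words[i], words[i + 1]) for i in range(0, len(words), 2)]


def dwords_to_words(dwords):
    return [w for pair in dwords for w in pair]


def float_before(ymm):
    # vinsertf128 $1 followed by vpermilps $0
    return vpermilps_imm0(vinsert128_high(ymm, ymm[:4]))


def float_after(ymm):
    # vbroadcastss %xmm0, %ymm0
    return [ymm[0]] * 8


def word_before(ymm_words):
    xmm = vpunpcklwd_self(ymm_words[:8])
    # the upper half is overwritten by the insert, so its value does not matter here
    ymm = vinsert128_high(xmm + ymm_words[8:], xmm)
    return dwords_to_words(vpermilps_imm0(words_to_dwords(ymm)))


def word_after(ymm_words):
    # vpbroadcastw %xmm0, %ymm0
    return [ymm_words[0]] * 16


floats = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]
words = list(range(10, 26))
print(float_before(floats) == float_after(floats))
print(word_before(words) == word_after(words))
```

In this model both sequences fill every lane with element 0 of the source, so the single broadcast instruction gives the same result as the longer sequence.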

Thank you.

- Elena

-------------- next part --------------
A non-text attachment was scrubbed...
Name: avx_opt3.diff
Type: application/octet-stream
Size: 4612 bytes
Desc: avx_opt3.diff
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120627/be4a8d07/attachment.obj>