[llvm-commits] Please review - Shuffle optimization for AVX2 (broadcast)

Demikhovsky, Elena elena.demikhovsky at intel.com
Wed Jun 27 01:35:18 PDT 2012


This is one more shuffle optimization. It applies to IR such as
%b = shufflevector <8 x float> %a, <8 x float> undef, <8 x i32> zeroinitializer

which is a widely used pattern: a splat of element 0 across the whole vector.
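
For readers less familiar with shufflevector, here is a minimal sketch of why an all-zero mask is a broadcast. It is a plain Python model of the mask semantics, not LLVM code; the helper name shufflevector_model and the sample values are assumptions made for illustration.

```python
def shufflevector_model(a, b, mask):
    """Toy model of shufflevector: each mask index selects from a + b.

    Passing b=None stands for an undef second operand; selecting from it
    is rejected here because the result would be undefined.
    """
    n = len(a)
    out = []
    for idx in mask:
        if idx < n:
            out.append(a[idx])          # index refers to the first operand
        elif b is None:
            raise ValueError("mask selects from an undef operand")
        else:
            out.append(b[idx - n])      # index refers to the second operand
    return out


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # stands for <8 x float> %a
zero_mask = [0] * 8                              # <8 x i32> zeroinitializer
splat = shufflevector_model(a, None, zero_mask)  # stands for %b
```

Every result lane reads element 0 of %a, which is the pattern a single broadcast instruction can implement.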

And this is the benefit:
Before:
        vinsertf128     $1, %xmm0, %ymm0, %ymm0
        vpermilps       $0, %ymm0, %ymm0 # ymm0 = ymm0[0,0,0,0,4,4,4,4]
After:
        vbroadcastss    %xmm0, %ymm0

----------------
Before:
        vpunpcklwd      %xmm0, %xmm0, %xmm0 # xmm0 = xmm0[0,0,1,1,2,2,3,3]
        vinserti128     $1, %xmm0, %ymm0, %ymm0
        vpermilps       $0, %ymm0, %ymm0 # ymm0 = ymm0[0,0,0,0,4,4,4,4]
After:
        vpbroadcastw    %xmm0, %ymm0
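
The equivalence of the old and new sequences can be checked with a small lane-level model. This is only a sketch in Python under simplifying assumptions: registers are lists of elements, the helper names are hypothetical, and only the immediates used above ($1 for the insert, $0 for vpermilps) are modeled.

```python
def vinsert128_hi(ymm, xmm):
    """vinsertf128/vinserti128 with imm8 = 1: keep the low half, replace the high half."""
    half = len(ymm) // 2
    return list(ymm[:half]) + list(xmm)


def vpermilps_imm0(dwords):
    """vpermilps with imm8 = 0: every dword takes dword 0 of its own 128-bit lane."""
    out = []
    for lane_start in (0, 4):
        out += [dwords[lane_start]] * 4
    return out


def vpunpcklwd_same(words):
    """vpunpcklwd with both sources equal: interleave the low four words with themselves."""
    out = []
    for w in words[:4]:
        out += [w, w]
    return out


# Case 1: <8 x float>, eight dword lanes per ymm register.
ymm_ps = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
old_ps = vpermilps_imm0(vinsert128_hi(ymm_ps, ymm_ps[:4]))
new_ps = [ymm_ps[0]] * 8                      # vbroadcastss

# Case 2: 16-bit elements, sixteen word lanes per ymm register.
xmm_w = [10, 11, 12, 13, 14, 15, 16, 17]
unpacked = vpunpcklwd_same(xmm_w)             # [10, 10, 11, 11, ...]
ymm_w = vinsert128_hi(unpacked + [0] * 8, unpacked)
# vpermilps works on dwords, so group the words into pairs first.
pairs = [(ymm_w[i], ymm_w[i + 1]) for i in range(0, 16, 2)]
old_w = [w for pair in vpermilps_imm0(pairs) for w in pair]
new_w = [xmm_w[0]] * 16                       # vpbroadcastw
```

In both cases the multi-instruction sequence and the single broadcast produce the same register contents, so the replacement only removes instructions.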


Thank you.

- Elena


-------------- next part --------------
A non-text attachment was scrubbed...
Name: avx_opt3.diff
Type: application/octet-stream
Size: 4612 bytes
Desc: avx_opt3.diff
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120627/be4a8d07/attachment.obj>

