[PATCH] Lower _mm256_broadcastsi128_si256 to a vector shuffle

Andrea Di Biagio andrea.dibiagio at gmail.com
Tue Mar 3 03:46:30 PST 2015


Hi Juergen,

Your patch looks OK, but I think a better approach would be to
change the definition of _mm256_broadcastsi128_si256 in avx2intrin.h
to something like:

////
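 static __inline__ __m256i __attribute__((__always_inline__, __nodebug__))
 _mm256_broadcastsi128_si256(__m128i __X)
 {
   /* Indices 0, 1 pick the two 64-bit lanes of __X; repeating them
      fills both 128-bit halves of the result. */
   return (__m256i)__builtin_shufflevector(__X, __X, 0, 1, 0, 1);
 }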
////

With the definition changed in avx2intrin.h, you don't need to touch
'CGBuiltin.cpp' at all, and you can still get rid of the GCC builtin
in a later patch.
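
To spell out why the mask (0, 1, 0, 1) gives a broadcast, here is a rough
plain-C model of the shuffle semantics. It is only a sketch: the model_*
types and functions are made up for illustration and are not part of
avx2intrin.h or any other header, and it treats the vectors as arrays of
64-bit lanes rather than using real vector types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for __m128i / __m256i viewed as 64-bit lanes. */
typedef struct { int64_t lane[2]; } model_v2di;
typedef struct { int64_t lane[4]; } model_v4di;

/* Models a two-input shuffle: each index selects an element from the
 * concatenation of a and b (0..1 come from a, 2..3 come from b). */
static model_v4di model_shuffle(model_v2di a, model_v2di b, const int idx[4])
{
    model_v4di r;
    for (size_t i = 0; i < 4; ++i) {
        assert(idx[i] >= 0 && idx[i] < 4);
        r.lane[i] = idx[i] < 2 ? a.lane[idx[i]] : b.lane[idx[i] - 2];
    }
    return r;
}

/* Model of the suggested definition: both inputs are the same vector and
 * the mask (0, 1, 0, 1) copies its two lanes into each 128-bit half. */
static model_v4di model_broadcastsi128(model_v2di x)
{
    static const int mask[4] = {0, 1, 0, 1};
    return model_shuffle(x, x, mask);
}
```

Calling model_broadcastsi128 on a vector with lanes {11, 22} should give
lanes {11, 22, 11, 22}, i.e. the 128-bit input repeated in both halves.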

Thanks,
Andrea

On Mon, Mar 2, 2015 at 10:57 PM, Juergen Ributzka <juergen at apple.com> wrote:
> Hi @ll,
>
> this little patch lowers the AVX2 intrinsic _mm256_broadcastsi128_si256 (which calls __builtin_ia32_vbroadcastsi256) to a vector shuffle instead of another LLVM intrinsic (llvm.x86.avx2.vbroadcasti128).
>
> This change would allow LLVM to generate better code, and we could then remove the LLVM intrinsic, because it would no longer be used.
>
> Cheers,
> Juergen
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>



