[PATCH] Lower _mm256_broadcastsi128_si256 to a vector shuffle

Tue Mar 3 09:05:39 PST 2015

Hi Andrea,

That approach is great. I will do that instead.

Thanks

Cheers,
Juergen
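
P.S. For readers of the archive, the shuffle suggested in the quoted message below can be tried outside of avx2intrin.h. This is only a rough sketch under a few assumptions: the names v2di, v4di and broadcast128 are made up for illustration, the generic vector_size types stand in for __m128i and __m256i so that no AVX2 flags are needed, and the element-wise fallback is there only for compilers that lack __builtin_shufflevector. It is not the code from the patch.

```c
#include <string.h>

/* Generic 128-bit and 256-bit vectors of 64-bit lanes (GCC/Clang vector
   extension). Illustrative stand-ins for __m128i and __m256i. */
typedef long long v2di __attribute__((vector_size(16)));
typedef long long v4di __attribute__((vector_size(32)));

#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Hypothetical helper: copies the two 64-bit lanes of `in` into both
   128-bit halves of `out`, which is what a 128-bit broadcast does. */
static void broadcast128(const long long in[2], long long out[4])
{
    v2di x;
    v4di r;

    memcpy(&x, in, sizeof x);
#if __has_builtin(__builtin_shufflevector)
    /* Indices 0 and 1 pick the lanes of the first operand; repeating
       them fills the upper half of the result with the same data. */
    r = __builtin_shufflevector(x, x, 0, 1, 0, 1);
#else
    /* Fallback (an assumption of this sketch): same lane selection,
       written element by element. */
    r = (v4di){ x[0], x[1], x[0], x[1] };
#endif
    memcpy(out, &r, sizeof r);
}
```

Calling broadcast128 with a two-element input array and a four-element output array should leave the input pair in both halves of the output. Because the operation is expressed as a plain shuffle, the optimizer is free to treat it like any other shuffle, which is the motivation given in the thread.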

> On Mar 3, 2015, at 3:46 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
> 
> Hi Juergen,
> 
> Your patch looks OK but I think that a better approach would be to
> change the definition of _mm256_broadcastsi128_si256 in avx2intrin.h
> to something like:
> 
> ////
> static __inline__ __m256i __attribute__((__always_inline__, __nodebug__))
> _mm256_broadcastsi128_si256(__m128i __X)
> {
>  return (__m256i) __builtin_shufflevector( __X, __X, 0, 1, 0, 1);
> }
> ////
> 
> If you change the definition of '_mm256_broadcastsi128_si256' in
> avx2intrin.h, then you don't need to change 'CGBuiltin.cpp'.
> This would still allow you to get rid of the gcc builtin in a later patch.
> 
> Thanks,
> Andrea
> 
> On Mon, Mar 2, 2015 at 10:57 PM, Juergen Ributzka <juergen at apple.com> wrote:
>> Hi @ll,
>> 
>> this little patch lowers the AVX2 intrinsic _mm256_broadcastsi128_si256 (which calls __builtin_ia32_vbroadcastsi256) to a vector shuffle instead of another LLVM intrinsic (llvm.x86.avx2.vbroadcasti128).
>> 
>> This change would allow LLVM to generate better code, and we could then remove the LLVM intrinsic because it would no longer be used.
>> 
>> Cheers,
>> Juergen
>> 
>> _______________________________________________
>> cfe-commits mailing list
>> cfe-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>