[PATCH] Inefficient code generation for 128-bit->256-bit typecast intrinsics (BZ #15712)

Thu Jul 18 18:11:09 PDT 2013

Would __builtin_shufflevector(__a, __a, 0, 1, -1, -1) work?

On Thu, Jul 18, 2013 at 5:42 PM, Chandler Carruth <chandlerc at google.com> wrote:

>
> On Thu, Jul 18, 2013 at 5:32 PM, Katya Romanova <
> Katya_Romanova at playstation.sony.com> wrote:
>
>> -  __m128d __zero = _mm_setzero_pd();
>> -  return __builtin_shufflevector(__a, __zero, 0, 1, 2, 2);
>> +  return (__m256d)__builtin_ia32_pd256_pd((__v2df)__a);
>>
>
> I think this is the wrong approach.
>
> Rather than switching these to use an x86-specific builtin, it would be
> better to provide some generic form to produce an undef input to a
> shufflevector. That is a generally useful and completely target-independent
> concept.
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>
>

-- 
~Craig