[PATCH][X86] Add ISel patterns to select SSE3/AVX ADDSUB instructions.

Fri Jun 20 19:12:39 PDT 2014

Cool. Thanks!

> On Jun 20, 2014, at 6:39 PM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
> 
> Committed revision 211427.
> Thanks.
> 
> On Fri, Jun 20, 2014 at 10:34 PM, Andrea Di Biagio
> <andrea.dibiagio at gmail.com> wrote:
>> Hi Jim,
>> Thanks for the review!
>> 
>>> On Fri, Jun 20, 2014 at 9:35 PM, Jim Grosbach <grosbach at apple.com> wrote:
>>> This is excellent!
>>> 
>>> Can you add a brief comment to the patch explaining the i32 constants in the input patterns and how they map to the masks?
>>> 
>>> LGTM w/ that addition.
>> 
>> Good point. I will add extra comments to explain the meaning of each
>> constant used as a shuffle mask.
>> 
>> -Andrea
>> 
>>> 
>>> -Jim
>>> 
>>> 
>>>> On Jun 17, 2014, at 7:41 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> this patch adds ISel patterns to select SSE3/AVX ADDSUB instructions
>>>> from a sequence of "vadd + vsub + blend".
>>>> 
>>>> Example:
>>>> 
>>>> ///
>>>> typedef float float4 __attribute__((ext_vector_type(4)));
>>>> 
>>>> float4 foo(float4 A, float4 B) {
>>>>   float4 X = A - B;
>>>>   float4 Y = A + B;
>>>>   return (float4){X[0], Y[1], X[2], Y[3]};
>>>> }
>>>> ///
>>>> 
>>>> Before this patch (with flag -mcpu=corei7), we produced the following
>>>> assembly sequence:
>>>> movaps  %xmm0, %xmm2
>>>> addps   %xmm1, %xmm2
>>>> subps   %xmm1, %xmm0
>>>> blendps $10, %xmm2, %xmm0
>>>> 
>>>> 
>>>> With this patch, we now produce a single instruction:
>>>> addsubps  %xmm1, %xmm0
>>>> 
>>>> Please let me know if this is OK to commit.
>>>> 
>>>> Thanks,
>>>> Andrea Di Biagio.
>>>> <patch-addsub.diff>
>>>
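
For readers following the thread, the equivalence behind the example can be sketched with a scalar model in plain C. This is only an illustration, not code from the patch: the helper names (addsub4, blend4, add_sub_blend4) are hypothetical, and the lane semantics assumed here are the documented ones for ADDSUBPS (subtract in even lanes, add in odd lanes) and for BLENDPS (bit i of the immediate set means lane i comes from the source operand). With the immediate 10 (binary 1010) from the quoted assembly, lanes 1 and 3 are taken from the add result and lanes 0 and 2 from the sub result.

```c
#include <stddef.h>

/* Scalar model of ADDSUBPS (assumed semantics): even lanes subtract,
   odd lanes add. */
static void addsub4(const float a[4], const float b[4], float out[4]) {
  for (size_t i = 0; i < 4; ++i)
    out[i] = (i % 2 == 0) ? a[i] - b[i] : a[i] + b[i];
}

/* Scalar model of BLENDPS with an immediate mask (assumed semantics):
   if bit i of imm is set, lane i comes from src, otherwise from dst. */
static void blend4(const float dst[4], const float src[4], unsigned imm,
                   float out[4]) {
  for (size_t i = 0; i < 4; ++i)
    out[i] = ((imm >> i) & 1u) ? src[i] : dst[i];
}

/* The "vadd + vsub + blend" sequence from the example, using mask 10
   (binary 1010): lanes 1 and 3 from the sum, lanes 0 and 2 from the
   difference. */
static void add_sub_blend4(const float a[4], const float b[4], float out[4]) {
  float sum[4], diff[4];
  for (size_t i = 0; i < 4; ++i) {
    sum[i] = a[i] + b[i];
    diff[i] = a[i] - b[i];
  }
  blend4(diff, sum, 10u, out);
}
```

Under these assumptions, addsub4 and add_sub_blend4 return the same four lanes for any inputs, which is the property that lets the four-instruction sequence collapse into a single addsubps.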