[PATCH][X86] Add ISel patterns to select SSE3/AVX ADDSUB instructions.
Jim Grosbach
grosbach at apple.com
Fri Jun 20 19:12:39 PDT 2014
Cool. Thanks!
> On Jun 20, 2014, at 6:39 PM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>
> Committed revision 211427.
> Thanks.
>
> On Fri, Jun 20, 2014 at 10:34 PM, Andrea Di Biagio
> <andrea.dibiagio at gmail.com> wrote:
>> Hi Jim,
>> Thanks for the review!
>>
>>> On Fri, Jun 20, 2014 at 9:35 PM, Jim Grosbach <grosbach at apple.com> wrote:
>>> This is excellent!
>>>
>>> Can you add a brief comment to the patch explaining the i32 constants in the input patterns and how they map to the masks?
>>>
>>> LGTM w/ that addition.
>>
>> Good point. I will add extra comments to explain the meaning of each
>> constant used as a shuffle mask.
>>
>> -Andrea
>>
>>>
>>> -Jim
>>>
>>>
>>>> On Jun 17, 2014, at 7:41 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> this patch adds ISel patterns to select SSE3/AVX ADDSUB instructions
>>>> from a sequence of "vadd + vsub + blend".
>>>>
>>>> Example:
>>>>
>>>> ///
>>>> typedef float float4 __attribute__((ext_vector_type(4)));
>>>>
>>>> float4 foo(float4 A, float4 B) {
>>>>   float4 X = A - B;
>>>>   float4 Y = A + B;
>>>>   return (float4){X[0], Y[1], X[2], Y[3]};
>>>> }
>>>> ///
>>>>
>>>> Before this patch (with flag -mcpu=corei7), we produced the following
>>>> assembly sequence:
>>>> movaps %xmm0, %xmm2
>>>> addps %xmm1, %xmm2
>>>> subps %xmm1, %xmm0
>>>> blendps $10, %xmm2, %xmm0
>>>>
>>>>
>>>> With this patch, we now produce a single instruction:
>>>> addsubps %xmm1, %xmm0
>>>>
>>>> Please let me know if ok to commit.
>>>>
>>>> Thanks,
>>>> Andrea Di Biagio.
>>>> <patch-addsub.diff>
>>>
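For readers following the example in the quoted message, here is a minimal
sketch in plain C of why the "vadd + vsub + blend" sequence is equivalent to a
single ADDSUBPS. It assumes the documented semantics of the two instructions:
blendps takes lane i from its source operand when bit i of the immediate is
set, and addsubps subtracts in even lanes and adds in odd lanes. The type vec4
and all helper names are hypothetical scalar models for illustration, not code
from the patch.

```c
#include <assert.h>

/* Hypothetical scalar model of a 4 x float vector register. */
typedef struct { float lane[4]; } vec4;

/* Models blendps: if bit i of imm is set, lane i is taken from src;
 * otherwise lane i of dst is kept. Only the low 4 bits are meaningful. */
static vec4 blend_ps(vec4 dst, vec4 src, unsigned imm) {
    assert(imm < 16u);
    for (int i = 0; i < 4; ++i)
        if (imm & (1u << i))
            dst.lane[i] = src.lane[i];
    return dst;
}

/* Models addsubps: subtract in even lanes, add in odd lanes. */
static vec4 addsub_ps(vec4 a, vec4 b) {
    vec4 r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = (i & 1) ? a.lane[i] + b.lane[i]
                            : a.lane[i] - b.lane[i];
    return r;
}

/* Models the pre-patch sequence: addps, subps, then blendps $10. */
static vec4 add_sub_blend(vec4 a, vec4 b) {
    vec4 sum, diff;
    for (int i = 0; i < 4; ++i) {
        sum.lane[i]  = a.lane[i] + b.lane[i];
        diff.lane[i] = a.lane[i] - b.lane[i];
    }
    /* 10 == 0b1010: lanes 1 and 3 from sum, lanes 0 and 2 from diff. */
    return blend_ps(diff, sum, 10u);
}

/* Returns 1 if all four lanes compare equal, 0 otherwise. */
static int same_lanes(vec4 x, vec4 y) {
    for (int i = 0; i < 4; ++i)
        if (x.lane[i] != y.lane[i])
            return 0;
    return 1;
}
```

With immediate 10 (binary 1010), lanes 1 and 3 come from the sum and lanes 0
and 2 from the difference, which is the same lane pattern ADDSUBPS computes.
For example, with a = {1, 2, 3, 4} and b = {0.5, 0.25, 8, 16}, both
addsub_ps(a, b) and add_sub_blend(a, b) give {0.5, 2.25, -5, 20} in this model.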