[PATCH][X86] Add ISel patterns to select SSE3/AVX ADDSUB instructions.

Andrea Di Biagio andrea.dibiagio at gmail.com
Fri Jun 20 14:34:39 PDT 2014


Hi Jim,
Thanks for the review!

On Fri, Jun 20, 2014 at 9:35 PM, Jim Grosbach <grosbach at apple.com> wrote:
> This is excellent!
>
> Can you add a brief comment to the patch explaining the i32 constants in the input patterns and how they map to the masks?
>
> LGTM w/ that addition.

Good point. I will add extra comments to explain the meaning of each
constant used as a shuffle mask.
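
As a rough illustration of how a blend immediate maps to lanes, here is a
scalar C sketch of the "blendps $10" case from the example quoted below.
The helper names (blend4, addsub4, check_example) are made up for this
note and are not part of the patch, and the i32 constants in the actual
patterns are not necessarily the same value. The sketch assumes the usual
imm8 convention: bit i set means lane i is taken from the source operand.

```c
#include <assert.h>

/* Scalar model of a 4-lane blend (assumed to mirror BLENDPS imm8 semantics):
   if bit i of imm is set, lane i is taken from src; otherwise dst keeps it. */
void blend4(float dst[4], const float src[4], unsigned imm) {
    for (int i = 0; i < 4; ++i)
        if (imm & (1u << i))
            dst[i] = src[i];
}

/* Scalar model of ADDSUBPS: subtract in even lanes, add in odd lanes. */
void addsub4(float out[4], const float a[4], const float b[4]) {
    for (int i = 0; i < 4; ++i)
        out[i] = (i & 1) ? a[i] + b[i] : a[i] - b[i];
}

/* Hypothetical self-check: sub + add + blend with mask 10 (binary 1010)
   gives the same lanes as the addsub model. */
void check_example(void) {
    const float a[4] = {1, 2, 3, 4};
    const float b[4] = {10, 20, 30, 40};
    float x[4], y[4], ref[4];

    for (int i = 0; i < 4; ++i) {
        x[i] = a[i] - b[i];   /* X = A - B */
        y[i] = a[i] + b[i];   /* Y = A + B */
    }

    blend4(x, y, 10);         /* lanes 1 and 3 come from Y */
    addsub4(ref, a, b);

    for (int i = 0; i < 4; ++i)
        assert(x[i] == ref[i]);
}
```

Calling check_example() from a small main() is enough to exercise it. The
point is only that mask 10 has bits 1 and 3 set, so the odd lanes come
from the add result and the even lanes from the sub result.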

-Andrea

>
> -Jim
>
>
>> On Jun 17, 2014, at 7:41 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>>
>> Hi,
>>
>> This patch adds ISel patterns to select SSE3/AVX ADDSUB instructions
>> from a "vadd + vsub + blend" sequence.
>>
>> Example:
>>
>> ///
>> typedef float float4 __attribute__((ext_vector_type(4)));
>>
>> float4 foo(float4 A, float4 B) {
>>  float4 X = A - B;
>>  float4 Y = A + B;
>>  return (float4){X[0], Y[1], X[2], Y[3]};
>> }
>> ///
>>
>> Before this patch (with flag -mcpu=corei7), we produced the following
>> assembly sequence:
>>  movaps  %xmm0, %xmm2
>>  addps   %xmm1, %xmm2
>>  subps   %xmm1, %xmm0
>>  blendps $10, %xmm2, %xmm0
>>
>>
>> With this patch, we now produce a single instruction:
>>  addsubps  %xmm1, %xmm0
>>
>> Please let me know if this is OK to commit.
>>
>> Thanks,
>> Andrea Di Biagio.
>> <patch-addsub.diff>
>


