[PATCH][X86] Add ISel patterns to select SSE3/AVX ADDSUB instructions.
Jim Grosbach
grosbach at apple.com
Fri Jun 20 13:35:23 PDT 2014
This is excellent!
Can you add a brief comment to the patch explaining the i32 constants in the input patterns and how they map to the masks?
LGTM w/ that addition.
-Jim
> On Jun 17, 2014, at 7:41 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>
> Hi,
>
> this patch adds ISel patterns to select SSE3/AVX ADDSUB instructions
> from a sequence of "vadd + vsub + blend".
>
> Example:
>
> ///
> typedef float float4 __attribute__((ext_vector_type(4)));
>
> float4 foo(float4 A, float4 B) {
>   float4 X = A - B;
>   float4 Y = A + B;
>   return (float4){X[0], Y[1], X[2], Y[3]};
> }
> ///
>
> Before this patch (with flag -mcpu=corei7), we produced the following
> assembly sequence:
> movaps %xmm0, %xmm2
> addps %xmm1, %xmm2
> subps %xmm1, %xmm0
> blendps $10, %xmm2, %xmm0
>
>
> With this patch, we now produce a single instruction:
> addsubps %xmm1, %xmm0
>
> Please let me know if ok to commit.
>
> Thanks,
> Andrea Di Biagio.
> <patch-addsub.diff>
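
As a rough illustration of why the "add + sub + blend" sequence quoted above can collapse into one ADDSUBPS, here is a scalar C sketch of the lanes involved. The helper names (blend4, addsub4, sequences_match) are hypothetical and are not part of the patch. The sketch relies only on two facts: BLENDPS takes lane i from its source operand when bit i of the immediate is set, and ADDSUBPS subtracts in even lanes and adds in odd lanes. It does not attempt to reproduce the i32 constants used in the patch's input patterns, which are not shown in this message.

```c
/* Hypothetical scalar model of BLENDPS: lane i is taken from src when
   bit i of imm is set, otherwise dst keeps its own lane. */
void blend4(float dst[4], const float src[4], unsigned imm) {
    for (int i = 0; i < 4; ++i)
        if (imm & (1u << i))
            dst[i] = src[i];
}

/* Hypothetical scalar model of ADDSUBPS: subtract in even lanes,
   add in odd lanes. */
void addsub4(float dst[4], const float src[4]) {
    for (int i = 0; i < 4; ++i)
        dst[i] = (i & 1) ? dst[i] + src[i] : dst[i] - src[i];
}

/* Returns 1 when the add + sub + blend sequence and the single
   addsub model produce identical lanes for inputs a and b. */
int sequences_match(const float a[4], const float b[4]) {
    float sum[4], diff[4], fused[4];
    for (int i = 0; i < 4; ++i) {
        sum[i] = a[i] + b[i];   /* addps */
        diff[i] = a[i] - b[i];  /* subps */
        fused[i] = a[i];
    }
    /* blendps $10: 10 == 0b1010, so lanes 1 and 3 come from the sum
       and lanes 0 and 2 keep the difference. */
    blend4(diff, sum, 10);
    addsub4(fused, b);
    for (int i = 0; i < 4; ++i)
        if (diff[i] != fused[i])
            return 0;
    return 1;
}
```

Read this way, the immediate 10 is just the bit pattern that picks the odd lanes from the add result, which is the same lane assignment as the { X[0], Y[1], X[2], Y[3] } return value in the example.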