[llvm] r211679 - [X86] Add target combine rule to select ADDSUB instructions from a build_vector

Hal Finkel hfinkel at anl.gov
Wed Jun 25 07:54:49 PDT 2014


----- Original Message -----
> From: "Andrea Di Biagio" <andrea.dibiagio at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Andrea Di Biagio" <Andrea_DiBiagio at sn.scee.net>, "llvm-commits at cs.uiuc.edu for LLVM" <llvm-commits at cs.uiuc.edu>
> Sent: Wednesday, June 25, 2014 6:57:11 AM
> Subject: Re: [llvm] r211679 - [X86] Add target combine rule to select ADDSUB instructions from a build_vector
> 
> Hi Hal,
> 
> I can confirm that after r211427 we correctly match a 'subadd' into an
> x86 ADDSUB instruction.
> To verify this I redirected to 'llc -mcpu=corei7-avx' the output of
> test 'addsub.ll' added at revision r211339.

Good; thanks for looking at this...

> 
> So, if you run
> 
> opt < addsub.ll -basicaa -slp-vectorizer -S | llc -march=x86-64
> -mcpu=corei7-avx | less
> 
> you can see how we correctly emit an 'addsub' instruction in the case
> of function @fsubfadd.
> We don't emit an 'addsub' for the other functions but that's fine for
> the following reasons:
>  1- 'addsub' on X86 is only for packed floating-point vectors
> (therefore, we cannot match an 'addsub' in the case of the integer
> functions @addsub and @subadd);

Yes, this is a different issue.

>  2- the semantics of 'addsub' for x86 require that even-numbered
> elements are subtracted (not added). That means we cannot match an
> 'addsub' in function @faddfsub.

You're right; however, I think that we can still do this in combination
with another instruction: subadd(x, y) == addsub(x, -y), where subadd
adds in the even lanes and subtracts in the odd ones. The negation can
be lowered as an xor on the sign bits, IIRC, which is more efficient
than actually using a floating-point multiply by -1. In intrinsics, the
negation would be something like
_mm256_xor_ps(y, _mm256_set1_ps(-0.0f)). Is this reasonable?
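
As a rough scalar sketch of the sign-flip idea (not the actual lowering
code): addsub_model below mimics the ADDSUBPS lane behaviour described
above (subtract in even lanes, add in odd lanes), and flip_sign does per
element what an xor with a -0.0f mask would do per lane. The helper
names and the 8-lane limit are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of the ADDSUBPS lane semantics: even-numbered lanes
   compute x - y, odd-numbered lanes compute x + y. */
static void addsub_model(const float *x, const float *y, float *out, int n) {
    for (int i = 0; i < n; ++i)
        out[i] = (i % 2 == 0) ? x[i] - y[i] : x[i] + y[i];
}

/* Negate a float by xor-ing its sign bit, which is what an xor with a
   -0.0f mask does to each lane of a vector. */
static float flip_sign(float v) {
    uint32_t bits;
    memcpy(&bits, &v, sizeof bits);
    bits ^= UINT32_C(0x80000000);
    memcpy(&v, &bits, sizeof v);
    return v;
}

/* Hypothetical helper: add in even lanes, subtract in odd lanes, built
   from addsub_model applied to a sign-flipped copy of y. */
static void subadd_via_addsub(const float *x, const float *y, float *out, int n) {
    float neg_y[8];
    assert(n >= 0 && n <= 8); /* the sketch only handles up to 8 lanes */
    for (int i = 0; i < n; ++i)
        neg_y[i] = flip_sign(y[i]);
    addsub_model(x, neg_y, out, n);
}
```

In this model every lane of y is negated, so the even lanes become
x - (-y) = x + y and the odd lanes become x + (-y) = x - y, which is the
@faddfsub pattern.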

Thanks again,
Hal

> 
> Instead, we correctly match an 'addsub' in the case of function
> @fsubfadd.
> 
> On x86, excluding function @fsubfadd, other functions cannot be
> translated using 'addsubps/addsubpd' (and their AVX variants).
> 
> I hope this helps :-)
> 
> Andrea
> 
> On Wed, Jun 25, 2014 at 12:01 PM, Andrea Di Biagio
> <andrea.dibiagio at gmail.com> wrote:
> > Hi Hal,
> >
> >
> >
> > On Wed, Jun 25, 2014 at 11:35 AM, Hal Finkel <hfinkel at anl.gov>
> > wrote:
> >> Does this mean that we now match the form of these produced by the
> >> SLP vectorizer? (see r211339)
> >>
> >>  -Hal
> >>
> >
> > Interesting, apparently I missed that commit...
> > I'll have a look at it now.
> >
> > -Andrea
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
