[llvm-commits] LLVM patch to support ARM fused multiply add/subtract instructions

James Molloy James.Molloy at arm.com
Wed Jan 25 08:23:57 PST 2012


> >
> > Perhaps an attribute that could be applied to fmuls, fadds and fsubs might
> > be an interesting option?
>
> Do you mean something like the nsw/nuw flags on add, etc. that means
> "don't compute me at excess precision"?

Yes, that's exactly my proposal. I wasn't sure how well it would be received, however, given that we like to avoid adding extra attributes if possible. I also didn't think anyone cared enough, but this conversation has changed my mind slightly...

-----Original Message-----
From: Hal Finkel [mailto:hfinkel at anl.gov]
Sent: 25 January 2012 16:18
To: James Molloy
Cc: 'Stephen Canon'; llvm-commits at cs.uiuc.edu; Anton Korobeynikov
Subject: RE: [llvm-commits] LLVM patch to support ARM fused multiply add/subtract instructions

On Wed, 2012-01-25 at 15:38 +0000, James Molloy wrote:
> > I should point out that the C standard defines the FP_CONTRACT pragma for
> > exactly this purpose (7.12.2).

This is an excellent point.
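
For reference, a minimal sketch of how the pragma scopes contraction to a single block in C. The function names are hypothetical, and compiler support is an assumption: a compiler that does not implement the pragma may ignore it (possibly with a warning), in which case both functions behave the same.

```c
#include <math.h>

/* Contraction disallowed inside this block: a conforming compiler must
 * round x*x and y*y separately before subtracting. */
double diff_of_squares_strict(double x, double y) {
    #pragma STDC FP_CONTRACT OFF
    return x * x - y * y;
}

/* Contraction permitted inside this block: the compiler may, but need
 * not, fuse one multiply with the subtraction. */
double diff_of_squares_relaxed(double x, double y) {
    #pragma STDC FP_CONTRACT ON
    return x * x - y * y;
}
```

Placed at the start of a compound statement, the pragma applies until the end of that block, which is the per-block granularity discussed in this thread.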

> > Off the top of my head, I'm not sure what
> > other languages have to say on the subject.
>
> Incidentally we've been looking into how to support that pragma, and there
> is currently no way to control LLVM's behaviour on FMA on a per-block basis,
> just per-module.

Great!

>
> Perhaps an attribute that could be applied to fmuls, fadds and fsubs might
> be an interesting option?

Do you mean something like the nsw/nuw flags on add, etc. that means
"don't compute me at excess precision"?

>
> This all depends on just how important this pragma, and being "fully
> standards compliant" in general, is. GCC doesn't support it, for example
> (although it defaults to no FMA unless -ffast-math is specified).

GCC's lack of support for this is probably not something we should aim
to emulate ;)

 -Hal

>
> James
>
> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu
> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Stephen Canon
> Sent: 25 January 2012 15:27
> To: Hal Finkel
> Cc: llvm-commits at cs.uiuc.edu; Anton Korobeynikov
> Subject: Re: [llvm-commits] LLVM patch to support ARM fused multiply
> add/subtract instructions
>
> On Jan 25, 2012, at 10:07 AM, Hal Finkel wrote:
>
> > On Wed, 2012-01-25 at 17:42 +0400, Anton Korobeynikov wrote:
> >> Hi Ana,
> >>
> >>> In this update:
> >>> - I assumed neon2 does not imply vfpv4, but neon and vfpv4 imply neon2.
> >>> - I kept setting .fpu=neon-vfpv4 code attribute because that is what the
> >>> assembler understands.
> >> Looks ok.
> >>
> >>> The additional changes mentioned in the email discussions I think belong to
> >>> a separate patch:
> >>> - Associate VMLA/VMLS with LessPreciseFPMAD flag, and maybe with fast-math
> >>> flag.
> >> They should definitely not be. They are not less precise! They are
> >> "exactly precise" as two separate ops. It's just FMA which has greater
> >> precision than usual thanks to 1 rounding.
> >> And it's FMA which needs to be associated with -ffast-math on VFPv2
> >
> > Just to be clear, are you advocating associating this with UnsafeFPMath
> > or with !NoExcessFPPrecision? I think that it should be the latter, as
> > that is what the PPC backend does (and that seems to match the intent of
> > the TargetOptions API authors), but unlike -ffast-math
> > (-enable-unsafe-fp-math), this will cause the patterns to be enabled by
> > default.
>
> Controlling contracting a*b + c to fma(a,b,c) is a thorny issue.  Such
> contractions often give more accurate results, but they can also sabotage
> certain important calculations.  As an example, consider squaring a complex
> number:
>
>       double complex z = CMPLX(M_PI, M_PI);
>       double complex w = z*z;
>
> Let's call the real and imaginary parts of z x and y, respectively.  Then
> the real part of w is given by:
>
>       double real_w = x*x - y*y;
>
> If evaluated without contraction, x*x and y*y are both rounded to the same
> value, so the subtraction cancels exactly and produces the correct result.
> If contraction is used, then we get something like:
>
>       double real_w = fma(x, x, -y*y);
>
> Since no rounding occurs on the intermediate product x*x, the result is not
> exactly zero, but is instead the low 53 bits of the exact product.  This
> sort of effect can introduce nasty asymmetries into certain calculations.
> It's fine for them to be enabled by default, but it should be possible to
> toggle them independent of other numerical controls.  !NoExcessFPPrecision
> is pretty close to the right idea.  -ffast-math seems wrong.
>
> I should point out that the C standard defines the FP_CONTRACT pragma for
> exactly this purpose (7.12.2).  Off the top of my head, I'm not sure what
> other languages have to say on the subject.
>
> - Steve

--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory


