[llvm-commits] LLVM patch to support ARM fused multiply add/subtract instructions

Wed Jan 25 07:38:12 PST 2012

> I should point out that the C standard defines the FP_CONTRACT pragma for
> exactly this purpose (7.12.2).  Off the top of my head, I'm not sure what
> other languages have to say on the subject.

Incidentally we've been looking into how to support that pragma, and there
is currently no way to control LLVM's behaviour on FMA on a per-block basis,
just per-module.

Perhaps an attribute that could be applied to fmuls, fadds and fsubs might
be an interesting option?

This all depends on just how important this pragma, and being "fully
standards compliant" in general, is. GCC doesn't support it, for example
(although it defaults to no FMA unless -ffast-math is specified).

James

-----Original Message-----
From: llvm-commits-bounces at cs.uiuc.edu
[mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Stephen Canon
Sent: 25 January 2012 15:27
To: Hal Finkel
Cc: llvm-commits at cs.uiuc.edu; Anton Korobeynikov
Subject: Re: [llvm-commits] LLVM patch to support ARM fused multiply
add/subtract instructions

On Jan 25, 2012, at 10:07 AM, Hal Finkel wrote:

> On Wed, 2012-01-25 at 17:42 +0400, Anton Korobeynikov wrote:
>> Hi Ana,
>> 
>>> In this update:
>>> - I assumed neon2 does not imply vfpv4, but neon and vfpv4 imply neon2.
>>> - I kept setting .fpu=neon-vfpv4 code attribute because that is what the
>>> assembler understands.
>> Looks ok.
>> 
>>> The additional changes mentioned in the email discussions I think belong to
>>> a separate patch:
>>> - Associate VMLA/VMLS with LessPreciseFPMAD flag, and maybe with fast-math
>>> flag.
>> They should definitely not be. They are not less precise! They are
>> "exactly precise" as two separate ops. It's just FMA which has greater
>> precision than usual thanks to 1 rounding.
>> And it's FMA which needs to be associated with -ffast-math on VFPv2
> 
> Just to be clear, are you advocating associating this with UnsafeFPMath
> or with !NoExcessFPPrecision? I think that it should be the latter, as
> that is what the PPC backend does (and that seems to match the intent of
> the TargetOptions API authors), but unlike -ffast-math
> (-enable-unsafe-fp-math), this will cause the patterns to be enabled by
> default.

Controlling the contraction of a*b + c into fma(a,b,c) is a thorny issue.  Such
contractions often give more accurate results, but they can also sabotage
certain important calculations.  As an example, consider squaring a complex
number:

	double complex z = CMPLX(M_PI, M_PI);
	double complex w = z*z;

Let's call the real and imaginary parts of z x and y, respectively.  Then
the real part of w is given by:

	double real_w = x*x - y*y;

If evaluated without contraction, x*x and y*y are both rounded to the same
value, so the subtraction cancels exactly and produces the correct result.
If contraction is used, then we get something like:

	double real_w = fma(x, x, -y*y);

Since no rounding occurs on the intermediate product x*x, the result is not
exactly zero, but is instead the rounding error of y*y (roughly the low 53
bits of the exact product).  This sort of effect can introduce nasty
asymmetries into certain calculations.
It's fine for them to be enabled by default, but it should be possible to
toggle them independent of other numerical controls.  !NoExcessFPPrecision
is pretty close to the right idea.  -ffast-math seems wrong.

I should point out that the C standard defines the FP_CONTRACT pragma for
exactly this purpose (7.12.2).  Off the top of my head, I'm not sure what
other languages have to say on the subject.
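As a rough sketch of how the pragma is meant to be used (the function name is hypothetical), it can be placed at the start of a compound statement so that it applies only to that block. Whether it has any effect depends on the compiler: one that does not implement the pragma may simply ignore it, possibly with a warning.

```c
/* Request that a*b + c style expressions in this function are not
 * contracted into a fused multiply-add. The pragma must come before any
 * declarations or statements in the block and lasts until the block ends. */
double real_part_no_contract(double x, double y) {
    #pragma STDC FP_CONTRACT OFF
    return x * x - y * y;
}
```

A file-scope `#pragma STDC FP_CONTRACT OFF` placed outside any function would instead apply to the rest of the translation unit, or until another FP_CONTRACT pragma is seen.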

- Steve