[llvm-commits] Please review: FMA3 instructions set

Stephen Canon scanon at apple.com
Wed May 23 07:33:55 PDT 2012


Just to be a little more precise:

	When FP_CONTRACT is on (the default), Clang will form llvm.fmuladd intrinsics at frontend codegen time from mul/add pairs within an expression.
	When FP_CONTRACT is off, Clang will not form llvm.fmuladd intrinsics at all, and will instead form fmul + fadd.

	The legalizer will replace fmuladd with fma if the target says fma is fast, and with fmul + fadd otherwise.

This allows us to model the strict C language semantics for both settings of FP_CONTRACT, without the front end needing to have any new target-specific knowledge, and without the back end needing to know anything about FP_CONTRACT.

- Steve

On May 23, 2012, at 10:22 AM, Evan Cheng <evan.cheng at apple.com> wrote:

> cc'ing Lang. He is going to make the Clang change to form llvm.fmuladd intrinsics at frontend codegen time (from mul / add within an expression). The backend can then codegen these to fma when FP_CONTRACT is on.
> 
> Evan
> 
> On May 23, 2012, at 6:54 AM, Stephen Canon <scanon at apple.com> wrote:
> 
>> Elena --
>> 
>> It's not quite that simple.
>> 
>> Even if we make [#pragma STDC FP_CONTRACT on] the default (which we should), that licenses FMA formation only *within an expression*.  i.e.:
>> 
>> 	double x,y,z;
>> 	double r = x*y + z; // can be contracted to fma(x,y,z)
>> 	double p = x*y;
>> 	double q = p + z; // cannot be contracted to fma(x,y,z), even if p is never used elsewhere.
>> 
>> We can (and should!) have a relaxed-fp flag that licenses FMA formation wherever LLVM judges it to be a performance win, but that mode will not comply with C floating-point semantics (nor with those of several other languages).
>> 
>> - Steve
>> 
>> On May 23, 2012, at 3:13 AM, "Demikhovsky, Elena" <elena.demikhovsky at intel.com> wrote:
>> 
>>> By default, FMA should be switched on. The "#pragma FP_CONTRACT off" disables FMA. But there is no direct link in the LLVM sources between the pragma and code generation options.
>>> 
>>> - Elena
>>> 
>>> -----Original Message-----
>>> From: Anton Korobeynikov [mailto:anton at korobeynikov.info] 
>>> Sent: Tuesday, May 22, 2012 19:53
>>> To: Stephen Canon
>>> Cc: Demikhovsky, Elena; llvm-commits at cs.uiuc.edu
>>> Subject: Re: [llvm-commits] Please review: FMA3 instructions set
>>> 
>>>> Do I understand correctly that this patch lowers fadd + fmul to fma by default?  We want it to be easy for LLVM to generate fma when it is beneficial to performance, but we can't simply naively lower to it everywhere and still conform to language semantics.  Someone else can speak to what policy should be here, but at the very least we will need to have an option to block fma formation.
>>> 
>>> One can check how fma is implemented on ARM. In general, this should be guarded by the NoExcessPrecision flag or something similar.
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 



