[LLVMdev] Representing -ffast-math at the IR level

Mon Apr 16 10:40:41 PDT 2012

Hi Owen,

> I have some issues with representing this as a single "fast" mode flag,

it isn't a single flag, that's the whole point of using metadata.  OK, right
now there is only one option (the "accuracy"), true, but the intent is that
others will be added, and the meaning of accuracy tightened, later.  MDBuilder
has a createFastFPMath method which is intended to produce settings that match
GCC's -ffast-math, however frontends will be able to specify whatever settings
they like if that doesn't suit them (i.e. createFPMath will get more arguments
as more settings become available).
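
To illustrate the "bundle of settings rather than one flag" idea, here is a toy
Python model.  It is not LLVM's MDBuilder: the names FPMathSettings,
create_fp_math and create_fast_fp_math are made up for this sketch, and the
rule that a numeric accuracy must be positive is an assumption.

```python
from dataclasses import dataclass
from typing import Union

FAST = "fast"  # stands in for the special "fast" accuracy keyword


@dataclass(frozen=True)
class FPMathSettings:
    """Toy stand-in for the settings attached to one floating point operation."""
    # Either a maximum error in ULPs (for example 2.5) or the keyword FAST.
    accuracy: Union[float, str]
    # Any further settings would become extra fields here (hypothetical).


def create_fp_math(accuracy):
    """Build a settings record from an explicit accuracy."""
    if accuracy != FAST:
        # Assumption of this sketch: a numeric accuracy must be positive.
        if not isinstance(accuracy, (int, float)) or accuracy <= 0:
            raise ValueError("accuracy must be a positive number or FAST")
    return FPMathSettings(accuracy)


def create_fast_fp_math():
    """Preset meant to mimic a -ffast-math style default."""
    return create_fp_math(FAST)


print(create_fp_math(2.5))
print(create_fast_fp_math())
```

A frontend that dislikes the preset would call create_fp_math directly with
whatever it wants; the preset is only a convenience.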

Note that as the current option isn't actually connected to any optimizations,
there is nothing much to argue about for the moment.

My plan is to introduce a few simple optimizations (x + 0.0 -> x for example)
that introduce a finite number of ULPs of error, and hook them up.  Thus this
does not include things like x * 0.0 -> 0.0 (infinite ULPs of error),
reassociation (infinite ULPs of error) or any other scary things.
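
A small Python sketch of the difference, using ordinary IEEE doubles (the
helper names are made up for illustration).  Folding x + 0.0 to x only gets the
sign of zero wrong when x is -0.0, while folding x * 0.0 to 0.0 replaces a NaN
by 0.0 when x is infinite or NaN, and also loses the sign of zero for negative x.

```python
import math


def same_value(a, b):
    """True if a and b are the same float; -0.0 and +0.0 differ, NaN matches NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


def add_zero_is_identity(x):
    """Does rewriting x + 0.0 as x give exactly the same result for this x?"""
    return same_value(x + 0.0, x)


def mul_zero_is_zero(x):
    """Does rewriting x * 0.0 as 0.0 give exactly the same result for this x?"""
    return same_value(x * 0.0, 0.0)


for x in (1.5, -0.0, math.inf, math.nan):
    print(x, add_zero_is_identity(x), mul_zero_is_zero(x))
```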

> which mostly boil down to the fact that this is a very C-centric view of the
> world.  And, since C compilers are not generally known for their awesomeness on
> issues of numerics, I'm not sure that's a good idea.
> Having something called a "fast" or "relaxed" mode implies that it is less precise than whatever the standard mode is.  However, C is notably sparse in specifying what exactly the standard mode is.  The typical assumption is that it is the strict one-to-one translation to IEEE754 semantics, but no optimizing C compiler actually implements that.

I think this is a misunderstanding of where I'm going, see above.

> Other languages are more interesting in this regard.  Fortran, for instance, allows reassociation within parentheses.  (Can that even be represented with instruction metadata?)

I'm aware of Fortran parentheses (PAREN_EXPR in gcc).  If it can't be expressed
well then too bad: reassociation can just be turned off and we won't optimize
Fortran as well as we could.  (As mentioned above I have no intention of turning
on reassociation based on the current flag since it can introduce an unbounded
number of ULPs of error).
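
For the record, a minimal Python illustration of why reassociation is in the
scary category (function names are made up for this sketch).  The same three
doubles summed with different parenthesization give 1.0 in one case and 0.0 in
the other, and making the large value larger does not help.

```python
def left_to_right(a, b, c):
    """Evaluate (a + b) + c, the order as written."""
    return (a + b) + c


def reassociated(a, b, c):
    """Evaluate a + (b + c), what a reassociating optimizer might produce."""
    return a + (b + c)


big = 1e20
print(left_to_right(big, -big, 1.0))  # 1.0: the big values cancel first
print(reassociated(big, -big, 1.0))   # 0.0: the 1.0 is absorbed by -1e20
```

This is also why Fortran parentheses matter: they pin down which of the two
groupings the program asked for.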

> OpenCL has a fairly strict baseline mode, but specifies a number of specific
> options the user can enable to relax it (-cl-mad-enable, -cl-no-signed-zeros,
> -cl-unsafe-math-optimizations (implies the previous two), -cl-finite-math-only,
> -cl-fast-relaxed-math (implies all prior)).  GLSL has distinct desktop and
> embedded specifications that place different levels of constraint on
> implementations.

Yup.
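
The "implies" relationships between those OpenCL options can be written down as
a small table.  The Python sketch below (the IMPLIES table layout and the expand
helper are made up for illustration) expands a set of options into everything
it switches on.

```python
# Which options each umbrella option switches on, as described above.
IMPLIES = {
    "-cl-unsafe-math-optimizations": {"-cl-mad-enable", "-cl-no-signed-zeros"},
    "-cl-fast-relaxed-math": {
        "-cl-unsafe-math-optimizations",
        "-cl-finite-math-only",
    },
}


def expand(options):
    """Return the given options plus everything they transitively imply."""
    result = set()
    pending = list(options)
    while pending:
        opt = pending.pop()
        if opt not in result:
            result.add(opt)
            pending.extend(IMPLIES.get(opt, ()))
    return result


print(sorted(expand({"-cl-fast-relaxed-math"})))
```

A per-instruction representation needs the individual relaxations, not the
umbrella names, which is the argument for several independent settings.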

>
> If we define the baseline behavior to be strict IEEE conformance,

Which we do.

> and then don't provide a more nuanced method of relaxing it,

Allowing more nuanced ways is the reason for using metadata as explained above.

> we're not going to be in a significantly better world than we are today.  No
> reasonable implementation of these languages wants strict conformance (except
> maybe desktop-profile OpenCL) as their default mode,

Strict conformance is what they get right now.

> nor is there any way a universal definition of "fast" math can work for all of
> them.

I agree, and I'm not trying to provide one.

Ciao, Duncan.

>
> --Owen
>
> On Apr 14, 2012, at 11:28 AM, Duncan Sands<baldrick at free.fr>  wrote:
>
>> The attached patch is a first attempt at representing "-ffast-math" at the IR
>> level, in fact on individual floating point instructions (fadd, fsub etc).  It
>> is done using metadata.  We already have a "fpmath" metadata type which can be
>> used to signal that reduced precision is OK for a floating point operation, eg
>>
>>     %z = fmul float %x, %y, !fpmath !0
>>   ...
>>   !0 = metadata !{double 2.5}
>>
>> indicates that the multiplication can be done in any way that doesn't introduce
>> more than 2.5 ULPs of error.
>>
>> The first observation is that !fpmath can be extended with additional operands
>> in the future: operands that say things like whether it is OK to assume that
>> there are no NaNs and so forth.
>>
>> This patch doesn't add additional operands though.  It just allows the existing
>> accuracy operand to be the special keyword "fast" instead of a number:
>>
>>     %z = fmul float %x, %y, !fpmath !0
>>   ...
>>   !0 = metadata !{metadata !"fast"}
>>
>> This indicates that accuracy loss is acceptable (just how much is unspecified)
>> for the sake of speed.  Thanks to Chandler for pushing me to do it this way!
>>
>> It also creates a simple way of getting and setting this information: the
>> FPMathOperator class: you can cast appropriate instructions to this class
>> and then use the querying/mutating methods to get/set the accuracy, whether
>> 2.5 or "fast".  The attached clang patch uses this to set the OpenCL 2.5 ULPs
>> accuracy, for example, rather than doing it by hand.
>>
>> In addition it changes IRBuilder so that you can provide an accuracy when
>> creating floating point operations.  I don't like this so much.  It would
>> be more efficient to just create the metadata once and then splat it onto
>> each instruction.  Also, if fpmath gets a bunch more options/operands in
>> the future then this interface will become more and more awkward.  Opinions
>> welcome!
>>
>> I didn't actually implement any optimizations that use this yet.
>>
>> I took a look at the impact on aermod.f90, a reasonably floating point heavy
>> Fortran benchmark (4% of the human readable IR consists of floating point
>> operations).  At -O3 (the worst), the size of the bitcode increases by 0.8%.
>> No idea if that's acceptable - hopefully it is!
>>
>> Enjoy!
>>
>> Duncan.
>> <fastm-llvm.diff><fastm-clang.diff>
>