[LLVMdev] Representing -ffast-math at the IR level

Sat Apr 14 13:34:29 PDT 2012

On Sat, Apr 14, 2012 at 11:44 PM, Duncan Sands <baldrick at free.fr> wrote:

>> I think you have a step in the right direction, walking away from ULPs, which
>> are pretty useless for the purpose of describing allowed fp optimizations,
>> IMHO. But using just a "fast" keyword (or whatever else will be added in the
>> future) is not enough without a strict definition of this keyword in terms
>> of IR transformations. For example, a particular transformation may need to
>> know whether reassociation is allowed ((a+b)+c => a+(b+c)), whether fp
>> contraction is allowed (a*b+c => fma(a,b,c)), whether addition of zero may
>> be canceled (x+0 => x), etc. If this definition is not given at the
>> infrastructure level, this may lead to disaster, with each transformation
>> interpreting "fast" in its own way.
>
> This is actually the main reason for using metadata rather than a flag like
> the "nsw" flag on integer operations: it is easily extensible with more info
> to say whether reassociation is OK and so forth.
>
> The kinds of transforms I think can reasonably be done with the current
> information are things like: x + 0.0 -> x; x / constant -> x * (1 / constant)
> if constant and 1 / constant are normal (and not denormal) numbers.
>
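
As a concrete reading of the second transform quoted above, the legality check could be sketched roughly as follows. The helper names canUseReciprocal and divideViaReciprocal are hypothetical and not part of LLVM; the sketch only tests that the constant and its reciprocal are normal doubles, using std::isnormal.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not an LLVM API): decide whether x / C may be
// rewritten as x * (1 / C) under the rule quoted above.
inline bool canUseReciprocal(double C) {
  // std::isnormal is false for zero, denormals, infinities and NaN.
  if (!std::isnormal(C))
    return false;
  const double Recip = 1.0 / C;
  return std::isnormal(Recip);
}

// The rewrite itself; the caller is assumed to have checked canUseReciprocal.
inline double divideViaReciprocal(double X, double C) {
  assert(canUseReciprocal(C) && "reciprocal would not be a normal number");
  return X * (1.0 / C);
}
```

For example, canUseReciprocal(4.0) holds, while canUseReciprocal(1e308) does not, since 1/1e308 is a denormal in double precision.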

The particular definition matters less than the fact that a definition
exists :) I.e., I think we need a defined set of transformations (most likely
as an enum, as Renato pointed out) and an interface that accepts an
"fp-model" ("fast", "strict", or whatever keywords we end up with) and a
particular transformation, and returns true or false depending on whether the
definition of that fp-model allows the transformation. A transformation would
then ask, for example, whether reassociation is allowed.
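
To make the idea concrete, here is a minimal sketch of such an interface. All names (FPTransform, FPModel, isFPTransformAllowed) are hypothetical, and the policy assigned to each model is only a placeholder assumption; the point is that the mapping lives in one place instead of being reinterpreted by every pass.

```cpp
#include <cassert>

// Hypothetical enum of fp transformations a pass may ask about.
enum class FPTransform {
  Reassociate,      // (a+b)+c => a+(b+c)
  Contract,         // a*b+c => fma(a,b,c)
  CancelZeroAdd,    // x+0 => x
  ReciprocalDivide  // x/c => x*(1/c)
};

// Hypothetical fp-model keywords.
enum class FPModel { Strict, Precise, Fast };

// Single point of truth: does Model permit transformation T?
// The table below is an illustrative assumption, not a proposed definition.
inline bool isFPTransformAllowed(FPModel Model, FPTransform T) {
  switch (Model) {
  case FPModel::Strict:
    return false;                        // nothing value-changing
  case FPModel::Precise:
    return T == FPTransform::Contract;   // placeholder policy
  case FPModel::Fast:
    return true;                         // everything listed above
  }
  assert(false && "unknown FPModel");
  return false;
}
```

A reassociation pass would then call isFPTransformAllowed(Model, FPTransform::Reassociate) and never inspect the model keyword itself.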

Another point, important from a practical point of view, is that the fp-model
is almost always the same for all instructions in a function (or even a
module), so tagging every instruction with fp-model metadata is quite
a substantial waste of resources. It makes sense to me to have a default
fp-model defined for the function or module, which can be overridden with
instruction metadata.
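
The lookup order I have in mind could be sketched like this. The types ModuleFP, FunctionFP and InstFP are hypothetical stand-ins rather than LLVM classes, and using an Inherit value to mean "no override here" is just one possible encoding.

```cpp
#include <cassert>

// Hypothetical model keywords; Inherit means "not set at this level".
enum class FPModel { Inherit, Strict, Precise, Fast };

// Hypothetical stand-ins for module, function and instruction state.
struct ModuleFP   { FPModel Default; };                         // must be concrete
struct FunctionFP { const ModuleFP *Parent;   FPModel Model; }; // optional override
struct InstFP     { const FunctionFP *Parent; FPModel Model; }; // optional override

// Resolve the effective model: instruction, then function, then module.
inline FPModel effectiveFPModel(const InstFP &I) {
  if (I.Model != FPModel::Inherit)
    return I.Model;
  const FunctionFP &F = *I.Parent;
  if (F.Model != FPModel::Inherit)
    return F.Model;
  assert(F.Parent->Default != FPModel::Inherit &&
         "module default must name a concrete model");
  return F.Parent->Default;
}
```

With this scheme only the few instructions that differ from their function's default carry any per-instruction information.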

I also understand that clang generally inherits GCC's switches, and the fp
precision switches are no exception, but I'd like to point out that there is
a far more orderly way of defining the fp precision model (IMHO, of
course :-) ), adopted by MS and the Intel Compiler (-fp-model
[strict|precise|fast]). It would be nice to have it adopted in clang.

But while adding MS-style fp-model switches is a different topic (and, I
guess, quite an arguable one), I mention it to show the importance of
abstracting the compiler's internal fp-model from external switches and
exposing a querying interface to transformations. Transformations shouldn't
care about the particular model; they only need to know whether a particular
type of transformation is allowed.

Dmitry.

>
> Ciao, Duncan.
>
>
>> Dmitry.
>>
>> On Sat, Apr 14, 2012 at 10:28 PM, Duncan Sands <baldrick at free.fr> wrote:
>>
>>    The attached patch is a first attempt at representing "-ffast-math" at
>>    the IR level, in fact on individual floating point instructions (fadd,
>>    fsub etc).  It is done using metadata.  We already have a "fpmath"
>>    metadata type which can be used to signal that reduced precision is OK
>>    for a floating point operation, eg
>>
>>        %z = fmul float %x, %y, !fpmath !0
>>        ...
>>        !0 = metadata !{double 2.5}
>>
>>    indicates that the multiplication can be done in any way that doesn't
>>    introduce more than 2.5 ULPs of error.
>>
>>    The first observation is that !fpmath can be extended with additional
>>    operands in the future: operands that say things like whether it is OK
>>    to assume that there are no NaNs and so forth.
>>
>>    This patch doesn't add additional operands though.  It just allows the
>>    existing accuracy operand to be the special keyword "fast" instead of a
>>    number:
>>
>>        %z = fmul float %x, %y, !fpmath !0
>>        ...
>>        !0 = metadata !{metadata !"fast"}
>>
>>    This indicates that accuracy loss is acceptable (just how much is
>>    unspecified) for the sake of speed.  Thanks to Chandler for pushing me
>>    to do it this way!
>>
>>    It also creates a simple way of getting and setting this information:
>>    the FPMathOperator class: you can cast appropriate instructions to this
>>    class and then use the querying/mutating methods to get/set the
>>    accuracy, whether 2.5 or "fast".  The attached clang patch uses this to
>>    set the OpenCL 2.5 ULPs accuracy rather than doing it by hand for
>>    example.
>>
>>    In addition it changes IRBuilder so that you can provide an accuracy
>>    when creating floating point operations.  I don't like this so much.  It
>>    would be more efficient to just create the metadata once and then splat
>>    it onto each instruction.  Also, if fpmath gets a bunch more
>>    options/operands in the future then this interface will become more and
>>    more awkward.  Opinions welcome!
>>
>>    I didn't actually implement any optimizations that use this yet.
>>
>>    I took a look at the impact on aermod.f90, a reasonably floating point
>>    heavy Fortran benchmark (4% of the human readable IR consists of
>>    floating point operations).  At -O3 (the worst), the size of the bitcode
>>    increases by 0.8%.  No idea if that's acceptable - hopefully it is!
>>
>>    Enjoy!
>>
>>    Duncan.
>>