[LLVMdev] Representing -ffast-math at the IR level

orthochronous orthochronous at gmail.com
Mon Apr 16 07:30:52 PDT 2012


[Resend as I forgot this list doesn't set reply-to to list. Oops]


On Sun, Apr 15, 2012 at 10:20 AM, Renato Golin <rengolin at systemcall.org> wrote:
> On 15 April 2012 09:07, Duncan Sands <baldrick at free.fr> wrote:
>> Link-time optimization will sometimes result in "fast-math" functions being
>> inlined into non-fast math functions and vice-versa.  This pretty much
>> inevitably means that per-instruction fpmath options are required.
>
> I guess it would be user error if a strict function used the results
> of a non-strict function (explicitly compiled with -ffast-math) and
> then complained about loss of precision. In that case, having
> inlining keep the option per-instruction makes total sense.

As a writer of numerical code, the perspective being taken here seems
bizarre. I would never write code, or use optimizations, that I expect
to produce inaccurate results. What I would do is write code which,
_for the input data it is going to see_, is not going to be (to any
noticeable degree) less accurate when certain optimizations are
applied. (It's well known that for most optimizations there are some
sets of input data that cause big changes in accuracy; however, there
seems to be no neat way of telling the compiler that those inputs
aren't going to occur, other than by specifying modes/allowed
transformations.) As such, when code that uses more optimizations
("fast-math flagged code") is inlined into more sensitive code, the
caller's instructions need to keep "strict math" to retain the
accuracy through to the result.

My personal interest is in automatic differentiation, where there are
two kinds of "variable entities" in the code after
auto-differentiation: original variables and derivatives, and it is
desirable to apply different fp optimizations to the two kinds. (In
particular, the 0*x -> 0 simplification is quite important for
shrinking the number of "pointless" instructions generated for
derivatives.) However, I have to admit I can't think of any other
problem where I'd want control over fp optimizations at the
per-instruction level, so I don't know whether it's worth it for the
LLVM codebase in general.
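To make the 0*x -> 0 point concrete, here is a minimal forward-mode AD
sketch in Python (the `Dual` class and `f` are my own illustrative
names, not from the original post or any real framework): the product
rule routinely multiplies a derivative part by a constant zero, which
is exactly the kind of term that simplification folds away.

```python
# Minimal forward-mode automatic differentiation with dual numbers.
# Illustrative sketch only: the derivative arithmetic naturally emits
# products with a zero operand, which the 0*x -> 0 simplification
# could eliminate after inlining.

class Dual:
    def __init__(self, val, dot=0.0):
        self.val = val  # original variable
        self.dot = dot  # derivative part

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: when one operand is a constant (dot == 0),
        # one of these two products is a multiplication by zero.
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)

def f(x):
    # f(x) = 3*x*x + 2; the constants enter as Dual(c, 0), so the
    # product rule generates 0*x terms a compiler could fold away.
    return Dual(3.0) * x * x + Dual(2.0)

x = Dual(2.0, 1.0)   # seed dx/dx = 1
y = f(x)
print(y.val, y.dot)  # f(2) = 14, f'(2) = 12
```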

Finally, a minor aside: at EuroLLVM I was discussing with Duncan Sands
whether the FP optimizations would apply to vector ops as well as
scalar ops, and he mentioned that the plan was to mirror the integer
case, where vector code should be optimized as well as scalar code.

Since there are no FP optimizations yet, I looked at what LLVM
produces for integer code for

t0 := a * b
t1 := c * d
t2 := t0 + t1
t3 := t2 + e
return t3

in the 16 cases where both a and c range over {variable, -1, 0, +1},
in both the scalar and vector cases. The good news is that in each
case both scalar and vector code get fully optimized; interestingly,
however, different choices get made in a couple of cases between the
vector and scalar versions. (Basically, given an expression like
w+x+y-z there are various ways to build it from binary instructions,
and different choices seem to be made.)
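For reference, here is a small Python sketch (my own reconstruction,
not the actual test harness) of the simplifications the test checks
for: folding 0*x -> 0, 1*x -> x and -1*x -> -x in a*b + c*d + e for
each of the 16 (a, c) cases.

```python
# Enumerate the 16 (a, c) cases from the post and print the folded
# symbolic form of a*b + c*d + e. This mirrors the simplifications an
# optimizer is expected to perform; it is an illustration, not the
# original test code.

CASES = [None, -1, 0, 1]   # None stands for "variable"

def mul(coef, name, var):
    """Symbolic coef*var for coef in {None, -1, 0, 1}."""
    if coef is None:
        return f"{name}*{var}"  # stays a real multiplication
    if coef == 0:
        return None             # 0*x -> 0: the term vanishes
    if coef == 1:
        return var              # 1*x -> x
    return f"-{var}"            # -1*x -> -x

def expr(a, c):
    terms = [mul(a, "a", "b"), mul(c, "c", "d"), "e"]
    terms = [t for t in terms if t is not None]
    out = terms[0]
    for t in terms[1:]:
        # Render "+ -d" as "- d" for readability.
        out = out + " + " + t if not t.startswith("-") else out + " - " + t[1:]
    return out

for a in CASES:
    for c in CASES:
        print(f"a={a!s:>8}, c={c!s:>8}: {expr(a, c)}")
```

For example, the (a=0, c=0) case collapses all the way down to just
`e`, while (a=1, c=-1) becomes `b - d + e`.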

Anyway, I'll rerun this test code for FP mode once there are some FP
optimizations implemented.

HTH,
Dave Tweed
