[cfe-dev] [RFC] FP Contract = fast?

Wed Mar 15 06:27:43 PDT 2017

Folks,

I've been asking around people about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

i haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (less rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge of GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
dong on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato