[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

Fri Jan 24 12:36:49 PST 2020

wristow added a comment.

> A separate question is the interaction of `-ffast-math` with `-ffp-contract=`.  Currently, there is no such interaction whatsoever in GCC: `-ffast-math` does not imply any particular `-ffp-contract=` setting, and vice versa the `-ffp-contract=` setting is not considered at all when defining `__FAST_MATH__`. This seems at least internally consistent.

That's interesting, and as you said, internally consistent behavior in GCC.  I think it would be a fine idea for us to do the same thing.

Looking into this point, I see that (ignoring fastmath for the moment) our default `-ffp-contract` behavior differs from GCC's.  GCC enables FMA by default when optimization is high enough ('-O2'), with no special switch needed.  For example, on an architecture that supports FMA (Haswell), GCC behaves as follows:

  float test(float a, float b, float c)
  {
    // GCC enables FMA by default on Haswell once optimization is on:
    // no fusion without an -O flag, fused at -O2.
    //    $ gcc -S -march=haswell test.c ; egrep 'mul|add' test.s
    //            vmulss  -12(%rbp), %xmm0, %xmm0
    //            vaddss  -4(%rbp), %xmm0, %xmm0
    //    $ gcc -S -O2 -march=haswell test.c ; egrep 'mul|add' test.s
    //            vfmadd231ss     %xmm2, %xmm1, %xmm0
    //    $
    return a + (b * c);
  }

As we'd expect, GCC does disable FMA with `-ffp-contract=off` (this is irrespective of whether `-ffast-math` was specified).  Loosely, GCC's behavior can be summarized very simply on this point as:
//Suppress FMA when `-ffp-contract=off`.//
(As an aside, GCC's behavior with `-ffp-contract=on` is non-intuitive to me.  It relates to the FP_CONTRACT pragma, which as far as I can see is ignored by GCC.)

In contrast, we do //not// enable FMA by default (via general optimization, such as '-O2').  For example:

  $ clang -S -O2 -march=haswell test.c ; egrep 'mul|add' test.s
          vmulss  %xmm2, %xmm1, %xmm1
          vaddss  %xmm0, %xmm1, %xmm0
          .addrsig
  $

I think that whether we want to continue doing that, or instead enable it at '-O2' as GCC does, is a separate issue.  I can see arguments either way.

We do enable FMA with `-ffp-contract=fast`, as desired (and also with `-ffp-contract=on`).  And we do "leave it disabled" with `-ffp-contract=off` (as expected).
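
Whether the multiply and add are fused is observable in the result, not just in the generated instructions, which is why an explicit `-ffp-contract=off` needs to be honored.  A minimal sketch of the difference follows.  The function names are made up for illustration, and the second function is only a stand-in for a fused multiply-add: it keeps the product exact by computing it in double, rather than calling `fmaf()`.

```c
/* Unfused: b * c is rounded to float, then the add is rounded again.
 * The volatile temporary keeps the compiler from contracting the two
 * operations into an FMA, whatever -ffp-contract setting is in effect. */
float mul_then_add(float a, float b, float c)
{
    volatile float product = b * c;
    return a + product;
}

/* Stand-in for a fused multiply-add (an assumption of this sketch):
 * the product of two floats is exact in double, so the product is not
 * rounded before the add.  In general the sum can still round twice
 * (once to double, once to float), so this is not bit-identical to
 * fmaf() for all inputs. */
float exact_product_then_add(float a, float b, float c)
{
    double product = (double)b * (double)c;
    return (float)(product + (double)a);
}
```

With a = -(1 + 2^-11) and b = c = 1 + 2^-12, the exact product is 1 + 2^-11 + 2^-24.  Rounding that product to float drops the 2^-24 term, so `mul_then_add` returns 0, while `exact_product_then_add` returns 2^-24.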

Now, bringing fastmath back into the discussion, we //do// enable FMA with `-ffast-math`.  If we decide to continue leaving it disabled by default, then enabling it with `-ffast-math` seems perfectly sensible.  (If we decide to enable it by default when optimization is high enough, like GCC, then turning on `-ffast-math` should not disable it of course.)

The problem I want to address here is that if the compiler is in a mode where FMA is enabled (whether that's at '-O2' "by default", or whether it's because the user turned on `-ffast-math`), then appending `-ffp-contract=off` //should// disable FMA.  I think this patch (along with the small change to "DAGCombiner.cpp", in an earlier version of this patch) is a reasonable approach to solving that.  I'd say this patch/review/discussion has raised two additional questions:

1. Under what conditions should `__FAST_MATH__` be defined?
2. Should we enable FMA "by default" at (for example) '-O2'?

I think these additional questions are best addressed separately.  My 2 cents are that for (1), mimicking GCC's behavior seems reasonable (although that's assuming we don't find out that GCC's `__FAST_MATH__` behavior is a bug).  And for (2), I don't have a strong preference.
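
To make the behavior I'm arguing for concrete, here is a rough sketch of the option resolution.  This is not the driver code from the patch: the enum, the function name `resolve_fp_contract`, and the "last option on the command line wins" rule are assumptions made for illustration.  The only part taken from the discussion above is that `-ffast-math` turns contraction on and that a later `-ffp-contract=off` turns it back off.

```c
#include <string.h>

/* Contraction modes, named after the -ffp-contract= values. */
enum contract_mode { CONTRACT_OFF, CONTRACT_ON, CONTRACT_FAST };

/* Hypothetical helper: walk the options left to right and let each
 * relevant one override the mode chosen so far.  `initial` is whatever
 * the compiler would use with none of these options present. */
enum contract_mode resolve_fp_contract(int argc, const char *const argv[],
                                       enum contract_mode initial)
{
    enum contract_mode mode = initial;
    for (int i = 0; i < argc; i++) {
        const char *arg = argv[i];
        if (strcmp(arg, "-ffast-math") == 0)
            mode = CONTRACT_FAST;   /* fast-math enables FMA formation */
        else if (strcmp(arg, "-ffp-contract=fast") == 0)
            mode = CONTRACT_FAST;
        else if (strcmp(arg, "-ffp-contract=on") == 0)
            mode = CONTRACT_ON;
        else if (strcmp(arg, "-ffp-contract=off") == 0)
            mode = CONTRACT_OFF;    /* explicit off is honored */
        /* all other options leave the mode unchanged */
    }
    return mode;
}
```

Under this sketch, `-O2 -ffast-math -ffp-contract=off` resolves to off, which is the case this patch is about.  How the reverse order (`-ffp-contract=off -ffast-math`) should behave is not settled by anything above; the sketch simply lets the later option win.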


https://reviews.llvm.org/D72675