[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

Ristow, Warren via llvm-dev llvm-dev at lists.llvm.org
Tue Nov 15 17:15:34 PST 2016


Hi all,

This is about https://reviews.llvm.org/D26708

Currently when the command-line switch '-ffast-math' is specified, the
IR-level fast-math-flag 'fast' gets attached to appropriate FP math
instructions.  That flag acts as an "umbrella" to implicitly turn on all the
other fast-math-flags ('nnan', 'ninf', 'nsz' and 'arcp'):

http://llvm.org/docs/LangRef.html#fast-math-flags

This approach has the shortcoming that when there is a desire to disable one
of these fast-math-flags, if the 'fast' flag remains, it implicitly
re-enables the one being disabled.  For example, compiling this test-case:

    extern void use(float x, float y);
    void test(float a, float b, float c) {
      float q1 = a / c;
      float q2 = b / c;
      use(q1, q2);
    }

at '-O2 -ffast-math' performs a reciprocal transformation, so only one division
is done (as desired with fast-math).  Compiling it with:

  -O2 -ffast-math -fno-reciprocal-math

should disable the reciprocal transformations (the flag 'arcp'), but leave
all the other fast-math transformations enabled.  The current implementation
doesn't do that, since the 'fast' IR-level flag still gets set.
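
To make the transformation concrete, here is a hand-written sketch of what the
reciprocal rewrite amounts to for the test-case above.  The function names are
made up for illustration; this is not compiler output, just the same arithmetic
written out both ways:

```c
/* Two independent divisions, as written in the original test-case. */
static void quotients_by_division(float a, float b, float c,
                                  float *q1, float *q2) {
  *q1 = a / c;
  *q2 = b / c;
}

/* Hand-written equivalent of the reciprocal transformation:
   one division to form 1/c, then two multiplications.
   (Hypothetical helper, written by hand for illustration.) */
static void quotients_by_reciprocal(float a, float b, float c,
                                    float *q1, float *q2) {
  float r = 1.0f / c;
  *q1 = a * r;
  *q2 = b * r;
}
```

When 'c' is a power of two the two versions agree exactly; for other values the
results can differ in the last bit, because 'a * (1/c)' rounds twice where
'a / c' rounds once.  That difference is why the rewrite is gated by 'arcp'.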

Motivation for this discussion: https://llvm.org/bugs/show_bug.cgi?id=27372#c2

As an aside, when '-ffast-math' is specified on the command-line, the
following six switches are all passed to cc1:

    -menable-no-infs
    -menable-no-nans
    -fno-signed-zeros
    -freciprocal-math
    -fno-trapping-math
    -ffp-contract=fast

and '-ffast-math' itself is also passed to cc1 (the act of passing '-ffast-math'
to cc1 results in the macro '__FAST_MATH__' being defined).  When (for
example) '-fno-reciprocal-math' is passed in addition to '-ffast-math', then
'-freciprocal-math' is no longer passed to cc1 (and the other five listed
above still are passed, along with '-ffast-math' still being passed).  It
seems like the intention was that these individual switches were to enable
the individual floating-point transformations (and so the lack of any of
those switches would suppress the relevant transformations), but the
'-ffast-math' "umbrella" is overriding the attempted suppression.

The change proposed at https://reviews.llvm.org/D26708 deals with this issue
just for the reciprocal-transformation case, but it changes the semantics of
the 'fast' IR-level flag so that it no longer implies all the others.  With
that proposed approach, the back-end would no longer check an "umbrella" flag
such as 'fast' (along with an individual flag like 'arcp'); it would check
only the individual flag ('arcp').  Any fast-math-related transformation that
doesn't have an individual flag (e.g., re-association currently doesn't)
should eventually have an individual flag defined for it, and that individual
flag should then be checked.
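
As a rough model of the difference, the two styles of check can be sketched
with a plain bitmask.  The enum values and function names below are
hypothetical and are not LLVM's actual encoding or API; they only illustrate
the "umbrella" check versus the individual-flag check:

```c
/* Hypothetical bit encoding of the five flags; NOT LLVM's real values. */
enum fmf {
  FMF_NNAN = 1 << 0,
  FMF_NINF = 1 << 1,
  FMF_NSZ  = 1 << 2,
  FMF_ARCP = 1 << 3,
  FMF_FAST = 1 << 4
};

/* Current semantics: 'fast' implies every other flag, so the
   reciprocal transformation is allowed if either bit is set. */
static int allows_reciprocal_umbrella(unsigned flags) {
  return (flags & (FMF_FAST | FMF_ARCP)) != 0;
}

/* Proposed semantics: only 'arcp' itself is consulted. */
static int allows_reciprocal_individual(unsigned flags) {
  return (flags & FMF_ARCP) != 0;
}
```

With the flag set that would correspond to '-ffast-math -fno-reciprocal-math'
in this model ('fast', 'nnan', 'ninf' and 'nsz' set, 'arcp' clear), the
umbrella check still permits the transformation, while the individual check
suppresses it.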

What do people think?

Thanks,
-Warren