[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags
Ristow, Warren via llvm-dev
llvm-dev at lists.llvm.org
Tue Nov 15 17:15:34 PST 2016
Hi all,
This is about https://reviews.llvm.org/D26708
Currently when the command-line switch '-ffast-math' is specified, the
IR-level fast-math-flag 'fast' gets attached to appropriate FP math
instructions. That flag acts as an "umbrella" to implicitly turn on all the
other fast-math-flags ('nnan', 'ninf', 'nsz' and 'arcp'):
http://llvm.org/docs/LangRef.html#fast-math-flags
This approach has the shortcoming that when there is a desire to disable one
of these fast-math-flags, if the 'fast' flag remains, it implicitly
re-enables the one being disabled. For example, compiling this test-case:
  extern void use(float x, float y);
  void test(float a, float b, float c) {
    float q1 = a / c;
    float q2 = b / c;
    use(q1, q2);
  }
at '-O2 -ffast-math' performs a reciprocal transformation, so only one
division is done (as desired with fast-math). Compiling it with:
-O2 -ffast-math -fno-reciprocal-math
should disable the reciprocal transformations (the flag 'arcp'), but leave
all the other fast-math transformations enabled. The current implementation
doesn't do that, since the 'fast' IR-level flag still gets set.
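For illustration, here is a hand-written C sketch of what the reciprocal
transformation amounts to for the test-case above. The function names
'quotients_exact' and 'quotients_recip' are hypothetical and exist only for
this example; the compiler does the equivalent rewrite internally rather
than through any such helper.

```c
/* Two independent divisions: what the source code literally asks for. */
void quotients_exact(float a, float b, float c, float *q1, float *q2) {
    *q1 = a / c;
    *q2 = b / c;
}

/* Hand-written equivalent of the reciprocal transformation:
 * one division to form 1/c, then one multiply per quotient. */
void quotients_recip(float a, float b, float c, float *q1, float *q2) {
    float r = 1.0f / c;
    *q1 = a * r;
    *q2 = b * r;
}
```

The two versions can differ in the last bit of the result, because 1/c is
rounded before the multiply. That is why the rewrite is only legal under a
fast-math-flag, and why a user may want to turn it off on its own.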
The motivation for this discussion: https://llvm.org/bugs/show_bug.cgi?id=27372#c2
As an aside, when '-ffast-math' is specified on the command-line, the
following six switches are all passed to cc1:
-menable-no-infs
-menable-no-nans
-fno-signed-zeros
-freciprocal-math
-fno-trapping-math
-ffp-contract=fast
and '-ffast-math' itself is also passed to cc1 (the act of passing '-ffast-math'
to cc1 results in the macro '__FAST_MATH__' being defined). When (for
example) '-fno-reciprocal-math' is passed in addition to '-ffast-math', then
'-freciprocal-math' is no longer passed to cc1 (and the other five listed
above still are passed, along with '-ffast-math' still being passed). It
seems like the intention was that these individual switches were to enable
the individual floating-point transformations (and so the lack of any of
those switches would suppress the relevant transformations), but the
'-ffast-math' "umbrella" is overriding the attempted suppression.
The change proposed at https://reviews.llvm.org/D26708 deals with this issue
just for the reciprocal-transformation case, but it changes the semantics of
the 'fast' IR-level flag so that it no longer implies all the others. With
that proposed approach, the back-end would no longer check an "umbrella"
flag such as 'fast' alongside an individual flag like 'arcp'; it would check
only the individual flag ('arcp'). Any fast-math-related transformation that
doesn't have an individual flag (e.g., re-association currently doesn't)
should eventually have one defined for it, and that individual flag should
then be checked.
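To make the difference concrete, here is a minimal C sketch of the two ways
a transformation could decide whether it is allowed to run. The bit encoding
and the names ('FMF_*', 'allows_reciprocal_umbrella',
'allows_reciprocal_individual') are assumptions made up for this example;
they are not LLVM's actual representation or API.

```c
#include <stdbool.h>

/* Hypothetical bit encoding of the IR-level fast-math-flags. */
enum {
    FMF_NNAN = 1u << 0,
    FMF_NINF = 1u << 1,
    FMF_NSZ  = 1u << 2,
    FMF_ARCP = 1u << 3,
    FMF_FAST = 1u << 4
};

/* Current semantics: 'fast' acts as an umbrella, so the reciprocal
 * transformation is allowed if either 'fast' or 'arcp' is present. */
bool allows_reciprocal_umbrella(unsigned flags) {
    return (flags & (FMF_FAST | FMF_ARCP)) != 0;
}

/* Proposed semantics: only the individual 'arcp' flag is consulted. */
bool allows_reciprocal_individual(unsigned flags) {
    return (flags & FMF_ARCP) != 0;
}
```

With flags modelling '-ffast-math -fno-reciprocal-math' (that is, 'fast',
'nnan', 'ninf' and 'nsz' set, but not 'arcp'), the umbrella check still
returns true, while the individual check returns false as the user intended.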
What do people think?
Thanks,
-Warren