[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Wed Nov 16 13:14:54 PST 2016


----- Original Message -----

> From: "Mehdi Amini" <mehdi.amini at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvm-dev at lists.llvm.org, "Warren Ristow" <warren.ristow at sony.com>
> Sent: Wednesday, November 16, 2016 11:03:48 AM
> Subject: Re: [llvm-dev] RFC: Consider changing the semantics of
> 'fast' flag implying all fast-math-flags

> > On Nov 15, 2016, at 11:59 PM, Hal Finkel < hfinkel at anl.gov > wrote:
> 

> > ----- Original Message -----
> > > From: "Mehdi Amini via llvm-dev" < llvm-dev at lists.llvm.org >
> > > To: "Warren Ristow" < warren.ristow at sony.com >
> > > Cc: llvm-dev at lists.llvm.org
> > > Sent: Tuesday, November 15, 2016 11:10:48 PM
> > > Subject: Re: [llvm-dev] RFC: Consider changing the semantics of
> > > 'fast' flag implying all fast-math-flags

> > > Hi,
> > 
> 

> > > > On Nov 15, 2016, at 5:15 PM, Ristow, Warren via llvm-dev <
> > > > llvm-dev at lists.llvm.org > wrote:
> > > 
> > 
> 

> > > > Hi all,
> > > 
> > 
> 

> > > > This is about https://reviews.llvm.org/D26708
> > > 
> > 
> 

> > > > Currently when the command-line switch '-ffast-math' is specified, the
> > > > IR-level fast-math-flag 'fast' gets attached to appropriate FP math
> > > > instructions. That flag acts as an "umbrella" to implicitly turn on
> > > > all the other fast-math-flags ('nnan', 'ninf', 'nsz' and 'arcp'):
> > > 
> > 
> 

> > > > http://llvm.org/docs/LangRef.html#fast-math-flags
> > > 
> > 
> 

> > > > This approach has the shortcoming that when there is a desire to
> > > > disable one of these fast-math-flags, if the 'fast' flag remains, it
> > > > implicitly re-enables the one being disabled. For example, compiling
> > > > this test-case:
> > > 
> > 
> 

> > > > extern void use(float x, float y);
> > > > void test(float a, float b, float c) {
> > > >   float q1 = a / c;
> > > >   float q2 = b / c;
> > > >   use(q1, q2);
> > > > }
> > > 
> > 
> 

> > > > at '-O2 -ffast-math' does a reciprocal-transformation, so only one
> > > > division is done (as desired with fast-math). Compiling it with:
> > > 
> > 
> 

> > > > -O2 -ffast-math -fno-reciprocal-math
> > > 
> > 
> 

> > > > should disable the reciprocal transformations (the flag 'arcp'), but
> > > > leave all the other fast-math transformations enabled. The current
> > > > implementation doesn't do that, since the 'fast' IR-level flag still
> > > > gets set.
> > > 
> > 
> 

> > > > Motivation of this discussion:
> > > > https://llvm.org/bugs/show_bug.cgi?id=27372#c2
> > > 
> > 
> 

> > > > As an aside, when '-ffast-math' is specified on the command-line, the
> > > > following six switches are all passed to cc1:
> > > 
> > 
> 

> > > > -menable-no-infs
> > > > -menable-no-nans
> > > > -fno-signed-zeros
> > > > -freciprocal-math
> > > > -fno-trapping-math
> > > > -ffp-contract=fast
> > > 
> > 
> 

> > > > and '-ffast-math' itself is also passed to cc1 (the act of passing
> > > > '-ffast-math' to cc1 results in the macro '__FAST_MATH__' being
> > > > defined). When (for example) '-fno-reciprocal-math' is passed in
> > > > addition to '-ffast-math', then '-freciprocal-math' is no longer
> > > > passed to cc1 (and the other five listed above still are passed,
> > > > along with '-ffast-math' still being passed). It seems like the
> > > > intention was that these individual switches were to enable the
> > > > individual floating-point transformations (and so the lack of any of
> > > > those switches would suppress the relevant transformations), but the
> > > > '-ffast-math' "umbrella" is overriding the attempted suppression.
> > > 
> > 
> 
> > > Sure, this looks like a bug: disabling an individual fast-math flag
> > > on the command line should be possible and should override a prior
> > > -ffast-math (usually the last one on the command line “wins”).
> > 
> 

> > > The cfe-dev mailing list would be more appropriate for discussing
> > > the behavior of clang command line flags, though.
> > 
> 

> > > > The change proposed at https://reviews.llvm.org/D26708 deals with
> > > > this issue
> > > 
> > 
> 
> > > This patch seems to modify only LLVM; it does not deal at all with
> > > the issue you describe above.
> > > I don’t see why the issue with the clang command line flags needs
> > > to be dealt with at the LLVM level.
> > 
> 

> > > > just for the reciprocal-transformation case, but it changes the
> > > > semantics of the 'fast' IR-level flag so that it no longer implies
> > > > all the others.
> > > 
> > 
> 
> > > The starting point for any change is:
> > > http://llvm.org/docs/LangRef.html#fast-math-flags
> > 
> 
> > > You would need to write a new definition for what “fast” would
> > > mean.
> > 
> 

> > > However, I don’t see that anything needs to be changed here to
> > > address the use-case you want to fix.
> > 
> 
> > I suspect that we want to start by getting rid of 'fast' on the IR
> > level and replacing it with individual flags for the various
> > optimization classes. Beyond the existing flags, do we need anything
> > other than flags allowing reassociation and libm optimizations? Then
> > we can readjust the Clang flags in a straightforward way.
> 

> Individual flags for various optimization classes make sense only if
> you don’t end up with a lot of very specialized new flags.
> If a single “reassociate” flag could be enough to complement the
> existing flags and replace “fast”, that would be great.
> But some auditing of all the users of “fast” would be needed first.
> For instance, is “X * log2(0.5*Y) = X*log2(Y) - X” covered by
> “reassociation”? That seems a bit more than what people think about
> with reassociation at first.
I can only think of two flags we might need: One for reassociation and one for libm-function optimization. Your example with log2 might require both. 

-Hal 

> Mehdi

> > > > With that proposed approach, rather than an "umbrella" flag such as
> > > > 'fast' being checked in the back-end (along with an individual flag
> > > > like 'arcp'), just checking the individual flag ('arcp') would be
> > > > done.
> > > 
> > 
> 
> > > There is already no need to check both the “fast” and arcp flags: if
> > > a transformation is about reciprocals, then you only need to check
> > > arcp (fast implies arcp, so checking for fast would be redundant).
> > 
> 
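
The "fast implies arcp" point can be pictured with a small bitmask sketch in C. This is only an illustration under stated assumptions: the names and bit values below are hypothetical and are not LLVM's actual encoding or API.

```c
#include <stdbool.h>

/* Hypothetical flag bits. The names mirror the LangRef flags, but the
 * values and layout are illustrative only. */
enum {
    FMF_NNAN   = 1u << 0,
    FMF_NINF   = 1u << 1,
    FMF_NSZ    = 1u << 2,
    FMF_ARCP   = 1u << 3,
    FMF_UNSAFE = 1u << 4  /* transforms with no individual flag of their own */
};

/* "Umbrella" semantics: setting 'fast' also sets every individual flag. */
unsigned fmf_set_fast_umbrella(unsigned flags) {
    return flags | FMF_UNSAFE | FMF_NNAN | FMF_NINF | FMF_NSZ | FMF_ARCP;
}

/* Clear one individual flag (assumes the representation allows it). */
unsigned fmf_clear(unsigned flags, unsigned bit) {
    return flags & ~bit;
}

/* A reciprocal transform only needs to consult arcp; because 'fast'
 * sets arcp, a separate check for 'fast' would be redundant. */
bool fmf_allows_reciprocal(unsigned flags) {
    return (flags & FMF_ARCP) != 0;
}
```

In this sketch, fmf_allows_reciprocal(fmf_set_fast_umbrella(0)) is true, and it becomes false once FMF_ARCP is cleared. Whether an IR instruction should be able to carry "everything except arcp" is exactly the question under discussion in this thread.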

> > > Be careful also that the fast-math flags are mainly an IR-level
> > > definition; the backend only inherited these per-instruction flags
> > > very recently. It has not been entirely converted to use them, and
> > > it still uses a global flag in some places.
> > 
> 
> > > The line you’re touching in your patch for instance is about this
> > > legacy:
> > 
> 

> > > if (!UnsafeMath && !Flags->hasAllowReciprocal())
> > 
> 

> > > The first flag is the global “fast-math” mode on the backend, which
> > > is not as fine-grained as the per-instruction model.
> > > The second flag is the “per instruction” flag, which is the model
> > > we aim at.
> > 
> 

> > > We should get rid of the “global” UnsafeMath in the backend, but
> > > that does not require any change to the IR or the individual
> > > fast-math flags.
> > 
> 

> > > > Any fast-math-related transformation that doesn't have an individual
> > > > flag (e.g., re-association currently doesn't), should eventually have
> > > > an individual flag defined for it, and then that individual flag
> > > > should be checked.
> > > 
> > 
> 

> > > > What do people think?
> > > 
> > 
> 
> > > I think these are valuable problems to solve, but you should tackle
> > > them piece by piece:
> > 
> 

> > > 1) the clang part of overriding the individual FMF and emitting the
> > > right IR is the first thing to fix.
> > > 2) the backend is still using the global UnsafeFPMath and it should
> > > be killed.
> > 
> 

> > > Hope this makes sense.
> > 
> 

> > > —
> > 
> 
> > > Mehdi
> > 
> 

-- 

Hal Finkel 
Lead, Compiler Technology and Programming Languages 
Leadership Computing Facility 
Argonne National Laboratory 