[PATCH] D67517: Create UsersManual section entitled 'Controlling Floating Point Behavior'

Mon Sep 16 10:40:44 PDT 2019

mibintc marked 12 inline comments as done.
mibintc added inline comments.

================
Comment at: clang/docs/UsersManual.rst:1152
+
+  This option implies:
+
----------------
rjmccall wrote:
> cameron.mcinally wrote:
> > mibintc wrote:
> > > rjmccall wrote:
> > > > To be clear, fast math is not *equivalent* to just setting all of these separately, right?
> > > I think in fact ffast-math is the same as selecting all the invidivdual settings, part of the logic checks if all the features are enabled and then also sets the fast-math macro. 
> > Just checked for crtfastmath.o on the link line and they do not seem to be "equivalent", unless I made a mistake.
> > 
> > That said, I think the language "implies" is acceptable here.
> Even if the only difference was the macro, that would be quite important — the macros can result in substantially different behavior from math libraries, IIRC.  So I would appreciate a wording that carries less of a connotation that this is equivalent to this fixed set of other options.  Maybe "Some of the individual optimizations performed by fast-math can be separately enabled or disabled with the following options:"?
> 
> Melanie, are you saying that `__FAST_MATH__` is defined only if *all* of these options are set?  I think that's important to cover as well; perhaps:
> 
>   "``-ffast-math`` causes the macro ``__FAST_MATH__`` to be defined.  Some math libraries recognize this macro and change their behavior.  Using any of the options below to disable any of the individual optimizations in ``-ffast-math`` will cause ``__FAST_MATH__`` to no longer be set."
i made a correction to the text, the code doesn't check of -ffp-contract=fast, it checks if the other bits are set a certain way, then it pushes -ffast-math into the args. 

================
Comment at: clang/docs/UsersManual.rst:1168
+
+   * ``-ffp-constract=fast``
+
----------------
cameron.mcinally wrote:
> andrew.w.kaylor wrote:
> > rjmccall wrote:
> > > Typo
> > Is there not an option corresponding to the "afn" fast math flag?
> That's probably `-fno-math-errno`. I don't know for sure, though.
-ffast-math does enable HasApproxFuncs but there doesn't seem to be a command line option to control it.  FMF are initialized via "FMF.setFast" which sets that bit.  I don't see a gcc option for this eitiher.

================
Comment at: clang/docs/UsersManual.rst:1172
+
+   Select which denormal numbers the code is permitted to require.
+
----------------
rjmccall wrote:
> I know this is existing text, but what does it mean to "require" a denormal number?  The values make it seem like this changes basic behavior, like a sort of rounding mode except for denormals.
In this thread discussing the option--also referred to by Andy-- http://clang-developers.42468.n3.nabble.com/fdenormal-fp-math-td4055684.html , "I believe that flag is taken as a statement about what
the compiler can assume about the FP environment. It is not something we
attempt to ensure at startup."

================
Comment at: clang/docs/UsersManual.rst:1180
+
+   Defaults to ``ieee``.
+
----------------
cameron.mcinally wrote:
> andrew.w.kaylor wrote:
> > cameron.mcinally wrote:
> > > andrew.w.kaylor wrote:
> > > > rjmccall wrote:
> > > > > mibintc wrote:
> > > > > > I'm not certain this is correct description of the Default.  Previously in the documentation the default ws not described.  I looked at the code and in Target.h it is initialized to ieee but possibly other specializations would replace the initial value. 
> > > > > If it's target-specific, that seems relevant to describe.
> > > > It looks like this is currently only supported for ARM targets. I also saw a CUDA-specific flag for this.
> > > > 
> > > > X86 targets won't be able to support "positive-zero" mode and "preserve-sign" (which we typically just call "ftz") would only take effect for SSE instructions (i.e. not x87), but we would like to add "denormals-are-zero" (or "daz") which means that denormal operands will be treated as if they were zero by FP instructions. Also "ftz' and "daz" can be used together so I guess we'll need a "daz+ftz" option. All of this depends on having a backend implementation of course, but we're going to want to do this soon. If we can figure out the command line interface, I can talk to someone about getting the x86 codegen part implemented.
> > > > All of this depends on having a backend implementation of course, but we're going to want to do this soon. If we can figure out the command line interface, I can talk to someone about getting the x86 codegen part implemented.
> > > 
> > > This is currently handled for -ffast-math/-Ofast by including crtfastmath.o on the link line. IIRC, all it does is set FTZ+DAZ. Can we piggyback off that?
> > > 
> > > That's done because we want FTZ+DAZ to be toggled at start-up, so static initializers and such get the benefit. I suppose it would be possible to do in CodeGen, but sounds tricky...
> > I found this discussion useful: http://lists.llvm.org/pipermail/cfe-dev/2017-March/053011.html
> > 
> > It refers to this bug: https://bugs.llvm.org//show_bug.cgi?id=14024
> > 
> > On some platforms FTZ and DAZ will be set if -ffast-math is used with -Ofast or -ffast-math if crtfastmath.o can be found. For the sake of the current optimization update, I suppose we need to document that fact and mention that -fdenormal-fp-math doesn't actually control it (except on ARM?).
> > 
> > I don't think this is the behavior we want though because they're either both on or both left in the system default state. I'll see if I can get a discussion going about getting this implemented in a consistent (or at least consistently sensible) way across architectures and operating systems.
> Good idea about the discussion.
> 
> > I don't think this is the behavior we want though because they're either both on or both left in the system default state.
> 
> I could see the purists wanting to toggle both separately, but I'm not sure it has a lot of practical value. E.g. even if you produced a denorm with no-FTZ, the first use of it will flush it to zero with DAZ. Unless the user is only doing one calculation and then writing to output, that is.
> 
> On the other hand, if you FTZ, there would be no need to respect no-DAZ since you can't create denorms. Well, that's not entirely true if someone really cares about the precision of their input. But, yeah, those users probably won't be throwing out denorms anyway.
> 
> I'm also guessing the performance gains of toggling one alone isn't that great.
> 
> If we were to separate the two though, it might make sense to create two compiler-rt functions to achieve that. Assuming neither wipe out the MXCSR (or equivalent), like crtfastmath.o does.
> 
Reviewers, can you suggest improvement to the current documentation for option ``-fdenormal-fp-math=`` 

================
Comment at: clang/docs/UsersManual.rst:1188
+   With the 'no-strict' option, Clang attempts to match the overflowing behavior
+   of the target's native float-to-int conversion instructions.
+
----------------
rjmccall wrote:
> "According to the C standard, when a floating-point value is converted to
> integer type, the fractional component of the value is discarded, and if the
> resulting integer is not expressible in the destination type, the conversion
> has undefined behavior.  By default, Clang follows this rule strictly and does
> not guarantee any particular result.  In non-strict mode, this behavior
> becomes well-defined, and Clang uses the standard behavior for the target:
> typically, the behavior of the target architecture's float-to-int conversion
> instructions."
> 
> Ideally this would then specify that behavior for various targets.
i think it would be difficult to determine the behavior of the target arch float-to-int conversion instructions for all supported targets. Maybe documenting this detaill should be part of adding a new target to clang/llvm. 

================
Comment at: clang/docs/UsersManual.rst:1199
+   floating point operations cannot generate traps such as divide-by-zero,
+   overflow and underflow. Defaults to ``-ftrapping-math``.
+
----------------
andrew.w.kaylor wrote:
> I think for clang the default is "-fno-trapping-math", isn't it? Whether or not it is currently setting internal flags that way "-fno-trapping-math" is the behavior you get from the LLVM optimizer if you don't generate constrained intrinsics.
-ffast-math sets no-trapping-math, and the default is -fno-fast-math, so the default is trapping-math. However, currently trapping-math is not functional, the option has no effect other than possibly disrupting the action of the ffast-math option.  I mean, if you have ffast-math on the command line and you also have -ftrapping-math, the FAST_MATH macro would not be defined, and ffast-math would not be pushed into the Args. 

================
Comment at: clang/docs/UsersManual.rst:1203
+
+   Form fused floating point operations (e.g. FMAs).  Valid values are:
+
----------------
rjmccall wrote:
> "Specify when the compiler is permitted to form fused floating-point operations, such as fused multiply-add (FMA).  Fused operations are permitted to produce more precise results than performing the same operations separately.
> 
> The C standard permits intermediate floating-point results within an expression to be computed with more precision than their type would normally allow.  This permits operation fusing, and Clang takes advantage of this by default.  This behavior can be controlled with the FP_CONTRACT pragma.
> 
> Valid values for this option are:"
> 
> The second paragraph should probably describe how the pragma interacts with the option.
I made the text changes you suggested, but i think that the interaction with the pragma should be described in the pragma section.  currently there's already information about the option in the pragma section. I'll add a sentence referring the reader to the pragma documentation.

================
Comment at: clang/docs/UsersManual.rst:1213
+
+.. option:: -f[no-]honor-nans
+
----------------
andrew.w.kaylor wrote:
> rjmccall wrote:
> > andrew.w.kaylor wrote:
> > > rjmccall wrote:
> > > > cameron.mcinally wrote:
> > > > > mibintc wrote:
> > > > > > Both honor-nans and honor-infinities appear to have no effect at the moment, previously they were mapped to -menable-no-infs -menable-no-nans
> > > > > I believe others have the same problem too. All probably need to be reassessed. E.g. I think -f[no-]trapping-math isn't hooked up to anything in LLVM.
> > > > > 
> > > > > This may also get confusing when the -fp-model=/etc options are added, since they really do the same thing in different ways.
> > > > > 
> > > > > Is there a plan in place for how to handle this? Will the GCC options alias the Clang options?
> > > > If we accept these options, we should document them, even if just to say that they currently have no effect.  But I suspect that what this means is that they have no effect *separately* and we currently only enable NaN/Inf-unsafe optimizations given a more aggressive option like fast-math.
> > > We have a fast-math flag for this (ninf and nnan). The front end should be connecting these options to those flags.
> > Okay.  Let's keep this patch about documentation, and if we find semantic problems, we can address those in follow-ups.  So if these currently do nothing, we should document them that way.
> Sounds good.
yes, if both the options are on the command line, then -ffinite-math-only is pushed into Args; I'll put this into the next upload. If used individually then these 2 options have no effect.

  // Handle __FINITE_MATH_ONLY__ similarly.
  if (!HonorINFs && !HonorNaNs)
    CmdArgs.push_back("-ffinite-math-only");

================
Comment at: clang/docs/UsersManual.rst:1213
+
+.. option:: -f[no-]honor-nans
+
----------------
mibintc wrote:
> andrew.w.kaylor wrote:
> > rjmccall wrote:
> > > andrew.w.kaylor wrote:
> > > > rjmccall wrote:
> > > > > cameron.mcinally wrote:
> > > > > > mibintc wrote:
> > > > > > > Both honor-nans and honor-infinities appear to have no effect at the moment, previously they were mapped to -menable-no-infs -menable-no-nans
> > > > > > I believe others have the same problem too. All probably need to be reassessed. E.g. I think -f[no-]trapping-math isn't hooked up to anything in LLVM.
> > > > > > 
> > > > > > This may also get confusing when the -fp-model=/etc options are added, since they really do the same thing in different ways.
> > > > > > 
> > > > > > Is there a plan in place for how to handle this? Will the GCC options alias the Clang options?
> > > > > If we accept these options, we should document them, even if just to say that they currently have no effect.  But I suspect that what this means is that they have no effect *separately* and we currently only enable NaN/Inf-unsafe optimizations given a more aggressive option like fast-math.
> > > > We have a fast-math flag for this (ninf and nnan). The front end should be connecting these options to those flags.
> > > Okay.  Let's keep this patch about documentation, and if we find semantic problems, we can address those in follow-ups.  So if these currently do nothing, we should document them that way.
> > Sounds good.
> yes, if both the options are on the command line, then -ffinite-math-only is pushed into Args; I'll put this into the next upload. If used individually then these 2 options have no effect.
> 
>   // Handle __FINITE_MATH_ONLY__ similarly.
>   if (!HonorINFs && !HonorNaNs)
>     CmdArgs.push_back("-ffinite-math-only");
> 
> 
@cameron.mcinally fyi @rjmccall asked me to create a separate floating point options section [this review], then after it's committed, create another patch where we can consider adding fp-model options

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67517/new/

https://reviews.llvm.org/D67517