[llvm-dev] [cfe-dev] RFC: change -fp-contract=off to actually disable FMAs

Scott Manley via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 12 09:32:57 PDT 2019


> However, fp-contract is not a knob to control whether or not
> abstract-machine operations generate a single arithmetic instruction

I think that makes sense, but the end result is the same. Wouldn't you
agree that -fp-contract=off still contracts floating point expressions with
the initial example I posted? That is the core of what I'm trying to
resolve here.

I still have some confusion about what FMAD is supposed to be. Is FMAD
actually MAD? Or is it something else? I am fine with leaving it alone if
FMAD is not actually contracting floating point operations.

On Fri, Jul 12, 2019 at 10:54 AM Stephen Canon <scanon at apple.com> wrote:

> Echoing what everyone else has said, keying on the word “fused” is a red
> herring here.
>
> fp-contract refers to behavior governed by the STDC FP_CONTRACT pragma.
> “Contraction” has a formal definition in the C standard:
>
> > A floating expression may be contracted, that is, evaluated as though it
> were a single operation, thereby omitting rounding errors implied by the
> source code and the expression evaluation method. The FP_CONTRACT pragma in
> <math.h> provides a way to disallow contracted expressions.
>
> Note that this definition is *purely* in terms of the rounding of
> arithmetic operations performed by the abstract machine; there is no notion
> of instructions generated. Formation of fused multiply-add instructions is
> one specific form of fusion licensed by this pragma, which happens to be
> the main one of interest from the standpoint of compiler performance
> optimization for FMA-based architectures.
>
> There’s some imprecision in the documentation caused by a mismatch between
> what’s interesting for compiler writers (where rounding changes due to FMA
> formation are allowed) and the abstract specification. That should be
> cleaned up. However, fp-contract is not a knob to control whether or not
> abstract-machine operations generate a single arithmetic instruction—it
> definitely does not, and should not, enable or disable MAD formation.
>
> – Steve
>
> > On Jul 10, 2019, at 6:14 PM, Scott Manley via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
> >
> > >  I think you have a different definition of fused then. Fused is a
> description of how the operation is computed/rounded, not an instruction
> count.
> >
> > "Only fuse FP ops when the result won't be affected" is what the
> existing comment says. So it can't be both a fused op and not a fused op if
> it's only meant to imply a difference in rounding. I'm just re-using the
> existing wording, and I agree it could be cleaned up if that's the intent
> of the -fp-contract option -- which is why I was asking for context.
> >
> > >  For FMA, I think your example IR is correctly handled. The fast
> instruction flag should override the global FP option you’re providing. For
> the issue you are describing, this is more of a question of whether clang
> should be emitting the fast flag or not.
> >
> > I disagree. How does clang know what would ultimately form an FMA?  It
> would have to blanket remove 'fast' from all fadds.
> >
> >>
> >> On Wed, Jul 10, 2019 at 4:16 PM Matt Arsenault <arsenm2 at gmail.com>
> wrote:
> >>
> >>> On Jul 10, 2019, at 16:56, Scott Manley <rscottmanley at gmail.com>
> wrote:
> >>>
> >>> At any rate, I was only offering an additional reason. Personally I
> think it's strange for an option to say "this will never fuse ops" and then
> under the covers will fuse ops, regardless of how FMAD is defined. However,
> my primary concern is for FMAs. They have both numeric and performance
> implications and I do not think it's unreasonable that off means off.
> >>
> >> I think you have a different definition of fused then. Fused is a
> description of how the operation is computed/rounded, not an instruction
> count. The F in FMAD is not fused (I know this naming scheme is not great.
> Every other FP node besides FMA has an F prefix)
> >>
> >> For FMA, I think your example IR is correctly handled. The fast
> instruction flag should override the global FP option you’re providing. For
> the issue you are describing, this is more of a question of whether clang
> should be emitting the fast flag or not.
>
>