[LLVMdev] Clarifying FMA-related TargetOptions
hfinkel at anl.gov
Wed Feb 8 10:42:13 PST 2012
On Wed, 2012-02-08 at 10:11 -0800, Owen Anderson wrote:
> Hello everyone,
> I'd like to propose the attached patch to form FMA intrinsics
> aggressively, but in order to do so I need some clarification on the
> intended semantics for the various FP precision-related
> TargetOptions. I've summarized the three relevant ones below:
> UnsafeFPMath - Defaults to off, enables "less precise" results than
> permitted by IEEE754. Comments specifically reference using hardware
> FSIN/FCOS on X86.
> NoExcessFPPrecision - Defaults to off (i.e. excess precision allowed),
> enables higher-precision implementations than specified by IEEE754.
> Comments reference FMA-like operations, and X87 without rounding all
> over the place.
> LessPreciseFPMADOption - Defaults to off, enables "less precise" FP
> multiply-add.
> My general sense is that aggressive FMA formation is beyond the realm
> of what UnsafeFPMath allows, but I'm unclear on the relationship
> between NoExcessFPPrecision and LessPreciseFPMADOption. My
> understanding is that fused multiply-add operations are "more
> precise" (i.e. closer to the numerically true value) than the baseline
> (which would round between the multiply and the add). By that
> reasoning, it seems like it should be covered by !NoExcessFPPrecision.
I agree, and this is what the PPC backend does.
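To make the "more precise" point concrete, here is a minimal sketch of the two evaluation orders. It does not call a hardware FMA; it emulates the single rounding with exact rational arithmetic, and the function names (unfused_mul_add, fused_mul_add) and input values are my own, chosen purely for illustration:

```python
from fractions import Fraction


def unfused_mul_add(a: float, b: float, c: float) -> float:
    """Baseline sequence: round after the multiply, then again after the add."""
    return a * b + c


def fused_mul_add(a: float, b: float, c: float) -> float:
    """Emulated FMA: form a*b + c exactly, then round once to double.

    Every finite double converts exactly to a Fraction, so the sum below
    carries no rounding error until the final float() conversion.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Hypothetical inputs: the exact product is 1 - 2**-54, which lies halfway
# between two adjacent doubles and rounds (ties-to-even) up to 1.0.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

print("unfused:", unfused_mul_add(a, b, c))
print("fused:  ", fused_mul_add(a, b, c))
```

With these inputs the unfused sequence loses the low-order term of the product when it rounds to 1.0, so the subsequent add cancels to zero, while the single-rounding result keeps the true value of -2**-54. That is the sense in which the fused form is closer to the numerically exact answer, and also why benchmark outputs can change once FMAs are formed.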
> However, that opens the question of what LessPreciseFPMADOption is
> intended to cover. Are there targets on which FMA is actually "less
> precise" than the baseline sequence? Or is the comment just poorly
> A related concern is that, while NoExcessFPPrecision seems applicable,
> it is the only one of the above that defaults to the more-relaxed
> option. From testing my patch, I can say that it does change the
> behavior of a number of benchmarks in the LLVM test suite, and for
> that reason alone seems like it should not be enabled by default.
This does not surprise me; however, care is required here. First, there
has been a recent thread on this topic, and I specifically
recommend that you read Stephen Canon's remarks:
In my experience, users of numerical codes expect that the compiler will
use FMA instructions where it can, unless specifically asked to avoid
doing so by the user. Even though this can sometimes produce a different
result (*almost* always a better one), the performance gain is too large
to be ignored by default. I highly recommend that we continue to enable
FMA instruction-generation by default (as is the current practice, not
only here, but in most vendor compilers with which I am familiar). We
should also implement the FP_CONTRACT pragma, but that is another
> Anyone more knowledgeable about FP than me have any ideas?
Leadership Computing Facility
Argonne National Laboratory