[LLVMdev] Clarifying FMA-related TargetOptions

James Molloy James.Molloy at arm.com
Wed Feb 8 10:44:38 PST 2012


Hi Owen,

Having looked into this recently because Clang was failing PlumHall over it, I can offer an opinion...

I think !NoExcessFPPrecision covers FMA completely. There are indeed some algorithms which give incorrect results when FMA is enabled, for example those that compare the results of floating-point expressions such as a * b + c - d. Even if c == d, the result may still not equal a * b, because "+ c" will have been fused with the multiply whereas "- d" won't.

I think Andy Trick (I think?!) gave a less contrived example a couple of weeks back.

Therefore, it shouldn't be enabled by default. I say that because the C standard defines a pragma to control it - #pragma STDC FP_CONTRACT - which is what Clang was failing with in PlumHall. This pragma marks a region of code in which FMA formation may or may not be performed. If we lack the ability to pass that information through from the frontend to the backend (which we do, at the moment), we should not enable the optimisation by default.
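For reference, a minimal sketch of how the pragma scopes contraction in C99 source. Its standard spelling is #pragma STDC FP_CONTRACT with ON, OFF or DEFAULT. The function names are hypothetical, and whether a particular compiler honours the pragma (rather than ignoring it, possibly with a warning) is an assumption to check against that compiler's documentation.

```c
/* Contraction permitted: the compiler may (but need not) evaluate
   a * b + c as a single fused multiply-add inside this function. */
double mul_add_contract_on(double a, double b, double c) {
    #pragma STDC FP_CONTRACT ON
    return a * b + c;
}

/* Contraction forbidden: a conforming compiler must round the product
   to double before performing the addition. */
double mul_add_contract_off(double a, double b, double c) {
    #pragma STDC FP_CONTRACT OFF
    return a * b + c;
}
```

Inside a function the pragma has to come before any declarations or statements in the enclosing block, and it applies until the end of that block. This per-region granularity is exactly the information that a single global TargetOption cannot carry.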

That said, I think we should enhance the IR to allow this information to be passed from the frontend to the backend. An attribute on fadd, fmul, fdiv, frem and fsub in the same vein as "nsw" would be my suggestion.

Cheers,

James
________________________________________
From: llvmdev-bounces at cs.uiuc.edu [llvmdev-bounces at cs.uiuc.edu] On Behalf Of Owen Anderson [resistor at mac.com]
Sent: 08 February 2012 18:11
To: List
Subject: [LLVMdev] Clarifying FMA-related TargetOptions

Hello everyone,

I'd like to propose the attached patch to form FMA intrinsics aggressively, but in order to do so I need some clarification on the intended semantics for the various FP precision-related TargetOptions.  I've summarized the three relevant ones below:

UnsafeFPMath - Defaults to off, enables "less precise" results than permitted by IEEE754.  Comments specifically reference using hardware FSIN/FCOS on X86.

NoExcessFPPrecision - Defaults to off (i.e. excess precision allowed), enables higher-precision implementations than specified by IEEE754.  Comments reference FMA-like operations, and X87 without rounding all over the place.

LessPreciseFPMADOption - Defaults to off, enables "less precise" FP multiply-add.

My general sense is that aggressive FMA formation is beyond the realm of what UnsafeFPMath allows, but I'm unclear on the relationship between NoExcessFPPrecision and LessPreciseFPMADOption.  My understanding is that fused multiply-add operations are "more precise" (i.e. closer to the numerically true value) than the baseline (which would round between the multiply and the add).  By that reasoning, it seems like it should be covered by !NoExcessFPPrecision.  However, that opens the question of what LessPreciseFPMADOption is intended to cover.  Are there targets on which FMA is actually "less precise" than the baseline sequence?  Or is the comment just poorly worded?

A related concern is that, while NoExcessFPPrecision seems applicable, it is the only one of the above that defaults to the more-relaxed option.  From testing my patch, I can say that it does change the behavior of a number of benchmarks in the LLVM test suite, and for that reason alone seems like it should not be enabled by default.

Anyone more knowledgeable about FP than me have any ideas?

--Owen






