[LLVMdev] Clarifying FMA-related TargetOptions
David A. Greene
dag at cray.com
Wed Feb 8 12:38:18 PST 2012
Owen Anderson <resistor at mac.com> writes:
> A related concern is that, while NoExcessFPPrecision seems applicable,
> it is the only one of the above that defaults to the more-relaxed
> option. From testing my patch, I can say that it does change the
> behavior of a number of benchmarks in the LLVM test suite, and for
> that reason alone seems like it should not be enabled by default.
>
> Anyone more knowledgeable about FP than me have any ideas?
FWIW, we've found that having a switch to turn off FMA explicitly is
helpful for debugging. We don't expose the switch to users but it has
saved us a few times when trying to track down numerical differences.
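
A minimal sketch of the kind of numerical difference in question, assuming
IEEE-754 doubles. The function names are made up for illustration and are
not taken from any compiler; the volatile temporary is only there to stop
the compiler from contracting the separate multiply and add into an FMA on
its own.

```cpp
#include <cmath>

// Multiply, round the product to double, then add: two roundings.
// The volatile temporary forces the product to be rounded and stored,
// so the compiler cannot fuse the two operations.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Fused multiply-add: a*b is kept exact internally and only the
// final sum is rounded, so there is a single rounding.
double fused(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-30 and c = -(1 + 2^-29), the exact product a*a is
1 + 2^-29 + 2^-60. The 2^-60 term is lost when the product is rounded to
double, so mul_then_add(a, a, c) returns 0, while fused(a, a, c) returns
2^-60. Neither answer is a bug, but they differ, which is why a switch
that rules FMA in or out narrows the search quickly.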
Our FP switches are not so precisely named. We basically have fp0, fp1,
fp2 and fp3, analogous to O0, O1, O2, O3. The higher the number, the
weaker the guarantee that your results will match what scalar code (or
code w/o FMA) would give you. The tradeoff is faster execution, of
course. We don't say anything about precision directly.
-Dave