[cfe-commits] [llvm-commits] [PATCH] Add llvm.fmuladd intrinsic.

Tue Jun 5 14:15:55 PDT 2012

On Jun 5, 2012, at 1:51 PM, Chandler Carruth <chandlerc at google.com> wrote:

> Trying to at least do my homework, as I'm not usually working w/ numerics, I've been reading up.
> 
> I've now read the FP_CONTRACT part of the C11 spec, and see where your statement comes from. I find this restriction... mysterious. I would love to understand why it is important to prevent inlining from exposing contraction opportunities if you can give any examples.

Largely it's for historical reasons; there is a lot of existing code that expects operations spanning expressions to not be contracted.  It's critical to be able to forbid FMA formation for some computations, and some people happen to have written code in this style (probably because they, too, hate pragmas!)
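
As a rough sketch of the distinction (the function names are made up for illustration, and compilers differ in how they treat the pragma; some ignore it and apply their own contraction setting instead), the first function below contains the multiply and add in a single expression, so contraction is allowed when FP_CONTRACT is ON, while the second splits them across two statements, which is the style that existing code relies on to keep the product separately rounded:

```c
/* Sketch only: in_one_expression and across_statements are hypothetical
   names chosen for this example. */
#pragma STDC FP_CONTRACT ON

/* Multiply and add appear in one expression, so an implementation that
   honors the pragma may evaluate this as a single fused multiply-add. */
double in_one_expression(double a, double b, double c) {
    return a * b + c;
}

/* The product is a complete expression of its own, so under the C rules
   it is rounded to double before the addition in the next statement. */
double across_statements(double a, double b, double c) {
    double t = a * b;
    return t + c;
}
```

Whether the two functions actually return different values depends on the target, the compiler, and its flags, so treat this as a description of what the language permits rather than of what any particular build does.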

If I remember right, Jim Thomas at HP was involved in the design of FP_CONTRACT in C99; I'll ask him if he can provide a clearer rationale if you're really interested.

> That said, FP_CONTRACT doesn't apply to C++, and it's quite unlikely to become a serious part of the standard given these (among other) limitations. Curiously, in C++11, it may not be needed to get the benefit of fused multiply-add:

Perversely, a strict reading of C++11 seems (to me) to not allow FMA formation in C++ at all:

	• The values of the floating operands and the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby. 

FMA formation does not increase the precision or range of the result (it may or may not have smaller error, but it is not more precise), so this paragraph doesn't actually license FMA formation.  I can't find anywhere else in the standard that could (though I am *far* less familiar with C++11 than C11, so I may not be looking in the right places).

> The state of C++11 makes my (somewhat crazy) idea of a flag a less attractive representation, as does the C11 contraction specification, but it still doesn't make me enthused about the default representation becoming an intrinsic, and forcing the FE to pre-fuse all of these rather than marking the range of fuse-able operations and allowing the middle end to perform the fusion. I'm actually beginning to like the start/stop intrinsic pair to represent the sequences of ineligible operations.

The trouble I see with this is that you're going to end up generating an enormous number of start/stop intrinsics for some code (one pair for every source expression containing FP, effectively).  I'm not sure how much of a concern that really is, but it feels inelegant to me.

It's worth remembering in all of this that we also want to have (and will have) a third "fast math" mode of operation in which greedy FMA formation is licensed, regardless of the provenance of the fmul and fadd that are fused.  That doesn't need any front-end involvement, however, so it's outside the scope of the changes that Lang has prepared.

- Steve