<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div>On Jun 5, 2012, at 3:35 PM, John McCall wrote:</div><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div>On Jun 5, 2012, at 3:04 PM, Chandler Carruth wrote:</div><blockquote type="cite"><div class="gmail_quote">On Tue, Jun 5, 2012 at 2:58 PM, Stephen Canon <span dir="ltr"><<a href="mailto:scanon@apple.com" target="_blank">scanon@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="HOEnZb"><div class="h5">On Jun 5, 2012, at 2:45 PM, John McCall <<a href="mailto:rjmccall@apple.com">rjmccall@apple.com</a>> wrote:<br>

<br>

> On Jun 5, 2012, at 2:15 PM, Stephen Canon wrote:<br>

><br>

>> On Jun 5, 2012, at 1:51 PM, Chandler Carruth <<a href="mailto:chandlerc@google.com">chandlerc@google.com</a>> wrote:<br>

>><br>

>>> That said, FP_CONTRACT doesn't apply to C++, and it's quite unlikely to become a serious part of the standard given these (among other) limitations. Curiously, in C++11, it may not be needed to get the benefit of fused multiply-add:<br>


>><br>

>> Perversely, a strict reading of C++11 seems (to me) to not allow FMA formation in C++ at all:<br>

>><br>

>>      • The values of the floating operands and the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby.<br>

>><br>

>> FMA formation does not increase the precision or range of the result (it may or may not have smaller error, but it is not more precise), so this paragraph doesn't actually license FMA formation.  I can't find anywhere else in the standard that could (though I am *far* less familiar with C++11 than C11, so I may not be looking in the right places).<br>


><br>

> Correct me if I'm wrong, but I thought that an FMA could be formalized as representing the result of the multiply with greater precision than the operation's type actually provides, and then using that as the operand of the addition.  It's understood that that can change the result of the addition in ways that aren't just "more precise".  Similarly, performing 'float' operations using x87 long doubles can change the result of the operation, but I'm pretty sure that the committees explicitly had hardware limitations like that in mind when they added this language.<br>


<br>

</div></div>That's an interesting point.  I'm inclined to agree with this interpretation (there are some minor details about whether or not 0*INF + NAN raises the invalid flag, but let's agree to ignore that).<br>


<br>

I'm not familiar enough with the language used in the C++ spec to know whether this makes C++ numerics equivalent to STDC FP_CONTRACT on, or equivalent to "allow greedy FMA formation".  Anyone?<br></blockquote>

<div><br></div><div>If you agree w/ John's interpretation, and don't consider the flag case you mention, AFAICT, this allows greedy FMA formation, unless the intermediate values are round-tripped through a cast construct such as I described.</div></div></blockquote><div><br></div>I'm still not sure why you think this restriction *only* happens when round-tripping through casts, rather than through anything which is not an operand or result, e.g. an object.</div><div><br></div><div>Remember that the builtin operators are privileged in C++; they are not semantically like calls, even in the cases where they're selected by overload resolution.</div><div><br></div><div>I agree that my interpretation implies that a type which merely wraps a double nonetheless forces stricter behavior.  I also agree that this sucks.</div></div></blockquote><br></div><div>To continue this thought, the most straightforward way to represent this in IR would be to (1) add a "contractable" bit to the LLVM operation (possibly as metadata) and (2) provide an explicit "value barrier" instruction (a unary operator preventing contraction "across" it).  We would introduce the barrier in the appropriate circumstances, i.e. an explicit cast, a load from a variable, or whatever else we conclude requires these semantics.  It would then be straightforward to produce FMAs from this, as well as just generally avoiding rounding when doing sequences of illegal FP ops.  -ffast-math would imply never inserting the barriers.</div><div><br></div><div>The disadvantages I see are:</div><div>  - there might be lots of peepholes and isel patterns that would need to be taught to look through a value barrier</div><div>  - the polarity of barriers is wrong, because code that lacks barriers is implicitly opting in to things, so e.g. LTO could pick a weak_odr function from an old tunit that lacks a barrier which a fresh compile would insist on.</div><div><br></div><div>John.</div></body></html>