[cfe-commits] [llvm-commits] [PATCH] Add llvm.fmuladd intrinsic.
rjmccall at apple.com
Tue Jun 5 14:29:33 PDT 2012
On Jun 5, 2012, at 1:51 PM, Chandler Carruth wrote:
> On Tue, Jun 5, 2012 at 1:18 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Tue, Jun 5, 2012 at 1:15 PM, Stephen Canon <scanon at apple.com> wrote:
> On Jun 5, 2012, at 1:08 PM, Chandler Carruth <chandlerc at google.com> wrote:
>> Hey Lang,
>> Sorry to jump in late, but I was catching up on email and finally read through this thread. This is the exchange that caught my interest:
>> On Fri, Jun 1, 2012 at 4:50 AM, Stephen Canon <scanon at apple.com> wrote:
>> On May 31, 2012, at 10:40 PM, John McCall <rjmccall at apple.com> wrote:
>> > On May 31, 2012, at 7:22 PM, Lang Hames wrote:
>> >> Thanks for the suggestion Matthieu. I spoke to Doug and he recommended using attributes rather than a FunctionDecl bit to represent the fp_contract state.
>> > Hmm. I had suggested a bit on FunctionDecl on the assumption that this would often be controlled globally, maybe by using a flag to control the default or by activating a #pragma before including all the headers. Actually, I could even imagine a target (maybe a GPU target?) opting in to this behavior by default. If we're going to use an Attr, we need to make sure it doesn't get added unless the current #pragma state is different from the global default; we really don't want to be allocating an attribute for every function definition in the translation unit.
>> We want FP_CONTRACT ON to be the default for all targets. It's also worth noting that it's critical that we support setting the pragma to OFF, but in practice this will be exceedingly rare (almost certainly less than 1% of sources, and probably far less than that).
>> Based on this comment, I'm really not keen on the current representation, but maybe I've misunderstood it, so I'll ask questions first:
>> The 'fmuladd' intrinsic is used to whitelist specific operations for fused multiply+add handling, correct?
>> If so, and if Stephen's stance is correct (I certainly agree with it!) that this should be allowed for the vast majority of code, that means that almost every fmul and fadd in the current IR should be a candidate for fusing?
> Only those that originate from a common source-language *expression*. Your examples should not be fused because the multiply and add are in two separate expressions (which is why we need FE involvement; that information isn't available later).
> Ok, now I'm extra confused. Thanks for replying, hopefully you can help me understand better.
> Why would it not be OK to fuse multiplies and adds that occur in two source-language expressions? I have some vague memory of Fortran having lots of special rules about within-expression semantics versus semantics across expressions, but C++ has no such constraints to my knowledge, nor would it want them.
> Having these types of artificial source-representation restrictions on semantics in C++ undermines specific language constructs like overloaded operators and transparent "wrapper" classes.
> Trying to at least do my homework, as I'm not usually working w/ numerics, I've been reading up.
> I've now read the FP_CONTRACT part of the C11 spec, and see where your statement comes from. I find this restriction... mysterious. I would love to understand why it is important to prevent inlining from exposing contraction opportunities if you can give any examples.
> That said, FP_CONTRACT doesn't apply to C++, and it's quite unlikely to become a serious part of the standard given these (among other) limitations. Curiously, in C++11, it may not be needed to get the benefit of fused multiply-add:
> [expr] p11 seems to indicate that in C++, we are almost always allowed to use increased precision to represent operations. The only exception we can find in the C++ standard (and thanks to Richard for helping me crawl through this part) is this:
I think that, if you admit you don't understand why any of the restrictions are there, it's a bit disingenuous to argue that they can't possibly be intended to restrict things that you'd rather not restrict. :)
[expr]p11 gives us leeway when working with "operands" and "results". It's not obvious that that gives us any cover to extend the precision of a value that has round-tripped through an actual formal object, such as a parameter (but not necessarily a return value).
I agree that forming FMAs in the frontend is a very conservative way of taking advantage of this. Given that we want/have an FMA intrinsic anyway, though, it doesn't seem like an actively damaging sort of conservatism, and it does achieve progress.
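To make concrete why contraction is an observable change and not just a speedup, here is a minimal C++ sketch. It is an illustration only, not code from the patch: the helper names (mul_then_add, fused_mul_add, contraction_sensitive_inputs) are hypothetical, and the sketch assumes IEEE-754 binary64 doubles with round-to-nearest-even and no -ffast-math. The volatile temporary stands in for a product that has been stored to an actual object, so it is rounded before the add; std::fma stands in for what a fused multiply-add does with a single rounding.

```cpp
#include <cmath>

// Hypothetical helper: multiply and add as two separately rounded steps.
// The volatile temporary forces the product to be rounded to double and
// keeps the compiler from contracting the two operations into one FMA.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;  // first rounding happens here
    return product + c;               // second rounding happens here
}

// Hypothetical helper: the explicitly fused form. std::fma computes
// a * b + c with a single rounding at the end.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

struct Inputs {
    double a, b, c;
};

// Inputs picked so the exact product (1 + 2^-27) * (1 - 2^-27) = 1 - 2^-54
// is not representable as a double; it lies halfway between two doubles
// and rounds to 1.0 under round-to-nearest-even (assumed here).
Inputs contraction_sensitive_inputs() {
    const double eps = std::ldexp(1.0, -27);
    return {1.0 + eps, 1.0 - eps, -1.0};
}
```

With these inputs, the two-step version loses the low bits of the product before the add and yields zero, while the fused version keeps them and yields -2^-54. Whether the optimizer may turn the first form into the second on its own is the question the FP_CONTRACT state answers, and it is the reason the frontend has to tell the backend which multiply/add pairs are eligible.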