[llvm-commits] [cfe-commits] [PATCH] Add llvm.fmuladd intrinsic.

Lang Hames lhames at gmail.com
Tue Jun 5 22:34:56 PDT 2012


Thanks all for the feedback.

My big take-away from discussing this with Chandler is that I didn't
explain the motivation for the existing design well. I'll keep that in mind
in future. The reason I like fmuladd as a way to get started on FP_CONTRACT
support is simply because it's lightweight, and captures most of the cases
that we care about. The heavy lifting for proper FP_CONTRACT support, such
as there is, will be teaching the parser how to properly deal with
FP_CONTRACT pragmas applied to subexpressions (this is probably a simple
task for people who are familiar with clang, but it is new territory for
me). Since fmuladd itself is so trivial, it will be easy to replace with a
more comprehensive system for tracking fusing opportunities if/when we
decide it's called for.

- Lang.
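
For reference, here is a minimal C++ sketch of why contraction is observable at all. A fused multiply-add rounds once, while a separate multiply and add round twice, so the two can disagree. The function names are hypothetical, the inputs assume IEEE-754 binary64 doubles, and the volatile temporary is only an assumed way of keeping the compiler from fusing the "separate" version on its own. It illustrates the numerics, not what the patch does.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Multiply, round the product to double, then add (two roundings).
// The volatile store is an assumption: it forces the product to be
// rounded and stored, so the compiler cannot contract the expression.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Fused form: the product is kept exact and only the sum is rounded.
double fused(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Hypothetical demo helper. With a = 1 + 2^-27 the exact square is
// 1 + 2^-26 + 2^-54; the last term does not fit in a double, so the
// separately rounded product loses it and the fused form keeps it.
void show_contraction_difference() {
    const double a = 1.0 + std::ldexp(1.0, -27);
    const double c = -(1.0 + std::ldexp(1.0, -26));
    const double separate = mul_then_add(a, a, c);   // 0
    const double contracted = fused(a, a, c);        // 2^-54
    assert(separate != contracted);
    std::printf("separate   = %g\n", separate);
    std::printf("contracted = %g\n", contracted);
}
```

Calling show_contraction_difference() from main prints the two values. Whether a compiler may pick the second form for a plain a * b + c is exactly the question FP_CONTRACT answers.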

On Tue, Jun 5, 2012 at 9:24 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> On Tue, 05 Jun 2012 20:12:00 -0700
> John McCall <rjmccall at apple.com> wrote:
>
> > On Jun 5, 2012, at 3:35 PM, John McCall wrote:
> > > On Jun 5, 2012, at 3:04 PM, Chandler Carruth wrote:
> > >> On Tue, Jun 5, 2012 at 2:58 PM, Stephen Canon <scanon at apple.com>
> > >> wrote:
> > >> On Jun 5, 2012, at 2:45 PM, John McCall <rjmccall at apple.com> wrote:
> > >>
> > >> > On Jun 5, 2012, at 2:15 PM, Stephen Canon wrote:
> > >> >
> > >> >> On Jun 5, 2012, at 1:51 PM, Chandler Carruth
> > >> >> <chandlerc at google.com> wrote:
> > >> >>
> > >> >>> That said, FP_CONTRACT doesn't apply to C++, and it's quite
> > >> >>> unlikely to become a serious part of the standard given these
> > >> >>> (among other) limitations. Curiously, in C++11, it may not be
> > >> >>> needed to get the benefit of fused multiply-add:
> > >> >>
> > >> >> Perversely, a strict reading of C++11 seems (to me) to not
> > >> >> allow FMA formation in C++ at all:
> > >> >>
> > >> >>      • The values of the floating operands and the results of
> > >> >> floating expressions may be represented in greater precision
> > >> >> and range than that required by the type; the types are not
> > >> >> changed thereby.
> > >> >>
> > >> >> FMA formation does not increase the precision or range of the
> > >> >> result (it may or may not have smaller error, but it is not
> > >> >> more precise), so this paragraph doesn't actually license FMA
> > >> >> formation.  I can't find anywhere else in the standard that
> > >> >> could (though I am *far* less familiar with C++11 than C11, so
> > >> >> I may not be looking in the right places).
> > >> >
> > >> > Correct me if I'm wrong, but I thought that an FMA could be
> > >> > formalized as representing the result of the multiply with
> > >> > greater precision than the operation's type actually provides,
> > >> > and then using that as the operand of the addition.  It's
> > >> > understood that that can change the result of the addition in
> > >> > ways that aren't just "more precise".  Similarly, performing
> > >> > 'float' operations using x87 long doubles can change the result
> > >> > of the operation, but I'm pretty sure that the committees
> > >> > explicitly had hardware limitations like that in mind when they
> > >> > added this language.
> > >>
> > >> That's an interesting point.  I'm inclined to agree with this
> > >> interpretation (there are some minor details about whether or not
> > >> 0*INF + NAN raises the invalid flag, but let's agree to ignore
> > >> that).
> > >>
> > >> I'm not familiar enough with the language used in the C++ spec to
> > >> know whether this makes C++ numerics equivalent to STDC
> > >> FP_CONTRACT on, or equivalent to "allow greedy FMA formation".
> > >> Anyone?
> > >>
> > >> If you agree w/ John's interpretation, and don't consider the flag
> > >> case you mention, AFAICT, this allows greedy FMA formation, unless
> > >> the intermediate values are round-tripped through a cast construct
> > >> such as I described.
> > >
> > > I'm still not sure why you think this restriction *only* happens
> > > when round-tripping through casts, rather than through anything
> > > which is not an operand or result, e.g. an object.
> > >
> > > Remember that the builtin operators are privileged in C++ — they
> > > are not semantically like calls, even in the cases where they're
> > > selected by overload resolution.
> > >
> > > I agree that my interpretation implies that a type which merely
> > > wraps a double nonetheless forces stricter behavior.  I also agree
> > > that this sucks.
> >
> > To continue this thought, the most straightforward way to represent
> > this in IR would be to (1) add a "contractable" bit to the LLVM
> > operation (possibly as metadata) and (2) provide an explicit "value
> > barrier" instruction (a unary operator preventing contraction
> > "across" it).  We would introduce the barrier in the appropriate
> > circumstances, i.e. an explicit cast, a load from a variable, or
> > whatever else we conclude requires these semantics.  It would then be
> > straightforward to produce FMAs from this, as well as just generally
> > avoiding rounding when doing sequences of illegal FP ops.
> > -ffast-math would imply never inserting the barriers.
> >
> > The disadvantages I see are:
> >   - there might be lots of peepholes and isel patterns that would
> > need to be taught to look through a value barrier
> >   - the polarity of barriers is wrong, because code that lacks
> > barriers is implicitly opting in to things, so e.g. LTO could pick a
> > weak_odr function from an old tunit that lacks a barrier which a
> > fresh compile would insist on.
>
> I don't like the barrier approach because it implies that the FE must
> serialize each C expression as a distinct group of LLVM instructions.
> While it may be true that this currently happens in practice, I don't
> think we want to force it to be this way.
>
> Given the unique nature of this restriction, I think that the best way
> to do this is to model it directly: add metadata, or some instruction
> attribute, to each floating-point instruction indicating its
> 'contraction domain' (some module-unique integer will work). Only
> instructions with the same contraction domain can be contracted.
> Instructions without a contraction domain cannot be contracted. I
> realize that this is verbose, but realistically, the only way to tell
> LLVM what instructions are part of which C-language expression is to
> tag each relevant instruction.
>
>  -Hal
>
> >
> > John.
>
>
> --
> Hal Finkel
> Postdoctoral Appointee
> Leadership Computing Facility
> Argonne National Laboratory
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>
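
The contraction-domain rule Hal describes above can be modeled in a few lines of C++. This is only a sketch of the rule as stated in the quoted text: the Opcode and FPInst types, the field names, and can_contract are all hypothetical and do not correspond to any LLVM data structure or API.

```cpp
#include <cassert>

// Hypothetical stand-in for a floating-point instruction. This is a toy
// model of the proposal, not an LLVM class.
enum class Opcode { FMul, FAdd };

struct FPInst {
    Opcode op;
    bool has_domain;   // false: instruction carries no contraction domain
    unsigned domain;   // module-unique id; meaningful only if has_domain
};

// A multiply feeding an add may be fused only when both instructions
// carry the same contraction domain. Untagged instructions are never
// contracted, matching the rule in the quoted proposal.
bool can_contract(const FPInst& mul, const FPInst& add) {
    if (mul.op != Opcode::FMul || add.op != Opcode::FAdd) return false;
    if (!mul.has_domain || !add.has_domain) return false;
    return mul.domain == add.domain;
}

// Usage example: instructions from one source expression share domain 1,
// an add from a different expression has domain 2, and one is untagged.
void contraction_domain_example() {
    const FPInst mul{Opcode::FMul, true, 1};
    const FPInst same_expr_add{Opcode::FAdd, true, 1};
    const FPInst other_expr_add{Opcode::FAdd, true, 2};
    const FPInst untagged_add{Opcode::FAdd, false, 0};
    assert(can_contract(mul, same_expr_add));
    assert(!can_contract(mul, other_expr_add));
    assert(!can_contract(mul, untagged_add));
}
```

One property of this model is that the default is conservative: an instruction with no tag opts out of contraction, which is the opposite polarity from the barrier scheme John outlines.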