[LLVMdev] FPOpFusion = Fast and Multiply-and-add combines

Thu Jul 31 08:50:45 PDT 2014

Hi Tim,

Thanks for the thorough explanation. It makes perfect sense.

I was not aware that, outside of fast-math, the compiler is supposed to
avoid using more precision than what is in the standard.

I came across this issue while looking into the output of different
compilers. The XL and Microsoft compilers seem to have that turned on by
default. But I assume that clang follows what gcc does, and has that
turned off.

Thanks again,
Samuel

Tim Northover <t.p.northover at gmail.com> wrote on 07/31/2014 09:54:55 AM:

> From: Tim Northover <t.p.northover at gmail.com>
> To: Samuel F Antao/Watson/IBM at IBMUS
> Cc: "llvmdev at cs.uiuc.edu" <llvmdev at cs.uiuc.edu>, Olivier H
> Sallenave/Watson/IBM at IBMUS
> Date: 07/31/2014 09:55 AM
> Subject: Re: [LLVMdev] FPOpFusion = Fast and Multiply-and-add combines
>
> Hi Samuel,
>
> On 30 July 2014 22:37, Samuel F Antao <sfantao at us.ibm.com> wrote:
> > In the DAGCombiner, during the combination of mul and add/subtract into
> > multiply-and-add/subtract, this option is expected to be Fast in order to
> > enable the combine. This means, that by default no multiply-and-add opcodes
> > are going to be generated. If I understand it correctly, this is undesirable
> > given that multiply-and-add for targets like PPC (I am not sure about all
> > the other targets) does not pose any rounding problem and it can even be
> > more accurate than performing the two operations separately.
>
> That extra precision is actually what we're being very careful to
> avoid unless specifically told we're allowed. It can be just as
> harmful to carefully written floating-point code as dropping precision
> would be.
>
> > Also, in TargetOptions.h I read:
> >
> > Standard, // Only allow fusion of 'blessed' ops (currently just fmuladd)
> >
> > which made me suspect that the check against Fast in the DAGCombiner is
> > not correct.
>
> I think it's OK. In the IR there are 3 different ways to express mul + add:
>
> 1. fmul + fadd. This must not be fused into a single step without
> intermediate rounding (unless we're in Fast mode).
> 2. call @llvm.fmuladd. This *may* be fused or not, depending on
> profitability (unless we're in Strict mode, in which case it's
> separate).
> 3. call @llvm.fma. This must not be split into two operations (unless
> we're in Fast mode).
>
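
The three rules above can be written down as a small decision table. This is
only a sketch of the rules as stated in this message, not LLVM's code: the
enum and function names below are hypothetical and do not correspond to
LLVM's actual declarations.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not LLVM's actual declarations. */
enum fusion_mode { MODE_FAST, MODE_STANDARD, MODE_STRICT };
enum ir_form { FORM_FMUL_FADD, FORM_FMULADD, FORM_FMA };

/* May a single fused multiply-add be emitted for this IR form? */
bool may_emit_fused(enum ir_form form, enum fusion_mode mode) {
    switch (form) {
    case FORM_FMUL_FADD: return mode == MODE_FAST;   /* rule 1 */
    case FORM_FMULADD:   return mode != MODE_STRICT; /* rule 2 */
    case FORM_FMA:       return true;                /* already fused */
    }
    return false;
}

/* May a separate multiply and add be emitted for this IR form? */
bool may_emit_separate(enum ir_form form, enum fusion_mode mode) {
    switch (form) {
    case FORM_FMUL_FADD: return true;                /* already separate */
    case FORM_FMULADD:   return true;                /* fusing is optional */
    case FORM_FMA:       return mode == MODE_FAST;   /* rule 3 */
    }
    return false;
}
```

Only the fmuladd form leaves both choices open in Standard mode, so that is
the one case where profitability decides.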
> That middle one is there because C actually allows you to allow &
> disallow contraction within a limited region with "#pragma STDC
> FP_CONTRACT ON". So we need a way to represent the idea that it's not
> usually OK to fuse them (i.e. not Fast mode), but this particular one
> actually is OK.
>
> Cheers.
>
> Tim.
>