[llvm-dev] AArch64 fmul/fadd fusion

Joerg Sonnenberger via llvm-dev llvm-dev at lists.llvm.org
Sun Sep 20 14:36:25 PDT 2015


On Fri, Sep 18, 2015 at 11:18:49PM -0500, Meador Inge via llvm-dev wrote:
> On Fri, Sep 18, 2015 at 10:34 PM, Tim Northover <t.p.northover at gmail.com> wrote:
> 
> > AArch64's fmadd instruction is fused, which means it can produce a
> > different result to the two operations executed separately. The C and
> > C++ standards do not allow such changes.
> 
> Sorry, sloppy language on my part.  I was aware of fmadd, but I was
> really asking about turning sequences like:
> 
>   fmul s0, s0, s2
>   fadd s0, s1, s0
> 
> into a fmadd:
> 
>   fmadd s0, s0, s2, s1

...which is exactly the transform Tim is talking about. FMA is required
to provide a correctly rounded result *without* intermediate rounding.
The mul+add sequence, on the other hand, is required to perform that
rounding. There are algorithms known to break under such optimisations,
which is why the fusion is not enabled by default. Otherwise, the logic
for producing FMA is not target specific.

Joerg

