[PATCH] D23583: [AArch64] Add feature has-fast-fma

Wed Aug 17 08:45:51 PDT 2016

evandro added a comment.

In https://reviews.llvm.org/D23583#517938, @jmolloy wrote:

> Hi,
>
> I also have concerns here. The TargetLowering hook states:
>
>   /// Return true if target always beneficiates from combining into FMA for a
>   /// given value type. This must typically return false on targets where FMA
>   /// takes more cycles to execute than *FADD*.
>   
>
> Whereas you say:
>
>   In spite of what the original author intended, I observed that the extra folds are worth it if FMA is as quick *FMUL* instead.
>   
>
> Which is correct? Or are you using this hook in a way the hook users don't intend? The wording used is vague and I really think we need to have more detail about what property of Exynos-M1 makes this good for Exynos but not for any other microarchitecture.

First off, all I can say in this patch is that this feature is beneficial to Exynos M1. I do not know whether it is beneficial on other targets.

However, if you look at the folds in `DAGCombiner.cpp` that are performed only when `enableAggressiveFMAFusion()` returns true, you will find that `FMUL` is the pivot of the folding, not `FADD`, in spite of what the comment states. For instance, the hook decides whether to fold into FMA twice, back to back.
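
To make the back-to-back case concrete, here is a rough sketch of the kind of fold I mean, written as plain C++ rather than as DAG nodes. It is not the `DAGCombiner.cpp` code. The function names and the `Latency` struct are made up for illustration, the latency figures used with it are hypothetical and are not Exynos M1 numbers, and the model assumes all inputs are ready at the same time.

```cpp
#include <cassert>
#include <cmath>

// Without the aggressive fold: (fadd (fma x, y, (fmul u, v)), z).
// The dependency chain is one FMUL, then one FMA, then one FADD.
double partlyFused(double x, double y, double u, double v, double z) {
  double uv = u * v;               // fmul
  double acc = std::fma(x, y, uv); // fma
  return acc + z;                  // fadd
}

// With the aggressive fold: (fma x, y, (fma u, v, z)).
// The FMUL and the FADD are merged into a second FMA, back to back.
double fullyFused(double x, double y, double u, double v, double z) {
  double inner = std::fma(u, v, z); // fma
  return std::fma(x, y, inner);     // fma
}

// Hypothetical per-instruction latencies in cycles (illustration only).
struct Latency {
  int fadd;
  int fmul;
  int fma;
};

// Length of the serial dependency chain for each form.
int chainPartlyFused(const Latency &l) { return l.fmul + l.fma + l.fadd; }
int chainFullyFused(const Latency &l) { return 2 * l.fma; }

// Sanity check on exactly representable inputs, where rounding cannot
// make the two forms differ: 2*3 + 4*5 + 6 = 32.
void checkExample() {
  assert(partlyFused(2.0, 3.0, 4.0, 5.0, 6.0) == 32.0);
  assert(fullyFused(2.0, 3.0, 4.0, 5.0, 6.0) == 32.0);
}
```

In this simplified model the fused chain is shorter whenever the FMA latency is less than FMUL plus FADD, which holds in particular when FMA is as quick as FMUL. In general the two forms can round differently, since the fused form skips the intermediate rounding of `u * v`.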

As a matter of fact, it seems to me that most "big" targets would benefit from folding into FMA more aggressively.

Repository:
  rL LLVM

https://reviews.llvm.org/D23583