[PATCH] D23583: [AArch64] Add feature has-fast-fma
Evandro Menezes via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 17 08:45:51 PDT 2016
evandro added a comment.
In https://reviews.llvm.org/D23583#517938, @jmolloy wrote:
> Hi,
>
> I also have concerns here. The TargetLowering hook states:
>
> /// Return true if target always beneficiates from combining into FMA for a
> /// given value type. This must typically return false on targets where FMA
> /// takes more cycles to execute than *FADD*.
>
>
> Whereas you say:
>
> In spite of what the original author intended, I observed that the extra folds are worth it if FMA is as quick *FMUL* instead.
>
>
> Which is correct? Or are you using this hook in a way the hook users don't intend? The wording used is vague and I really think we need to have more detail about what property of Exynos-M1 makes this good for Exynos but not for any other microarchitecture.
First off, all I can claim in this patch is that this feature is beneficial on Exynos M1. I do not know whether it is beneficial on other targets.
However, if you look at the foldings in `DAGCombiner.cpp` that are performed only if `enableAggressiveFMAFusion()` returns true, you will find that `FMUL` is the pivot of the folding, not `FADD`, in spite of what the comment states. For instance, they decide whether an expression should be folded into FMA twice, back to back.
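To make the back-to-back case concrete, here is a minimal standalone sketch of one such aggressive fold, `(fadd (fma x, y, (fmul u, v)), z) -> (fma x, y, (fma u, v, z))`. It is not code from the patch or from `DAGCombiner.cpp`: the function names, the `Latency` struct and the chain-length model are all hypothetical, and the model only counts the dependent chain when every input is ready at the same time.

```cpp
#include <cmath>

// Shape before the aggressive fold: (fadd (fma x, y, (fmul u, v)), z).
double unfused(double x, double y, double u, double v, double z) {
  double t = u * v;              // FMUL
  double s = std::fma(x, y, t);  // FMA
  return s + z;                  // FADD
}

// Shape after the fold: (fma x, y, (fma u, v, z)), two FMAs back to back.
double fused(double x, double y, double u, double v, double z) {
  return std::fma(x, y, std::fma(u, v, z));
}

// Hypothetical per-instruction latencies in cycles. These are illustration
// values supplied by the caller, not measurements of any real core.
struct Latency {
  int fmul;
  int fadd;
  int fma;
};

// Length of the dependent chain FMUL -> FMA -> FADD (all inputs ready at once).
int chainUnfused(const Latency &l) { return l.fmul + l.fma + l.fadd; }

// Length of the dependent chain FMA -> FMA (all inputs ready at once).
int chainFused(const Latency &l) { return 2 * l.fma; }
```

Under this toy model the fused form is no longer than the unfused one as long as the FMA latency does not exceed the FMUL latency plus the FADD latency. With made-up latencies of 4 cycles for both FMUL and FMA and 3 for FADD, the chains are 11 and 8 cycles; with a made-up 8-cycle FMA they are 15 and 16. A real scheduler also has to account for operands that arrive late, which this sketch ignores.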
As a matter of fact, it seems to me that most "big" targets would benefit from folding into FMA more aggressively.
Repository:
rL LLVM
https://reviews.llvm.org/D23583