[PATCH] D18751: [MachineCombiner] Support for floating-point FMA on ARM64
Gerolf Hoflehner via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 4 13:26:26 PDT 2016
Hi James,
sure, sorry I missed that. I looked at this too long, I guess :-). It is principally the same 'better ILP' story as for the integer patterns. The prototypical idea is this: imagine two fmuls feeding an fadd. When the two fmuls can execute in parallel, it can be faster to issue fmul, fmul, fadd rather than fmul, fmadd, because the fmadd has to wait for the first fmul's result.
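To make that concrete, here is a minimal sketch of the critical-path arithmetic for r = a*b + c*d. The latencies are made-up illustrative numbers, not real ARM64 figures:

```python
# Critical-path comparison for r = a*b + c*d.
# Latencies below are hypothetical, chosen only to illustrate the trade-off.
FMUL_LAT = 3   # assumed fmul latency in cycles
FADD_LAT = 3   # assumed fadd latency in cycles
FMADD_LAT = 5  # assumed fused multiply-add latency in cycles

# Separate schedule: the two fmuls are independent and can issue in
# parallel, so only one fmul latency sits on the critical path.
separate = FMUL_LAT + FADD_LAT   # fmul || fmul, then fadd

# Fused schedule: the fmadd consumes the first fmul's result, so the
# two instructions are serialized on the critical path.
fused = FMUL_LAT + FMADD_LAT     # fmul, then fmadd

print(separate, fused)  # 6 8 -> the unfused sequence wins here
```

With these (assumed) numbers the unfused sequence finishes in 6 cycles versus 8 for the fused one, even though the fused sequence uses one fewer instruction; the break-even depends on the target's actual latencies and issue width.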
Cheers
Gerolf
> On Apr 4, 2016, at 4:58 AM, James Molloy <james.molloy at arm.com> wrote:
>
> jmolloy added a subscriber: jmolloy.
> jmolloy added a comment.
>
> Hi Gerolf,
>
> At a high level, could you please explain in what situations you expect *not* combining FMUL+FADD->FMA is a benefit? They use the same resource types on every chip I know of, and FMA is shorter in latency in every chip I know of than FMUL+FADD.
>
> Cheers,
>
> James
>
>
> ================
> Comment at: include/llvm/CodeGen/MachineCombinerPattern.h:42
> @@ +41,3 @@
> + MULSUBXI_OP1,
> + // Floating Point
> + FMULADDS_OP1,
> ----------------
> For the future: the pattern list is starting to grow quite large. I wonder if in the future we should consider moving the MachineCombinerPatterns to be table-generated?
>
>
> http://reviews.llvm.org/D18751
>
>
>