[PATCH][AARCH64] Fixed fused multiply add/sub patterns

Fri Dec 20 09:44:34 PST 2013

Pinging reviewers.

I guess everyone is ok with the bug fix for fnmadd/fnmsub.

Do we all also agree with restricting the patterns to a multiply that has a
single use, to avoid performance issues when generating fused instructions?

Let me know and I will merge both changes.

Thanks,

Ana.

From: llvm-commits-bounces at cs.uiuc.edu
[mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Ana Pazos
Sent: Wednesday, December 18, 2013 11:57 AM
To: llvm-commits at cs.uiuc.edu; tnorthover at apple.com
Subject: [PATCH][AARCH64] Fixed fused multiply add/sub patterns

Hello Tim and reviewers,

1) Bug: the fnmadd and fnmsub patterns are swapped in the current code in
AArch64InstrInfo.td.

-   fnmadd is (-Ra) + (-Rn)*Rm, which should be matched as:

                fma (fneg node:$Rn), node:$Rm, (fneg node:$Ra)

    and as:

                (f32 (fsub (f32 (fneg FPR32:$Ra)),
                           (f32 (fmul_su FPR32:$Rn, FPR32:$Rm))))

-   fnmsub is (-Ra) + Rn*Rm, which should be matched as:

                fma node:$Rn, node:$Rm, (fneg node:$Ra)

    and as:

                (f32 (fsub (f32 (fmul_su FPR32:$Rn, FPR32:$Rm)),
                           FPR32:$Ra))
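
To make the intended semantics concrete, here is a small C++ sketch of the two operations as described above. The function names (fnmadd_ref, fnmsub_ref, and the *_unfused variants) are made up for illustration and are not part of the patch; std::fma stands in for the fused hardware operation.

```cpp
#include <cassert>
#include <cmath>

// Reference semantics, assuming the definitions given in this mail.
// fnmadd: (-Ra) + (-Rn)*Rm
inline float fnmadd_ref(float rn, float rm, float ra) {
    return std::fma(-rn, rm, -ra);
}

// fnmsub: (-Ra) + Rn*Rm
inline float fnmsub_ref(float rn, float rm, float ra) {
    return std::fma(rn, rm, -ra);
}

// The unfused shapes that the fsub/fmul patterns describe.
inline float fnmadd_unfused(float rn, float rm, float ra) {
    return (-ra) - (rn * rm);  // fsub (fneg Ra), (fmul Rn, Rm)
}

inline float fnmsub_unfused(float rn, float rm, float ra) {
    return (rn * rm) - ra;     // fsub (fmul Rn, Rm), Ra
}

// With small integer inputs every intermediate value is exact, so the
// fused and unfused forms must agree.
inline void check_small_inputs() {
    assert(fnmadd_ref(2.0f, 3.0f, 4.0f) == fnmadd_unfused(2.0f, 3.0f, 4.0f));
    assert(fnmsub_ref(2.0f, 3.0f, 4.0f) == fnmsub_unfused(2.0f, 3.0f, 4.0f));
}
```

For inputs whose product is not exactly representable, the fused and unfused forms can differ in the last bit because the fused form rounds only once; the sketch only checks the sign conventions.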

2) Performance issue: the current code matches patterns such as
(fadd (fmul ...)) to create fused multiply add/sub instructions, even when the
fmul result has other uses.

On ARM we saw that this caused the multiply operation to be repeated many
times, which hurt performance.

So on ARM the pattern is only matched if the fmul has a single use.

AArch64 targets most probably have one MAC pipe, so this will be a
performance issue there as well.
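
As a rough illustration of the concern, the following C++ sketch counts how often the multiply is performed when one product a*b feeds several additions. It is a deliberately simplified toy model with hypothetical names (OpCounts, select_ops, multiply_work), not LLVM's instruction selection code, and it assumes every use of the product is an addition that could be fused.

```cpp
// Toy model: one product a*b is consumed by `uses` separate additions.
struct OpCounts {
    int multiplies;  // standalone fmul instructions
    int adds;        // standalone fadd instructions
    int fused;       // fused multiply-add instructions
};

// If fusion is allowed regardless of use count, each addition becomes its
// own fused op and redoes the multiply. With the single-use restriction,
// a shared product is computed once and the additions stay separate.
inline OpCounts select_ops(int uses, bool require_single_use) {
    bool fuse = !require_single_use || uses == 1;
    if (fuse) {
        return OpCounts{0, 0, uses};
    }
    return OpCounts{1, uses, 0};
}

// Number of operations that occupy the multiply (MAC) pipe.
inline int multiply_work(const OpCounts &c) {
    return c.multiplies + c.fused;
}
```

In this model a product with three uses costs three trips through the multiply pipe without the restriction and one with it, while a product with a single use is still fused.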

Let me know if you agree with both changes.

Thanks,

Ana.
