[PATCH] D149260: [AArch64] Emit FNMADD instead of FNEG(FMADD)
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 3 01:53:18 PDT 2023
david-arm added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:5437
+ case AArch64::FNEGDr:
+ Found |= Match(AArch64::FMADDDrrr, MachineCombinerPattern::FNMADDD);
+ break;
----------------
craig.topper wrote:
> craig.topper wrote:
> > MattDevereau wrote:
> > > MattDevereau wrote:
> > > > david-arm wrote:
> > > > > Shouldn't we also be doing FMSUB here too, to ensure we are also requiring the nsz flag for the existing fnmsub combine?
> > > > I don't think FNEG(FMSUB) and FNMSUB are equivalent, so we can't assume FNMSUB from a FNEG perspective like we can with FNMAD:
> > > >
> > > > FNMSUB = (a * b) - c //e.g. (12 * 4) - 8 = 32
> > > > FNEG(FMSUB) = -(-(a*b) + c) //e.g. -(12 * 4) + 8 = 40
> > > Furthering on from this FNMSUB and FNEG(FMSUB) are equal (oops, calcuator misuse) but there's already patterns that exist for emitting FNMSUBs. Adding tests similar to the ones added in this patch don't generate FNEG(FMSUB) but instead correctly emit FNMSUB already.
> > They may not be equal for cases where combinations where a*b==0.0 or -0.0 and c==0.0 or -0.0.
> >
> > ```
> > fneg(fadd A, B) != fadd(fneg(A), fneg(B)) if (A==0.0 and B==-0.0) or (A==-0.0 and B=0.0)
> >
> > fneg(fadd 0.0, -0.0) -> fneg(0.0) -> -0.0
> > fadd(fneg(0.0), fneg(-0.0)) -> fadd(-0.0, 0.0) -> 0.0
> > ```
> >
> > This is why I asked for the no signed zeros check.
> I guess its's really
>
> fneg(fadd A,B) != fadd(fneg(A), fneg(B)) if (A==-B)
Hi @craig.topper in this case I'm really surprised that we don't check for nsz for fnmsub combines:
```; Don't combine: Missing nsz
define void @fnmsubs_contract(ptr %a, ptr %b, ptr %c) {
entry:
%0 = load float, ptr %a, align 4
%1 = load float, ptr %b, align 4
%mul = fmul contract float %1, %0
%2 = load float, ptr %c, align 4
%add = fsub contract float %mul, %2
%fneg = fneg contract float %add
store float %fneg, ptr %a, align 4
ret void
}```
I tested this without @MattDevereau's patch and it combines fine to `fnmsub` so it clearly doesn't need `nsz`. This suggests to me either:
1. We don't need the `nsz` flag for the fnmadd combine, or
2. For some reason the fnmsub combine doesn't need the `nsz` flag, or
3. There is also a bug with the fnmsub combine that needs fixing.
Do you have any idea which of the above this is? There is no real harm in @MattDevereau adding the `nsz` check in this patch, but it would be good to know if it's actually necessary or whether there is an existing bug with fnmsub.
================
Comment at: llvm/test/CodeGen/AArch64/aarch64_fnmadd.ll:4
+
+define dso_local void @fnmaddd(ptr %a, ptr %b, ptr %c) {
+; CHECK-LABEL: fnmaddd:
----------------
nit: Can you remove the `dso_local` markers on all these functions please? I don't think we need them.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D149260/new/
https://reviews.llvm.org/D149260
More information about the llvm-commits
mailing list