[PATCH] D149260: [AArch64] Emit FNMADD instead of FNEG(FMADD)

Wed May 3 03:45:38 PDT 2023

david-arm accepted this revision.
david-arm added a comment.
This revision is now accepted and ready to land.

LGTM with the nit addressed! Thanks @MattDevereau.

================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:5437
+  case AArch64::FNEGDr:
+    Found |= Match(AArch64::FMADDDrrr, MachineCombinerPattern::FNMADDD);
+    break;
----------------
david-arm wrote:
> craig.topper wrote:
> > craig.topper wrote:
> > > MattDevereau wrote:
> > > > MattDevereau wrote:
> > > > > david-arm wrote:
> > > > > > Shouldn't we also be doing FMSUB here too, to ensure we are also requiring the nsz flag for the existing fnmsub combine?
> > > > > I don't think FNEG(FMSUB) and FNMSUB are equivalent, so we can't assume FNMSUB from a FNEG perspective like we can with FNMAD:
> > > > > 
> > > > > FNMSUB = (a * b) - c                    //e.g. (12 * 4) - 8 = 32
> > > > > FNEG(FMSUB) = -(-(a*b) + c)        //e.g. -(12 * 4) + 8 = 40
> > > > Furthering on from this FNMSUB and FNEG(FMSUB) are equal (oops, calcuator misuse) but there's already patterns that exist for emitting FNMSUBs. Adding tests similar to the ones added in this patch don't generate FNEG(FMSUB) but instead correctly emit FNMSUB already.
> > > They may not be equal for cases where combinations where a*b==0.0 or -0.0 and c==0.0 or -0.0.
> > > 
> > > ```
> > > fneg(fadd A, B) != fadd(fneg(A), fneg(B)) if (A==0.0 and B==-0.0) or (A==-0.0 and B=0.0)
> > > 
> > > fneg(fadd 0.0, -0.0) -> fneg(0.0) -> -0.0
> > > fadd(fneg(0.0), fneg(-0.0)) -> fadd(-0.0, 0.0) -> 0.0
> > > ```
> > > 
> > > This is why I asked for the no signed zeros check.
> > I guess its's really
> > 
> > fneg(fadd A,B) != fadd(fneg(A), fneg(B)) if (A==-B)
> Hi @craig.topper  in this case I'm really surprised that we don't check for nsz for fnmsub combines:
> 
> ```; Don't combine: Missing nsz
> define void @fnmsubs_contract(ptr %a, ptr %b, ptr %c) {
> entry:
>   %0 = load float, ptr %a, align 4
>   %1 = load float, ptr %b, align 4
>   %mul = fmul contract float %1, %0
>   %2 = load float, ptr %c, align 4
>   %add = fsub contract float %mul, %2
>   %fneg = fneg contract float %add
>   store float %fneg, ptr %a, align 4
>   ret void
> }```
> 
> I tested this without @MattDevereau's patch and it combines fine to `fnmsub` so it clearly doesn't need `nsz`. This suggests to me either:
> 
> 1. We don't need the `nsz` flag for the fnmadd combine, or
> 2. For some reason the fnmsub combine doesn't need the `nsz` flag, or
> 3. There is also a bug with the fnmsub combine that needs fixing.
> 
> Do you have any idea which of the above this is? There is no real harm in @MattDevereau adding the `nsz` check in this patch, but it would be good to know if it's actually necessary or whether there is an existing bug with fnmsub.
Hi @craig.topper @MattDevereau, I think I understand now. FNMSUB does the opposite of what I expected, which was `-((a * b) - c)` - it actually does `(a * b) - c`. I missed the fact that in the `fnmsubs_contract` example I gave above we actually generate `fnmsub` followed by `fneg`. When you add the extra `nsz` flag this contracts to `fmsub`!!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149260/new/

https://reviews.llvm.org/D149260