[PATCH] D67990: [aarch64] fix generation of fp16 fmls

Tue Oct 8 01:35:53 PDT 2019

SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Cheers, lgtm

================
Comment at: llvm/test/CodeGen/AArch64/fp16-fmla.ll:163
+; CHECK: fneg    {{v[0-9]+}}.8h, {{v[0-9]+}}.8h
+; CHECK: fmla    {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h
 entry:
----------------
sebpop wrote:
> SjoerdMeijer wrote:
> > Why are we not generating a fmls?
> > 
> > And a nit, but perhaps actually just using registers v0, v1, and v2 here makes things clearer?
> That is part of the problem that Tim pointed out: when the multiply is the first operand of `fsub`, i.e.,
> ```
> %sub = fsub fast <8 x half> %mul, %a
> ```
> that should not generate a fused multiply-subtract (`fmls`), because `fmls` computes `a - b * c` rather than `b * c - a`.
> With this patch, for `b * c - a` we negate `a` and generate a fused multiply-add (`fmla`), i.e. `-a + b * c`.
> 
> 
Thanks, I just got myself confused here.
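
For reference, the operand-order point above can be sketched with a small scalar model. This is only an illustration, not the patch's code: the helper names `fmla` and `fmls` are hypothetical Python stand-ins for the accumulate forms of the instructions (`acc + x * y` and `acc - x * y`), and the model ignores the single rounding of a real fused operation. The inputs are chosen so that every result is exactly representable.

```python
# Hypothetical scalar models of the two instructions (not real intrinsics).
# Rounding is ignored; a real fused op rounds once, at the end.

def fmla(acc, x, y):
    """Model of fused multiply-add: acc + x * y."""
    return acc + x * y

def fmls(acc, x, y):
    """Model of fused multiply-subtract: acc - x * y."""
    return acc - x * y

a, b, c = 2.0, 3.0, 4.0

# The IR in question: %sub = fsub %mul, %a, i.e. b * c - a.
want = b * c - a

# What the patch emits: fneg on a, then fmla.
via_fneg_fmla = fmla(-a, b, c)

# What a plain fmls would compute instead: a - b * c.
via_fmls = fmls(a, b, c)

print(want, via_fneg_fmla, via_fmls)
```

In this model `fmls(a, b, c)` is the negation of `b * c - a`, so it only matches when the multiply is the second operand of `fsub`.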

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67990/new/

https://reviews.llvm.org/D67990