[PATCH] D67990: [aarch64] fix generation of fp16 fmls
Sjoerd Meijer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 8 01:35:53 PDT 2019
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.
Cheers, lgtm
================
Comment at: llvm/test/CodeGen/AArch64/fp16-fmla.ll:163
+; CHECK: fneg {{v[0-9]+}}.8h, {{v[0-9]+}}.8h
+; CHECK: fmla {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h
entry:
----------------
sebpop wrote:
> SjoerdMeijer wrote:
> > Why are we not generating an `fmls`?
> >
> > And a nit, but perhaps actually just using registers v0, v1, and v2 here makes things clearer?
> That is part of the problem that Tim pointed out: when the multiply is the first operand of `fsub`, i.e.,
> ```
> %sub = fsub fast <8 x half> %mul, %a
> ```
> that should not generate a fused multiply-subtract, because `fmls` computes `a - b * c`, not `b * c - a`.
> With this patch, for `b * c - a` we negate `a` and generate a fused multiply-add, `-a + b * c`.
>
>
Thanks, I just got myself confused here.
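For readers following the operand-order point, here is a minimal scalar sketch of the two cases discussed above. It models one vector lane and ignores fp16 rounding; the helper names `fmla`, `fmls`, and `lower_fsub` are hypothetical and are not LLVM or Arm APIs.

```python
# Scalar model of a single vector lane. These helpers are illustrative
# assumptions only; they mirror the instruction semantics, not LLVM code.

def fmla(acc, n, m):
    """Multiply-add: acc + n * m."""
    return acc + n * m


def fmls(acc, n, m):
    """Multiply-subtract: acc - n * m."""
    return acc - n * m


def lower_fsub(mul_is_first_operand, a, b, c):
    """Pick the sequence for an fsub where one operand is the product b * c."""
    if mul_is_first_operand:
        # b * c - a: negate a, then accumulate the product (fneg + fmla).
        return fmla(-a, b, c)
    # a - b * c: the product is subtracted from the accumulator (fmls).
    return fmls(a, b, c)


a, b, c = 1.0, 2.0, 3.0
print(lower_fsub(True, a, b, c))   # value of b * c - a
print(lower_fsub(False, a, b, c))  # value of a - b * c
```

Using `fmls` for the first case would yield `a - b * c`, which has the opposite sign, so the test expects `fneg` followed by `fmla` instead.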
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D67990/new/
https://reviews.llvm.org/D67990