[llvm] r371321 - [aarch64] Add combine patterns for fp16 fmla
Tim Northover via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 23 06:03:41 PDT 2019
Hi Sebastian,
> define <8 x half> @test_FMLSv8f16_OP1(<8 x half> %a, <8 x half> %b, <8 x half> %c) {
> ; CHECK-LABEL: test_FMLSv8f16_OP1:
> ; CHECK: fmls {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h
> entry:
>   %mul = fmul fast <8 x half> %c, %b
>   %sub = fsub fast <8 x half> %mul, %a
>   ret <8 x half> %sub
> }
This doesn't look right to me. The exact instruction produced is "fmls
v0.8h, v2.8h, v1.8h", which I think calculates "v0 - v2*v1" (with %a in
v0, %b in v1 and %c in v2), but the IR is calculating "v2*v1 - v0", so
the result has the opposite sign. The equivalent <4 x float> code also
doesn't emit an fmls.
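
To make the sign issue concrete, here is a rough per-lane model in Python. It is only a sketch: the helper names `fmls` and `ir_result` are made up for illustration, the lane values are arbitrary, and the model ignores fused rounding and half precision (the inputs are small integers, so every step is exact).

```python
# Illustrative per-lane model only; fmls() and ir_result() are hypothetical
# helpers, not LLVM or Arm APIs. Rounding/fusion is ignored on purpose.

def fmls(acc, n, m):
    """Model of 'fmls Vd, Vn, Vm': each lane becomes acc - n*m."""
    return [d - x * y for d, x, y in zip(acc, n, m)]

def ir_result(a, b, c):
    """What the test's IR computes: fmul c*b, then fsub (c*b) - a."""
    return [z * y - x for x, y, z in zip(a, b, c)]

# Arbitrary example lanes; %a -> v0, %b -> v1, %c -> v2.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 3.0, 4.0, 5.0]
c = [3.0, 4.0, 5.0, 6.0]

emitted = fmls(a, c, b)         # models: fmls v0.8h, v2.8h, v1.8h
expected = ir_result(a, b, c)   # models: the IR in the test

# The two agree only up to sign, lane by lane.
sign_flipped = all(e == -x for e, x in zip(emitted, expected))
print(emitted, expected, sign_flipped)
```

In this model the two results are exact negatives of each other in every lane, which is the discrepancy described above.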
Cheers.
Tim.