[llvm] r371321 - [aarch64] Add combine patterns for fp16 fmla

Mon Sep 23 06:03:41 PDT 2019

Hi Sebastian,

> define <8 x half> @test_FMLSv8f16_OP1(<8 x half> %a, <8 x half> %b, <8 x half> %c) {
> ; CHECK-LABEL: test_FMLSv8f16_OP1:
> ; CHECK: fmls    {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h
> entry:
>   %mul = fmul fast <8 x half> %c, %b
>   %sub = fsub fast <8 x half> %mul, %a
>   ret <8 x half> %sub
> }

This doesn't look right to me. The exact instruction produced is "fmls
v0.8h, v2.8h, v1.8h", which I think calculates "v0 - v2*v1", but the
IR is calculating "v2*v1-v0". The equivalent <4 x float> code also
doesn't emit an fmls.

Cheers.

Tim.