[PATCH] D67990: [aarch64] fix generation of fp16 fmls
Sebastian Pop via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 24 15:12:18 PDT 2019
sebpop created this revision.
sebpop added reviewers: t.p.northover, SjoerdMeijer.
Herald added subscribers: hiraditya, kristof.beyls.
Herald added a project: LLVM.
Tim remarked that the added patterns produce wrong code when the fsub
instruction has a multiplication as its first operand, i.e., for all the FMLSv*_OP1 patterns:
> define <8 x half> @test_FMLSv8f16_OP1(<8 x half> %a, <8 x half> %b, <8 x half> %c) {
> ; CHECK-LABEL: test_FMLSv8f16_OP1:
> ; CHECK: fmls {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h
> entry:
>   %mul = fmul fast <8 x half> %c, %b
>   %sub = fsub fast <8 x half> %mul, %a
>   ret <8 x half> %sub
> }
>
> This doesn't look right to me. The exact instruction produced is "fmls
> v0.8h, v2.8h, v1.8h", which I think calculates "v0 - v2*v1", but the
> IR is calculating "v2*v1-v0". The equivalent <4 x float> code also
> doesn't emit an fmls.
For the FMLSv*_OP1 patterns, this patch now generates an fmla and negates the
second operand of the fsub, so the result is (-a) + c*b, which equals c*b - a.
While inspecting the pattern matching, I found another mistake in the
opcode being selected: matching FMULv4*16 should generate FMLSv4*16
and not FMLSv2*32.
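
To make the operand-order issue concrete, here is a minimal per-lane sketch in
Python. It is only a scalar model: the helper names fmla, fmls and ir_op1 are
hypothetical, it ignores fp16 rounding and fused-multiply semantics, and it is
not taken from the patch itself.

```python
def fmla(acc, x, y):
    """Scalar model of one FMLA lane: acc + x * y."""
    return acc + x * y


def fmls(acc, x, y):
    """Scalar model of one FMLS lane: acc - x * y."""
    return acc - x * y


def ir_op1(a, b, c):
    """Per-lane value computed by the IR in test_FMLSv8f16_OP1."""
    mul = c * b      # %mul = fmul fast %c, %b
    return mul - a   # %sub = fsub fast %mul, %a


# Illustrative lane values (assumed, chosen to be exactly representable).
a, b, c = 1.0, 2.0, 3.0

expected = ir_op1(a, b, c)   # c*b - a
wrong = fmls(a, c, b)        # a - c*b, what "fmls v0, v2, v1" computes
fixed = fmla(-a, c, b)       # (-a) + c*b, the fneg + fmla form

print(expected, wrong, fixed)
```

In this model the fmls form yields the negated result, while negating the
fsub's second operand and using fmla matches the IR.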
Tested on aarch64-linux with make check-all.
https://reviews.llvm.org/D67990
Files:
llvm/include/llvm/CodeGen/MachineCombinerPattern.h
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
llvm/test/CodeGen/AArch64/fp16-fmla.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D67990.221606.patch
Type: text/x-patch
Size: 6279 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190924/44cb3e73/attachment.bin>