[PATCH] D158008: [AArch64] Add patterns for FMADD, FMSUB
OverMighty via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Aug 15 11:17:43 PDT 2023
overmighty created this revision.
overmighty added reviewers: dmgreen, john.brawn, SjoerdMeijer.
Herald added subscribers: arphaman, hiraditya, kristof.beyls.
Herald added a project: All.
overmighty requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.
The scalar FMADD and FMSUB instructions perform at least as well as the
indexed (by-element) FMLA and FMLS instructions.
For example, the Arm Cortex-A55 Software Optimization Guide lists "FP
multiply accumulate" FMADD, FMSUB instructions with a throughput of 2
IPC, whereas it lists "ASIMD FP multiply accumulate, by element" FMLA,
FMLS with a throughput of 1 IPC.
The Arm Cortex-A77 Software Optimization Guide, however, does not
separately list "by element" variants of the "ASIMD FP multiply
accumulate" instructions, which are listed with the same throughput of 2
IPC as "FP multiply accumulate" instructions.
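
As a rough illustration of the kind of source code these patterns apply to,
the sketch below models a scalar multiply-accumulate whose multiplicand is a
single lane of a 4-element vector. The helper names fma_by_lane and
fms_by_lane are hypothetical and are not part of the patch. The sketch uses
plain C arithmetic so that it builds on any target; the real instructions
fuse the multiply and the add with a single rounding, which this portable
version does not guarantee.

```c
#include <stddef.h>

/* Hypothetical helper, not from the patch: acc + x * v[lane].
 * On AArch64 this shape can be selected either as an indexed FMLA
 * or as a lane extract followed by a scalar FMADD. */
float fma_by_lane(float acc, float x, const float v[4], size_t lane) {
    return acc + x * v[lane];
}

/* Hypothetical helper, not from the patch: acc - x * v[lane].
 * This is the subtracting counterpart (indexed FMLS or scalar FMSUB). */
float fms_by_lane(float acc, float x, const float v[4], size_t lane) {
    return acc - x * v[lane];
}
```

With v = {1, 2, 3, 4}, fma_by_lane(1, 2, v, 2) computes 1 + 2 * 3 and
fms_by_lane(10, 2, v, 3) computes 10 - 2 * 4. Whether a given compiler
emits the scalar or the indexed form for such code depends on the target
and the compiler version.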
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D158008
Files:
clang/test/CodeGen/aarch64-neon-scalar-x-indexed-elem-constrained.c
llvm/lib/Target/AArch64/AArch64InstrFormats.td
llvm/test/CodeGen/AArch64/complex-deinterleaving-f16-mul.ll
llvm/test/CodeGen/AArch64/fp16_intrinsic_lane.ll
llvm/test/CodeGen/AArch64/neon-scalar-by-elem-fma.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D158008.550401.patch
Type: text/x-patch
Size: 25827 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20230815/08a47663/attachment-0001.bin>