[PATCH] D158008: [AArch64] Add patterns for FMADD, FMSUB

OverMighty via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Aug 15 11:17:43 PDT 2023


overmighty created this revision.
overmighty added reviewers: dmgreen, john.brawn, SjoerdMeijer.
Herald added subscribers: arphaman, hiraditya, kristof.beyls.
Herald added a project: All.
overmighty requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

The scalar FMADD and FMSUB instructions perform as well as or better
than the indexed (by-element) FMLA and FMLS instructions.

For example, the Arm Cortex-A55 Software Optimization Guide lists the
"FP multiply accumulate" instructions (FMADD, FMSUB) with a throughput
of 2 IPC, whereas it lists the "ASIMD FP multiply accumulate, by
element" instructions (FMLA, FMLS) with a throughput of 1 IPC.

The Arm Cortex-A77 Software Optimization Guide, however, does not
separately list "by element" variants of the "ASIMD FP multiply
accumulate" instructions, which are listed with the same throughput of 2
IPC as "FP multiply accumulate" instructions.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D158008

Files:
  clang/test/CodeGen/aarch64-neon-scalar-x-indexed-elem-constrained.c
  llvm/lib/Target/AArch64/AArch64InstrFormats.td
  llvm/test/CodeGen/AArch64/complex-deinterleaving-f16-mul.ll
  llvm/test/CodeGen/AArch64/fp16_intrinsic_lane.ll
  llvm/test/CodeGen/AArch64/neon-scalar-by-elem-fma.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D158008.550401.patch
Type: text/x-patch
Size: 25827 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20230815/08a47663/attachment-0001.bin>

