[all-commits] [llvm/llvm-project] 9aa394: [AArch64] Prefer to fold dup into fmul/fma as oppo...

David Green via All-commits all-commits at lists.llvm.org
Tue Mar 7 13:24:40 PST 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 9aa39481d9eb718e872993791547053a3c1f16d5
      https://github.com/llvm/llvm-project/commit/9aa39481d9eb718e872993791547053a3c1f16d5
  Author: David Green <david.green at arm.com>
  Date:   2023-03-07 (Tue, 07 Mar 2023)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
    M llvm/test/CodeGen/AArch64/ld1postmul.ll

  Log Message:
  -----------
  [AArch64] Prefer to fold dup into fmul/fma as opposed to ld1r

There is a fold that creates LD1DUPpost from dup(load) when the load can be
post-incremented. If the dup is used by a "by element" operation such as fmul
or fma, it can be slightly better to fold the dup into that operation instead,
which produces slightly faster code.

  ld1r { v1.4s }, [x0], #4
  fmul v0.4s, v1.4s, v0.4s
vs
  ldr s1, [x0], #4
  fmul v0.4s, v0.4s, v1.s[0]
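
The two sequences above compute the same result. A minimal sketch in portable C,
modelling the vector registers with a plain struct rather than real NEON types,
illustrates why the dup can be folded into the multiply. The names vec4f,
mul_via_ld1r and mul_via_lane are hypothetical and exist only for this
illustration; they are not part of the patch.

```c
/* Model of a 4-lane float vector register (e.g. v0.4s).
   This is a plain struct for illustration, not a NEON type. */
typedef struct { float lane[4]; } vec4f;

/* Models:  ld1r { v1.4s }, [x0], #4
            fmul v0.4s, v1.4s, v0.4s
   The scalar is loaded, broadcast to all four lanes, and the pointer is
   post-incremented by one float (4 bytes). */
static vec4f mul_via_ld1r(vec4f v0, const float **x0) {
    vec4f v1;
    for (int i = 0; i < 4; ++i)
        v1.lane[i] = **x0;          /* broadcast (dup) of the loaded value */
    *x0 += 1;                       /* post-increment */
    for (int i = 0; i < 4; ++i)
        v0.lane[i] = v1.lane[i] * v0.lane[i];
    return v0;
}

/* Models:  ldr s1, [x0], #4
            fmul v0.4s, v0.4s, v1.s[0]
   The scalar is loaded into lane 0 only, and the "by element" multiply
   reads that single lane for every output lane. */
static vec4f mul_via_lane(vec4f v0, const float **x0) {
    float s1 = **x0;                /* scalar load, no broadcast needed */
    *x0 += 1;                       /* post-increment */
    for (int i = 0; i < 4; ++i)
        v0.lane[i] = v0.lane[i] * s1;
    return v0;
}
```

Both functions return the same lanes and advance the pointer by the same
amount; the second form simply never materialises the broadcast vector.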

This could also be done for integer operations such as smull/umull, so long as
the load/dup gets correctly combined into the mul operation. Currently this
only operates on floating point types.

Differential Revision: https://reviews.llvm.org/D145184



