[all-commits] [llvm/llvm-project] c83f23: [AArch64] Fix heuristics for folding "lsl" into load/store ops.
Eli Friedman via All-commits
all-commits at lists.llvm.org
Thu Apr 4 11:26:06 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: c83f23d6abb6f8d693c643bc1b43f9b9e06bc537
https://github.com/llvm/llvm-project/commit/c83f23d6abb6f8d693c643bc1b43f9b9e06bc537
Author: Eli Friedman <efriedma at quicinc.com>
Date: 2024-04-04 (Thu, 04 Apr 2024)
Changed paths:
M llvm/lib/Target/AArch64/AArch64.td
M llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
M llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
M llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
M llvm/test/CodeGen/AArch64/GlobalISel/load-addressing-modes.mir
M llvm/test/CodeGen/AArch64/aarch64-fold-lslfast.ll
M llvm/test/CodeGen/AArch64/aarch64-split-and-bitmask-immediate.ll
M llvm/test/CodeGen/AArch64/arm64-addr-mode-folding.ll
M llvm/test/CodeGen/AArch64/arm64-vector-ldst.ll
M llvm/test/CodeGen/AArch64/avoid-free-ext-promotion.ll
M llvm/test/CodeGen/AArch64/cheap-as-a-move.ll
M llvm/test/CodeGen/AArch64/extract-bits.ll
M llvm/test/CodeGen/AArch64/machine-licm-hoist-load.ll
M llvm/test/CodeGen/AArch64/sink-and-fold.ll
Log Message:
-----------
[AArch64] Fix heuristics for folding "lsl" into load/store ops. (#86894)
The existing heuristics assumed that every core behaves like an
Apple A7, where any extend/shift costs an extra micro-op; in reality,
no other core behaves like that.
On some older Cortex designs, shifts by 1 or 4 cost extra, but all other
shifts/extensions are free. On all other cores, as far as I can tell,
all shifts/extensions for integer loads are free (i.e. the same cost as
an unshifted load).
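For concreteness, the fold in question arises from an ordinary array load whose index is scaled by the element size. A minimal C sketch (function names and the exact emitted assembly are illustrative, not taken from the patch):

```c
#include <stdint.h>

/* The index is scaled by 4, i.e. shifted left by 2. On AArch64 this
   typically compiles to a single register-offset load,
   "ldr w0, [x0, x1, lsl #2]", folding the shift into the addressing
   mode instead of emitting a separate lsl+add. */
uint32_t load_scaled32(const uint32_t *base, uint64_t i) {
    return base[i];  /* address = base + (i << 2) */
}

/* 8-byte elements give a shift by 3, "ldr x0, [x0, x1, lsl #3]".
   On most cores this folded form costs the same as an unshifted load. */
uint64_t load_scaled64(const uint64_t *base, uint64_t i) {
    return base[i];  /* address = base + (i << 3) */
}
```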
To reflect this, this patch:
- Enables aggressive folding of shifts into loads by default.
- Removes the old AddrLSLFast feature, since it applies to everything
except A7 (and even if you are explicitly targeting A7, we want to
assume extensions are free because the code will almost always run on a
newer core).
- Adds a new feature AddrLSLSlow14 that applies specifically to the
Cortex cores where shifts by 1 or 4 cost extra.
I didn't add support for AddrLSLSlow14 on the GlobalISel side because it
would require a bunch of refactoring to work correctly. Someone can pick
this up as a followup.
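To illustrate the AddrLSLSlow14 case: a shift amount of 1 corresponds to 2-byte elements, so a halfword load like the one below is the shape affected on those Cortex cores (the example is mine, not from the patch):

```c
#include <stdint.h>

/* The index is scaled by 2 (lsl #1). On most cores this folds into
   "ldrh w0, [x0, x1, lsl #1]" for free; on Cortex cores with the new
   AddrLSLSlow14 feature, the folded form costs extra, so the backend
   may instead compute the shifted address separately. */
uint16_t load_half(const uint16_t *base, uint64_t i) {
    return base[i];  /* address = base + (i << 1) */
}
```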