[all-commits] [llvm/llvm-project] 4a3776: [AArch64][NFC] Precommit test case to show sub-opt...

Sun Oct 9 17:38:07 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 4a377653c2a29ea930936adfae194996dbf22113
      https://github.com/llvm/llvm-project/commit/4a377653c2a29ea930936adfae194996dbf22113
  Author: Mingming Liu <mingmingl at google.com>
  Date:   2022-10-09 (Sun, 09 Oct 2022)

  Changed paths:
    M llvm/test/CodeGen/AArch64/logical_shifted_reg.ll

  Log Message:
  -----------
  [AArch64][NFC] Precommit test case to show sub-optimal codegen for add(lsl(val1,small-shift), lsl(val2,large-shift)).

Ideally, the add operand with the smaller shift should be the RHS, so that the smaller shift is folded into the ADD.
- Also add another test case where 'lsl(val1,small-shift)' has more than one use, to show that the (planned) optimization won't regress this case.

  Commit: 159fb378f779ac79f7d456ea233892ad526b56d8
      https://github.com/llvm/llvm-project/commit/159fb378f779ac79f7d456ea233892ad526b56d8
  Author: Mingming Liu <mingmingl at google.com>
  Date:   2022-10-09 (Sun, 09 Oct 2022)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
    M llvm/test/CodeGen/AArch64/logical_shifted_reg.ll

  Log Message:
  -----------
  [AArch64] Swap 'lsl(val1,small-shmt)' to right hand side for ADD(lsl(val1,small-shmt), lsl(val2,large-shmt))

On many AArch64 processors (Cortex-A78, Neoverse N1/N2/V1, etc.), ADD with an LSL shift (shift-amount <= 4) has lower latency and higher
throughput than ADD with a larger shift (shift-amount > 4). This is at least a no-op for the rest of the processors.

Differential Revision: https://reviews.llvm.org/D135208
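
The operand-ordering rule described in the log message can be illustrated with a small standalone sketch. This is not the code from AArch64ISelLowering.cpp: the `ShiftedOperand` struct, the `uses` field, and the helper names are hypothetical, and the one-use condition is an assumption inferred from the precommit test for the multi-use case. Only the threshold of 4 comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical model of one ADD operand of the form (value << shift).
struct ShiftedOperand {
    std::uint64_t value;
    unsigned shift;  // LSL amount, expected to be in 0..63
    unsigned uses;   // how many users the shifted value has (assumed field)
};

// From the commit message: a folded LSL of at most 4 is the cheap case.
constexpr unsigned kCheapShiftLimit = 4;

// Returns true when swapping would place the cheap shift on the RHS,
// where it can be folded into the ADD. The single-use check is an
// assumption of this sketch, not a statement about the real patch.
bool shouldSwap(const ShiftedOperand& lhs, const ShiftedOperand& rhs) {
    return lhs.shift <= kCheapShiftLimit &&
           rhs.shift > kCheapShiftLimit &&
           lhs.uses == 1;
}

// Reorders the operands so the small shift ends up on the RHS when allowed.
std::pair<ShiftedOperand, ShiftedOperand>
canonicalize(ShiftedOperand lhs, ShiftedOperand rhs) {
    if (shouldSwap(lhs, rhs))
        std::swap(lhs, rhs);
    return {lhs, rhs};
}

// ADD is commutative, so the reordering never changes the computed value.
std::uint64_t evaluate(const ShiftedOperand& lhs, const ShiftedOperand& rhs) {
    assert(lhs.shift < 64 && rhs.shift < 64);
    return (lhs.value << lhs.shift) + (rhs.value << rhs.shift);
}
```

Because addition is commutative, the swap only affects which shift the instruction selector can fold into the ADD, not the result.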

Compare: https://github.com/llvm/llvm-project/compare/fee8f561bdc9...159fb378f779