[all-commits] [llvm/llvm-project] 72105d: [AArch64] Avoid using intermediate integer registe...

Mon Feb 27 12:25:42 PST 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 72105d10d5296ac175eb1339c4f71b67905fde61
      https://github.com/llvm/llvm-project/commit/72105d10d5296ac175eb1339c4f71b67905fde61
  Author: Nilanjana Basu <n_basu at apple.com>
  Date:   2023-02-27 (Mon, 27 Feb 2023)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp
    M llvm/test/CodeGen/AArch64/aarch64-neon-vector-insert-uaddlv.ll
    M llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll
    M llvm/test/CodeGen/AArch64/fptoui-sat-vector.ll
    M llvm/test/CodeGen/AArch64/neon-extracttruncate.ll
    M llvm/test/CodeGen/AArch64/peephole-insvigpr.mir
    M llvm/test/CodeGen/AArch64/sve-fixed-length-masked-gather.ll
    M llvm/test/CodeGen/AArch64/sve-fixed-length-masked-loads.ll
    M llvm/test/CodeGen/AArch64/sve-fixed-length-masked-scatter.ll
    M llvm/test/CodeGen/AArch64/sve-fixed-length-masked-stores.ll

  Log Message:
  -----------
  [AArch64] Avoid using intermediate integer registers for copying between source and destination floating point registers

In post-isel code, there are cases where there were redundant copies from a source FPR to an intermediate GPR in order to copy to a destination FPR. In this patch, we identify these patterns in post-isel peephole optimization and replace them with a direct FPR-to-FPR copy.
One example for this will be the insertion of the scalar result of 'uaddlv' neon intrinsic function into a destination vector. During instruction selection phase, 'uaddlv' result is copied to a GPR, & a vector insert instruction is matched separately to copy the previous result to a destination SIMD&FP register.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D142594