[all-commits] [llvm/llvm-project] 09f4ce: [AArch64] Codegen tests for fold from D153972. NFC

David Green via All-commits all-commits at lists.llvm.org
Fri Jun 30 04:25:21 PDT 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 09f4cedd6189a2ab9464b777ecc8e10610a7ff2c
      https://github.com/llvm/llvm-project/commit/09f4cedd6189a2ab9464b777ecc8e10610a7ff2c
  Author: David Green <david.green at arm.com>
  Date:   2023-06-30 (Fri, 30 Jun 2023)

  Changed paths:
    A llvm/test/CodeGen/AArch64/extbinopload.ll

  Log Message:
  -----------
  [AArch64] Codegen tests for fold from D153972. NFC


  Commit: d36c81e7f6f09a46c802d9b64416c24253140e25
      https://github.com/llvm/llvm-project/commit/d36c81e7f6f09a46c802d9b64416c24253140e25
  Author: David Green <david.green at arm.com>
  Date:   2023-06-30 (Fri, 30 Jun 2023)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
    M llvm/test/CodeGen/AArch64/extbinopload.ll
    M llvm/test/CodeGen/AArch64/insert-extend.ll
    M llvm/test/CodeGen/AArch64/reduce-shuffle.ll

  Log Message:
  -----------
  [AArch64] Fold tree of offset loads combine

This attempts to fold trees of add(ext(load p), shl(ext(load p+4))) into a
single load of twice the size, from which we extract the bottom part and the
top part so that the shl can start to use a shll2 instruction. The two loads in
that example can also be larger trees of instructions, which are identical
except for the leaves, which are all loads offset from the LHS, including
buildvectors of multiple loads. For example:
sub(zext(buildvec(load p+4, load q+4)), zext(buildvec(load r+4, load s+4)))
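
As an illustration only, the simplest add/shl shape can be modelled in scalar
C++ to show why two adjacent narrow loads and one wide load split into bottom
and top halves give the same result. This is a sketch of the semantics, not the
DAG combine itself: the lane count (4), the element widths (i8 extended to
i16), and the names unfolded and folded are all assumptions made for the
example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical vector shapes chosen for the example.
using Narrow = std::array<std::uint8_t, 4>;   // models a 4 x i8 load
using Wide   = std::array<std::uint8_t, 8>;   // models an 8 x i8 load
using Result = std::array<std::uint16_t, 4>;  // models a 4 x i16 result

// Original shape: add(zext(load p), shl(zext(load p+4), shift)),
// using two separate 4-byte loads.
Result unfolded(const std::uint8_t *p, unsigned shift) {
  assert(shift < 16 && "shift must be smaller than the i16 lane width");
  Narrow lo, hi;
  std::memcpy(lo.data(), p, lo.size());      // load p
  std::memcpy(hi.data(), p + 4, hi.size());  // load p+4
  Result r{};
  for (std::size_t i = 0; i < r.size(); ++i) {
    const std::uint16_t a = lo[i];                                    // zext
    const auto b = static_cast<std::uint16_t>(std::uint16_t{hi[i]} << shift);
    r[i] = static_cast<std::uint16_t>(a + b);
  }
  return r;
}

// Folded shape: one load of twice the size, then the bottom half
// (lanes 0..3) and the top half (lanes 4..7) are extended separately.
Result folded(const std::uint8_t *p, unsigned shift) {
  assert(shift < 16 && "shift must be smaller than the i16 lane width");
  Wide w;
  std::memcpy(w.data(), p, w.size());        // single 8-byte load
  Result r{};
  for (std::size_t i = 0; i < r.size(); ++i) {
    const std::uint16_t bottom = w[i];                                // zext
    const auto top = static_cast<std::uint16_t>(std::uint16_t{w[i + 4]} << shift);
    r[i] = static_cast<std::uint16_t>(bottom + top);
  }
  return r;
}
```

In this model the top-half extend-and-shift is the step that a shll2-style
instruction would be expected to cover; the actual instruction selection is
decided by the backend and is not modelled here.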

Whilst it can be common for the larger loads to simply replace LDP instructions
(which gains nothing on its own), the larger loads in buildvectors can help
create more efficient code and avoid the need for ld1 lane inserts, which can
be more expensive than contiguous loads.

This creates a niche but fairly large combine that attempts to be general
where it is beneficial. It helps some SLP vectorized code to avoid the use of
the more expensive ld1 lane inserting loads.

Differential Revision: https://reviews.llvm.org/D153972


Compare: https://github.com/llvm/llvm-project/compare/9078a9942d54...d36c81e7f6f0

