[all-commits] [llvm/llvm-project] 64094e: [DAGCombiner] Pre-commit tests for D159191

Tue Sep 5 03:46:33 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 64094e3e6df223cde2861b89c31bb529bb36f8f7
      https://github.com/llvm/llvm-project/commit/64094e3e6df223cde2861b89c31bb529bb36f8f7
  Author: David Sherwood <david.sherwood at arm.com>
  Date:   2023-09-05 (Tue, 05 Sep 2023)

  Changed paths:
    M llvm/test/CodeGen/AArch64/sve-intrinsics-ldst-ext.ll
    M llvm/test/CodeGen/AArch64/sve-masked-ldst-sext.ll
    M llvm/test/CodeGen/AArch64/sve-masked-ldst-zext.ll
    M llvm/test/CodeGen/AArch64/sve-sext-zext.ll

  Log Message:
  -----------
  [DAGCombiner] Pre-commit tests for D159191

I've added some missing tests for the following cases:

1. Zero- and sign-extends from unpacked vector types to wide,
   illegal types. For example,
   %aext = zext <vscale x 4 x i8> %a to <vscale x 4 x i64>
2. Normal loads combined with case 1
3. Masked loads combined with case 1

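As a rough illustration of the three cases, the IR below is a minimal
sketch and not a copy of the committed tests. The function names are
hypothetical, and the alignment and passthru operands of the masked
load are assumptions chosen only to keep the example small.

```llvm
; Case 1: zero-extend from an unpacked type to a wide, illegal type.
define <vscale x 4 x i64> @zext_4i8_4i64(<vscale x 4 x i8> %a) {
  %aext = zext <vscale x 4 x i8> %a to <vscale x 4 x i64>
  ret <vscale x 4 x i64> %aext
}

; Case 2: a normal load feeding the same extend.
define <vscale x 4 x i64> @load_zext_4i8_4i64(ptr %p) {
  %a = load <vscale x 4 x i8>, ptr %p
  %aext = zext <vscale x 4 x i8> %a to <vscale x 4 x i64>
  ret <vscale x 4 x i64> %aext
}

; Case 3: a masked load feeding the same extend.
declare <vscale x 4 x i8> @llvm.masked.load.nxv4i8.p0(ptr, i32, <vscale x 4 x i1>, <vscale x 4 x i8>)

define <vscale x 4 x i64> @masked_load_zext_4i8_4i64(ptr %p, <vscale x 4 x i1> %mask) {
  ; Alignment (i32 1) and zero passthru are assumptions for this sketch.
  %a = call <vscale x 4 x i8> @llvm.masked.load.nxv4i8.p0(ptr %p, i32 1, <vscale x 4 x i1> %mask, <vscale x 4 x i8> zeroinitializer)
  %aext = zext <vscale x 4 x i8> %a to <vscale x 4 x i64>
  ret <vscale x 4 x i64> %aext
}
```

Replacing zext with sext gives the sign-extending variants.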
Differential Revision: https://reviews.llvm.org/D159192

  Commit: 50598f0ff44f3a4e75706f8c53f3380fe7faa896
      https://github.com/llvm/llvm-project/commit/50598f0ff44f3a4e75706f8c53f3380fe7faa896
  Author: David Sherwood <david.sherwood at arm.com>
  Date:   2023-09-05 (Tue, 05 Sep 2023)

  Changed paths:
    M llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
    M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
    M llvm/test/CodeGen/AArch64/sve-masked-ldst-sext.ll
    M llvm/test/CodeGen/AArch64/sve-masked-ldst-zext.ll

  Log Message:
  -----------
  [DAGCombiner][SVE] Add support for illegal extending masked loads

In some cases where the same mask is used for multiple
extending masked loads it can be more efficient to combine
the zero- or sign-extend into the load even if it's not a
legal or custom operation. This leads to splitting up the
extending load into smaller parts, which also requires
splitting the mask. For SVE at least this improves the
performance of the SPEC benchmark x264 slightly on
neoverse-v1 (~0.3%), and at least one other benchmark
improves by around 30%. The uplift for SVE seems to be due
to removing the dependencies (vector unpacks) introduced
between the loads and the vector operations, which should
increase the level of parallelism.
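
The kind of input this targets might look like the sketch below, where
two masked loads share one mask and both results are extended to an
illegal type. The function name, alignment and passthru values are
hypothetical and are not taken from the patch or its tests.

```llvm
declare <vscale x 4 x i8> @llvm.masked.load.nxv4i8.p0(ptr, i32, <vscale x 4 x i1>, <vscale x 4 x i8>)

define <vscale x 4 x i64> @masked_zext_add(ptr %pa, ptr %pb, <vscale x 4 x i1> %mask) {
  ; Both loads use the same mask, so a split mask can be shared.
  %a = call <vscale x 4 x i8> @llvm.masked.load.nxv4i8.p0(ptr %pa, i32 1, <vscale x 4 x i1> %mask, <vscale x 4 x i8> zeroinitializer)
  %b = call <vscale x 4 x i8> @llvm.masked.load.nxv4i8.p0(ptr %pb, i32 1, <vscale x 4 x i1> %mask, <vscale x 4 x i8> zeroinitializer)
  ; <vscale x 4 x i64> is not a legal SVE type, so these extends
  ; produce values that must be split into <vscale x 2 x i64> parts.
  %aext = zext <vscale x 4 x i8> %a to <vscale x 4 x i64>
  %bext = zext <vscale x 4 x i8> %b to <vscale x 4 x i64>
  %sum = add <vscale x 4 x i64> %aext, %bext
  ret <vscale x 4 x i64> %sum
}
```

Compiling a file like this with llc -mtriple=aarch64 -mattr=+sve before
and after the patch is one way to compare the generated code.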

See tests:

  CodeGen/AArch64/sve-masked-ldst-sext.ll
  CodeGen/AArch64/sve-masked-ldst-zext.ll

https://reviews.llvm.org/D159191

Compare: https://github.com/llvm/llvm-project/compare/fde2b0d6dba7...50598f0ff44f