[all-commits] [llvm/llvm-project] fad69a: [Analysis][SVE] Improve cost model for some extend...

Mon Oct 2 02:51:17 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: fad69a500998d3db937cff82361151a1b82cf865
      https://github.com/llvm/llvm-project/commit/fad69a500998d3db937cff82361151a1b82cf865
  Author: David Sherwood <57997763+david-arm at users.noreply.github.com>
  Date:   2023-10-02 (Mon, 02 Oct 2023)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
    M llvm/test/Analysis/CostModel/AArch64/masked_ldst.ll
    M llvm/test/Analysis/CostModel/AArch64/sve-ext.ll
    M llvm/test/Analysis/CostModel/AArch64/sve-ldst.ll

  Log Message:
  -----------
  [Analysis][SVE] Improve cost model for some extending masked loads (#65957)

When performing a masked load of an unpacked SVE vector type, i.e.
nxv8i8, followed by a zero- or sign-extend to an illegal wide type
such as nxv8i32 we typically end up with a combination of an
extending masked load and pair(s) of uunpklo/hi or sunpklo/hi
instructions. For example, see test @masked_sload_8i8_8i32 in file

  CodeGen/AArch64/sve-masked-ldst-sext.ll

where

  %aval = call <vscale x 8 x i8> @llvm.masked.load.nxv8i8(...
  %aext = sext <vscale x 8 x i8> %aval to <vscale x 8 x i32>

gets lowered to

  ld1sb { z1.h }, ...
  sunpklo z0.s, z1.h
  sunpkhi z1.s, z1.h

Currently the cost for the 'sext' operation in the example above is
1, whereas this patch changes it to 2 to reflect the pair of
instructions required. Similarly, when doing a masked load of a
nxv8i8 and extending to nxv8i64 the cost is changed to 6 to reflect
the 6 unpacks required.