[all-commits] [llvm/llvm-project] fad69a: [Analysis][SVE] Improve cost model for some extend...
David Sherwood via All-commits
all-commits at lists.llvm.org
Mon Oct 2 02:51:17 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: fad69a500998d3db937cff82361151a1b82cf865
https://github.com/llvm/llvm-project/commit/fad69a500998d3db937cff82361151a1b82cf865
Author: David Sherwood <57997763+david-arm at users.noreply.github.com>
Date: 2023-10-02 (Mon, 02 Oct 2023)
Changed paths:
M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
M llvm/test/Analysis/CostModel/AArch64/masked_ldst.ll
M llvm/test/Analysis/CostModel/AArch64/sve-ext.ll
M llvm/test/Analysis/CostModel/AArch64/sve-ldst.ll
Log Message:
-----------
[Analysis][SVE] Improve cost model for some extending masked loads (#65957)
When performing a masked load of an unpacked SVE vector type, e.g.
nxv8i8, followed by a zero- or sign-extend to an illegal wide type
such as nxv8i32, we typically end up with a combination of an
extending masked load and pair(s) of uunpklo/hi or sunpklo/hi
instructions. For example, see test @masked_sload_8i8_8i32 in file
CodeGen/AArch64/sve-masked-ldst-sext.ll, where

  %aval = call <vscale x 8 x i8> @llvm.masked.load.nxv8i8(...
  %aext = sext <vscale x 8 x i8> %aval to <vscale x 8 x i32>

gets lowered to

  ld1sb { z1.h }, ...
  sunpklo z0.s, z1.h
  sunpkhi z1.s, z1.h

Currently the cost for the 'sext' operation in the example above is
1, whereas this patch changes it to 2 to reflect the pair of unpack
instructions required. Similarly, when doing a masked load of an
nxv8i8 and extending to nxv8i64, the cost is changed to 6 to reflect
the 6 unpacks required: 2 to widen the loaded 16-bit elements to
32 bits, then 4 more to widen the two resulting registers to 64 bits.
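
The unpack counts quoted above can be reproduced with a short sketch.
The helper below is hypothetical and written only for illustration; it
is not taken from AArch64TargetTransformInfo.cpp and says nothing about
how the patch itself encodes the costs. It assumes 128-bit SVE register
granules and that the extending masked load already widens each element
to fill one register (128 / minElts bits per element).

```cpp
#include <cassert>

// Hypothetical helper, for illustration only (not part of the patch).
// minElts: minimum element count of the scalable vector (the 8 in nxv8i8).
// dstBits: element width after the sext/zext (e.g. 32 for nxv8i32).
// Assumption: the extending masked load produces one register whose
// elements are 128 / minElts bits wide (ld1sb { z1.h } for nxv8i8).
unsigned countUnpacks(unsigned minElts, unsigned dstBits) {
  assert(minElts != 0 && 128 % minElts == 0 && "unexpected element count");
  unsigned loadedBits = 128 / minElts; // element width produced by the load
  unsigned regs = 1;                   // registers holding the data so far
  unsigned unpacks = 0;
  for (unsigned bits = loadedBits; bits < dstBits; bits *= 2) {
    unpacks += 2 * regs; // one unpklo and one unpkhi per live register
    regs *= 2;           // each register splits into two wider ones
  }
  return unpacks;
}
```

Under these assumptions countUnpacks(8, 32) gives 2 and
countUnpacks(8, 64) gives 6, matching the costs in the log message.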