[all-commits] [llvm/llvm-project] c9a87a: [SLPVectorizer] Use accurate cost for external use...
Jeffrey Byrnes via All-commits
all-commits at lists.llvm.org
Tue Jun 17 08:14:26 PDT 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: c9a87a50aee3c91f36d33c170d5131bcc370c289
https://github.com/llvm/llvm-project/commit/c9a87a50aee3c91f36d33c170d5131bcc370c289
Author: Jeffrey Byrnes <jeffrey.byrnes at amd.com>
Date: 2025-06-17 (Tue, 17 Jun 2025)
Changed paths:
M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
M llvm/test/Transforms/SLPVectorizer/AMDGPU/external-shuffle.ll
M llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
M llvm/test/Transforms/SLPVectorizer/X86/vec_list_bias-inseltpoison.ll
Log Message:
-----------
[SLPVectorizer] Use accurate cost for external users of resize shuffles (#137419)
When implementing the vectorization, we potentially need to add shuffles
for external users. In such cases, we may be shuffling a smaller vector
into a larger vector. When this happens `ResizeToVF` will just build a
poison padded identity vector. Then the to build the final shuffle, we
just use the `SK_InsertSubvector` mask.
This is possibly clearer by looking at the included test in
SLPVectorizer/AMDGPU/external-shuffle.ll
In the exit block we have a bunch of shuffles to glue the vectorized
tree match the `InsertElement` users. `TMP25` holds the result of
resizing the v2i16 vectorized sequence to match the `InsertElement` size
v16i16. Then `TMP26` is the final shuffle which replaces the
`InsertElement` sequence. This is just an insertsubvector.
However, when calculating the cost for these shuffles, we aren't
modelling this correctly. `ResizeToVF` will indicate to
`performExtractsShuffleAction` that we cannot use the original mask due
to the resize shuffle. The consequence is that the cost calculation uses
a different shuffle mask than what is ultimately used.
Going back to the included test, we can consider again `TMP26`. Clearly
we can see the shuffle uses a mask {0, 1, 2, 3, 16, 17, poison ..}.
However, we will currently calculate the cost with a mask {0, 1, 2, 3,
20, 21, ...} we have replaced 16 and 17 with 20 and 21 (Index + Vector
Size). Queries like BasicTTImpl::improveShuffleKindFromMask will not
recognize this as an `SK_InsertSubvector` mask, and targets which have
reduced costs for `SK_InsertSubvector` will not accurately calculate the
cost.
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list