[all-commits] [llvm/llvm-project] c9a87a: [SLPVectorizer] Use accurate cost for external use...

Jeffrey Byrnes via All-commits all-commits at lists.llvm.org
Tue Jun 17 08:14:26 PDT 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: c9a87a50aee3c91f36d33c170d5131bcc370c289
      https://github.com/llvm/llvm-project/commit/c9a87a50aee3c91f36d33c170d5131bcc370c289
  Author: Jeffrey Byrnes <jeffrey.byrnes at amd.com>
  Date:   2025-06-17 (Tue, 17 Jun 2025)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
    M llvm/test/Transforms/SLPVectorizer/AMDGPU/external-shuffle.ll
    M llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
    M llvm/test/Transforms/SLPVectorizer/X86/vec_list_bias-inseltpoison.ll

  Log Message:
  -----------
  [SLPVectorizer] Use accurate cost for external users of resize shuffles (#137419)

When implementing the vectorization, we potentially need to add shuffles
for external users. In such cases, we may be shuffling a smaller vector
into a larger vector. When this happens `ResizeToVF` will just build a
poison padded identity vector. Then the to build the final shuffle, we
just use the `SK_InsertSubvector` mask.

This is possibly clearer by looking at the included test in
SLPVectorizer/AMDGPU/external-shuffle.ll

In the exit block we have a bunch of shuffles to glue the vectorized
tree match the `InsertElement` users. `TMP25` holds the result of
resizing the v2i16 vectorized sequence to match the `InsertElement` size
v16i16. Then `TMP26` is the final shuffle which replaces the
`InsertElement` sequence. This is just an insertsubvector.

However, when calculating the cost for these shuffles, we aren't
modelling this correctly. `ResizeToVF` will indicate to
`performExtractsShuffleAction` that we cannot use the original mask due
to the resize shuffle. The consequence is that the cost calculation uses
a different shuffle mask than what is ultimately used.

Going back to the included test, we can consider again `TMP26`. Clearly
we can see the shuffle uses a mask {0, 1, 2, 3, 16, 17, poison ..}.
However, we will currently calculate the cost with a mask {0, 1, 2, 3,
20, 21, ...} we have replaced 16 and 17 with 20 and 21 (Index + Vector
Size). Queries like BasicTTImpl::improveShuffleKindFromMask will not
recognize this as an `SK_InsertSubvector` mask, and targets which have
reduced costs for `SK_InsertSubvector` will not accurately calculate the
cost.



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list