[all-commits] [llvm/llvm-project] 100cb9: [VectorCombine] Fold shuffle select pattern

Fri May 6 00:13:31 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 100cb9a2ba9e30f3a2c058f349663271afc869e4
      https://github.com/llvm/llvm-project/commit/100cb9a2ba9e30f3a2c058f349663271afc869e4
  Author: David Green <david.green at arm.com>
  Date:   2022-05-06 (Fri, 06 May 2022)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/VectorCombine.cpp
    M llvm/test/Transforms/VectorCombine/AArch64/select-shuffle.ll

  Log Message:
  -----------
  [VectorCombine] Fold shuffle select pattern

This patch adds a combine to attempt to reduce the costs of certain
select-shuffle patterns. The form of code it attempts to detect is:
  %x = shuffle ...
  %y = shuffle ...
  %a = binop %x, %y
  %b = binop %x, %y
  shuffle %a, %b, selectmask

A classic select-mask will pick items from each lane of a or b. These
do not always have a great lowering on many architectures. This patch
attempts to pack a and b into the lower elements, creating a differently
ordered shuffle for reconstructing the orignal which may be better than
the select mask. This can be better for performance, especially if less
elements of a and b need to be computed and the input shuffles are
cheaper.

Because select-masks are just one form of shuffle, we generalize to any
mask. So long as the backend has decent costmodel for the shuffles, this
can generally improve things when they come up. For more basic cost
models the folds do not appear to be profitable, not getting past the
cost checks.

Differential Revision: https://reviews.llvm.org/D123911