[all-commits] [llvm/llvm-project] 740625: [X86] Make lowerShuffleAsLanePermuteAndPermute use...

Simon Pilgrim via All-commits all-commits at lists.llvm.org
Fri Sep 4 03:41:51 PDT 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 740625fecd1a4cd8e5521bd1c98627eca6f7565d
      https://github.com/llvm/llvm-project/commit/740625fecd1a4cd8e5521bd1c98627eca6f7565d
  Author: Simon Pilgrim <llvm-dev at redking.me.uk>
  Date:   2020-09-04 (Fri, 04 Sep 2020)

  Changed paths:
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    M llvm/test/CodeGen/X86/oddshuffles.ll
    M llvm/test/CodeGen/X86/vector-shuffle-256-v16.ll
    M llvm/test/CodeGen/X86/vector-shuffle-256-v32.ll
    M llvm/test/CodeGen/X86/vector-shuffle-512-v32.ll
    M llvm/test/CodeGen/X86/vector-shuffle-combining-avx2.ll
    M llvm/test/CodeGen/X86/vector-shuffle-combining.ll

  Log Message:
  -----------
  [X86] Make lowerShuffleAsLanePermuteAndPermute use sublanes on AVX2

Extends lowerShuffleAsLanePermuteAndPermute to also search for opportunities to use vpermq (cross-lane shuffle of 64-bit elements) and vpermd (cross-lane shuffle of 32-bit elements) to move elements into the correct lane, in addition to the 128-bit full-lane permutes it previously searched for.

This is especially helpful for cross-lane byte shuffles, where the alternative tends to be "vpshufb both lanes separately and blend them with a vpblendvb". That sequence is very expensive, especially on Haswell, where vpblendvb uses the same execution port as all the shuffles.
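
The sublane idea described in this log message can be illustrated with a small model. The sketch below is not the code in X86ISelLowering.cpp. It is a simplified Python model, assuming a single-source 32-byte shuffle, 128-bit lanes, and 64-bit sublanes (the vpermq case). The function names split_cross_lane_byte_shuffle and apply_shuffle are hypothetical and chosen only for this illustration. It checks whether each 128-bit output lane reads from at most two source 64-bit sublanes. If so, it emits a sublane permute followed by an in-lane byte shuffle, which is the shape of a vpermq + vpshufb sequence.

```python
def split_cross_lane_byte_shuffle(mask, lane_bytes=16, sublane_bytes=8):
    """Split a byte shuffle into (sublane permute, in-lane byte shuffle).

    mask[i] is the source byte index for output byte i, or -1 for undef.
    Returns (sublane_perm, in_lane_mask), or None when some output lane
    needs more source sublanes than fit in one lane.
    Simplified model only; hypothetical helper, not LLVM's implementation.
    """
    n = len(mask)
    assert n % lane_bytes == 0 and lane_bytes % sublane_bytes == 0
    slots_per_lane = lane_bytes // sublane_bytes

    sublane_perm = []          # source sublane chosen for each output sublane
    in_lane_mask = [-1] * n    # byte shuffle applied after the sublane permute

    for lane_start in range(0, n, lane_bytes):
        needed = []  # source sublanes this output lane reads, in first-use order
        for i in range(lane_start, lane_start + lane_bytes):
            if mask[i] < 0:
                continue  # undef output byte, no constraint
            src_sublane = mask[i] // sublane_bytes
            if src_sublane not in needed:
                if len(needed) == slots_per_lane:
                    return None  # lane would need too many sublanes
                needed.append(src_sublane)
            slot = needed.index(src_sublane)
            # After the permute, the byte lives inside this same lane.
            in_lane_mask[i] = (lane_start + slot * sublane_bytes
                               + mask[i] % sublane_bytes)
        # Unused slots are left undef (-1).
        sublane_perm.extend(needed + [-1] * (slots_per_lane - len(needed)))

    return sublane_perm, in_lane_mask


def apply_shuffle(src, mask, width=1):
    """Apply a shuffle mask to src, treating each element as `width` bytes."""
    out = []
    for m in mask:
        out.extend([None] * width if m < 0 else src[m * width:(m + 1) * width])
    return out


# Example: each output lane interleaves bytes from two source sublanes that
# sit in different 128-bit lanes, so a full-lane permute cannot help.
example_mask = [(i // 16 + 2 * (i % 2)) * 8 + (i % 16) // 2 for i in range(32)]
example_split = split_cross_lane_byte_shuffle(example_mask)
```

For example_mask the model finds the sublane order [0, 2, 1, 3], and applying that permute followed by the in-lane mask reproduces the original shuffle. A mask whose first lane reads from three different 64-bit sublanes is rejected with None, in which case a different lowering would be needed.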

Addresses PR47262

Patch By: @TellowKrinkle (TellowKrinkle)

Differential Revision: https://reviews.llvm.org/D86429



