[all-commits] [llvm/llvm-project] 363d27: [x86] fold vperm2x128 to concat of 128-bit high ha...

Wed Jan 22 12:38:54 PST 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 363d27c871f44c45bb70a8adfb0ad93a0bf2e04d
      https://github.com/llvm/llvm-project/commit/363d27c871f44c45bb70a8adfb0ad93a0bf2e04d
  Author: Sanjay Patel <spatel at rotateright.com>
  Date:   2020-01-22 (Wed, 22 Jan 2020)

  Changed paths:
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    M llvm/test/CodeGen/X86/x86-interleaved-access.ll

  Log Message:
  -----------
  [x86] fold vperm2x128 to concat of 128-bit high half vectors

vperm (ins ?, X, C), (ins ?, Y, C), 0x31 --> concat X, Y
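
To see why the pattern above reduces to a concat, here is a minimal sketch
in standalone C++. It is a toy model, not code from X86ISelLowering.cpp:
the types V128/V256 and the helpers insertHigh, selectHalf, vperm2x128 and
foldHolds are hypothetical names made up for illustration. It assumes C is
the element index of the high 128-bit half, and models the immediate the
way vperm2f128 decodes it (bits 1:0 pick the low result half, bits 5:4 pick
the high result half, bits 3 and 7 zero the respective half).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model (hypothetical, not LLVM code): a 128-bit vector is four
// 32-bit elements; a 256-bit vector is a low and a high 128-bit half.
using V128 = std::array<std::uint32_t, 4>;
struct V256 { V128 lo; V128 hi; };

// Models "ins ?, X, C" where C addresses the high half: whatever the
// base held in its high half is overwritten by the subvector.
V256 insertHigh(V256 base, const V128 &sub) { base.hi = sub; return base; }

// Models concat X, Y.
V256 concat(const V128 &lo, const V128 &hi) { return V256{lo, hi}; }

// Decodes one 4-bit control field of the vperm2x128 immediate.
V128 selectHalf(const V256 &a, const V256 &b, unsigned ctl) {
  if (ctl & 0x8) return V128{};          // zeroing bit set
  switch (ctl & 0x3) {
    case 0:  return a.lo;
    case 1:  return a.hi;
    case 2:  return b.lo;
    default: return b.hi;
  }
}

// Models the 256-bit permute: low nibble controls the low result half,
// high nibble controls the high result half.
V256 vperm2x128(const V256 &a, const V256 &b, std::uint8_t imm) {
  return V256{selectHalf(a, b, imm & 0xFu), selectHalf(a, b, imm >> 4)};
}

// True if the permute of the two inserts with 0x31 equals concat X, Y.
bool foldHolds(const V128 &x, const V128 &y,
               const V256 &baseA, const V256 &baseB) {
  V256 lhs = vperm2x128(insertHigh(baseA, x), insertHigh(baseB, y), 0x31);
  V256 rhs = concat(x, y);
  return lhs.lo == rhs.lo && lhs.hi == rhs.hi;
}

// Small self-check with arbitrary sample values.
void checkExample() {
  V128 x{1, 2, 3, 4}, y{5, 6, 7, 8};
  V256 baseA{V128{9, 9, 9, 9}, V128{9, 9, 9, 9}};
  V256 baseB{V128{7, 7, 7, 7}, V128{7, 7, 7, 7}};
  assert(foldHolds(x, y, baseA, baseB));
}
```

With imm 0x31 the low nibble (1) selects the high half of the first
operand and the high nibble (3) selects the high half of the second, so
the "?" base vectors never reach the result in this model.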

This is another shuffle problem seen with PR42024:
https://bugs.llvm.org/show_bug.cgi?id=42024

We have this small crack in legalization/lowering/combining/demanded
that allows forming a vperm2f128 of high halves with AVX1 when we
could do better by peeking through the insert_subvector nodes.
AFAICT, it requires IR as shown in the diffs - much larger than legal
vectors - to avoid all of the usual folds.

Another option would be to prevent forming the 256-bit vperm in lowering.

Differential Revision: https://reviews.llvm.org/D73197