[PATCH] D50074: [X86][AVX2] Prefer VPBLENDW+VPBLENDW+VPBLENDD to VPBLENDVB for v16i16 blend shuffles

Tue Jul 31 08:22:05 PDT 2018

RKSimon created this revision.
RKSimon added reviewers: craig.topper, zvi, delena, lebedev.ri, pcordes.

Noticed while looking at the codegen for https://reviews.llvm.org/D49562: for v16i16 blend shuffles we can avoid a large constant mask load and a slow VPBLENDVB select op by using VPBLENDW+VPBLENDW+VPBLENDD instead. The two VPBLENDW ops do not depend on each other, so they can run in parallel (when both are actually needed).
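
For readers less familiar with the instructions, here is a rough scalar model of the idea, not the lowering code from the patch. It assumes the usual AVX2 semantics: 256-bit VPBLENDW applies the same 8-bit immediate to each 128-bit lane, and VPBLENDD selects per 32-bit element. The names V16, vpblendw, vpblendd, selectWords and blendViaLanes are made up for this sketch.

```cpp
#include <array>
#include <cstdint>

using V16 = std::array<uint16_t, 16>; // stand-in for a v16i16 value

// Model of 256-bit VPBLENDW: the same imm8 is reused for both 128-bit lanes;
// bit (i % 8) set means word i is taken from b, otherwise from a.
V16 vpblendw(const V16 &a, const V16 &b, uint8_t imm) {
  V16 r{};
  for (int i = 0; i < 16; ++i)
    r[i] = ((imm >> (i % 8)) & 1) ? b[i] : a[i];
  return r;
}

// Model of 256-bit VPBLENDD: bit j of imm8 selects 32-bit element j
// (words 2j and 2j+1) from b, otherwise from a.
V16 vpblendd(const V16 &a, const V16 &b, uint8_t imm) {
  V16 r{};
  for (int i = 0; i < 16; ++i)
    r[i] = ((imm >> (i / 2)) & 1) ? b[i] : a[i];
  return r;
}

// Reference: arbitrary per-word select, i.e. what VPBLENDVB with a
// constant mask vector would compute for a v16i16 blend.
V16 selectWords(const V16 &a, const V16 &b, uint16_t mask) {
  V16 r{};
  for (int i = 0; i < 16; ++i)
    r[i] = ((mask >> i) & 1) ? b[i] : a[i];
  return r;
}

// Decomposition: one VPBLENDW that is correct for the low lane, one that is
// correct for the high lane, then VPBLENDD 0xF0 to take the low lane from
// the first result and the high lane from the second.
V16 blendViaLanes(const V16 &a, const V16 &b, uint16_t mask) {
  uint8_t loImm = static_cast<uint8_t>(mask & 0xFF);
  uint8_t hiImm = static_cast<uint8_t>((mask >> 8) & 0xFF);
  V16 lo = vpblendw(a, b, loImm); // independent of hi
  V16 hi = vpblendw(a, b, hiImm); // independent of lo
  return vpblendd(lo, hi, 0xF0);
}
```

In this model, when one lane's immediate is 0x00 or 0xFF the corresponding VPBLENDW is just a copy of one input, which is presumably the case where only one VPBLENDW "occurs".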

TODO: We should investigate adding VPBLENDVB handling to target shuffle combining as well.

Should we be preferring VPBLENDVB/VSELECT for AVX512 targets?

Repository:
  rL LLVM

https://reviews.llvm.org/D50074

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/insertelement-ones.ll
  test/CodeGen/X86/oddshuffles.ll
  test/CodeGen/X86/prefer-avx256-mask-shuffle.ll
  test/CodeGen/X86/vector-shuffle-256-v16.ll
  test/CodeGen/X86/vector-shuffle-256-v32.ll
  test/CodeGen/X86/vector-shuffle-512-v32.ll
  test/CodeGen/X86/vector-shuffle-v48.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D50074.158275.patch
Type: text/x-patch
Size: 72777 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180731/f46c6bda/attachment-0001.bin>