[PATCH] D50074: [X86][AVX2] Prefer VPBLENDW+VPBLENDW+VPBLENDD to VPBLENDVB for v16i16 blend shuffles

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 6 04:33:40 PDT 2018


RKSimon added a comment.

Cheers Peter, I'm going to look at adding combining shuffles to VPBLENDVB/VPBLENDMB in the target shuffle combiner. We already have a 'variable mask' threshold mechanism that allows recent Intel CPUs to merge >2 shuffles to a single variable mask shuffle so the 2*VPBLENDW+VPLENDD regression case can be avoided on those targets (see the 'SLOW' vs 'FAST' codegen checks above).

I can look at combine shuffles to VPTERNLOG in the future if/when its requested.


Repository:
  rL LLVM

https://reviews.llvm.org/D50074





More information about the llvm-commits mailing list