[PATCH] D50074: [X86][AVX2] Prefer VPBLENDW+VPBLENDW+VPBLENDD to VPBLENDVB for v16i16 blend shuffles
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 6 04:33:40 PDT 2018
RKSimon added a comment.
Cheers Peter. I'm going to look at combining shuffles to VPBLENDVB/VPBLENDMB in the target shuffle combiner. We already have a 'variable mask' threshold mechanism that allows recent Intel CPUs to merge >2 shuffles into a single variable mask shuffle, so the 2*VPBLENDW+VPBLENDD regression case can be avoided on those targets (see the 'SLOW' vs 'FAST' codegen checks above).
I can look at combining shuffles to VPTERNLOG in the future if/when it's requested.
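For anyone following along, here is a rough scalar model of why a v16i16 blend can be split into 2*VPBLENDW+VPBLENDD. It is only a sketch, not the code in the patch: the helper names (blendw, blendd, blend_ref, blend_3op, all_masks_match) are made up for illustration, and it assumes the decomposition is "one VPBLENDW per 128-bit lane mask, then a VPBLENDD to pick the low lane from the first result and the high lane from the second".

```cpp
#include <array>
#include <cstdint>

using V16 = std::array<uint16_t, 16>;

// Scalar model of 256-bit VPBLENDW: the 8-bit immediate is reused for both
// 128-bit lanes, so element i comes from b when bit (i % 8) is set.
V16 blendw(const V16 &a, const V16 &b, uint8_t imm) {
  V16 r{};
  for (int i = 0; i < 16; ++i)
    r[i] = ((imm >> (i % 8)) & 1) ? b[i] : a[i];
  return r;
}

// Scalar model of 256-bit VPBLENDD: bit j of the immediate selects dword j,
// which covers 16-bit elements 2j and 2j+1.
V16 blendd(const V16 &a, const V16 &b, uint8_t imm) {
  V16 r{};
  for (int i = 0; i < 16; ++i)
    r[i] = ((imm >> (i / 2)) & 1) ? b[i] : a[i];
  return r;
}

// Reference: an arbitrary per-element v16i16 blend (bit i set -> take b[i]).
V16 blend_ref(const V16 &a, const V16 &b, uint16_t mask) {
  V16 r{};
  for (int i = 0; i < 16; ++i)
    r[i] = ((mask >> i) & 1) ? b[i] : a[i];
  return r;
}

// Hypothetical three-op decomposition: blend with the low-lane mask, blend
// with the high-lane mask, then keep the low lane of the first result and
// the high lane of the second (dwords 4..7 -> immediate 0xF0).
V16 blend_3op(const V16 &a, const V16 &b, uint16_t mask) {
  V16 lo = blendw(a, b, static_cast<uint8_t>(mask & 0xFF));
  V16 hi = blendw(a, b, static_cast<uint8_t>(mask >> 8));
  return blendd(lo, hi, 0xF0);
}

// Exhaustively compare the decomposition against the reference.
bool all_masks_match(const V16 &a, const V16 &b) {
  for (uint32_t m = 0; m <= 0xFFFF; ++m)
    if (blend_3op(a, b, static_cast<uint16_t>(m)) !=
        blend_ref(a, b, static_cast<uint16_t>(m)))
      return false;
  return true;
}
```

When the two lane masks are identical, a single VPBLENDW is already enough; the three-instruction form only matters when they differ, which is the case a single variable mask blend would otherwise cover.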
Repository:
rL LLVM
https://reviews.llvm.org/D50074