[PATCH] D86093: [X86][AVX] Lower v16i8/v8i16 shuffles using VTRUNC/TRUNCATE
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 17 12:36:04 PDT 2020
RKSimon added a comment.
In D86093#2222047 <https://reviews.llvm.org/D86093#2222047>, @craig.topper wrote:
> It looks like we may already be doing it in some cases, but is a VTRUNC for xmm->xmm really better than VPSHUFB? VTRUNC is 2 port 5 uops. VPSHUFB is 1 port 5 uop.
That sounds reasonable to me, although there's the inevitable question of the cost of loading the VPSHUFB shuffle mask. I'll limit this to just binary shuffles, which are the cause of the regressions in D66004 <https://reviews.llvm.org/D66004>.
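For illustration only (this is not code from the patch), here is a rough standalone sketch of what a "truncation-like" shuffle mask looks like in the unary and binary cases. The helper name isTruncationLikeMask, the use of std::vector<int> in place of LLVM's mask types, and the rule that the remaining lanes must be undef are all assumptions made for this example. The real lowering may also accept zeroed upper lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT an LLVM API. A negative mask entry means "undef".
//
// Returns true if Mask picks every Scale'th element of its source(s):
// lane I must read element I * Scale. For a binary shuffle the source is
// treated as concat(V1, V2), so indices run over 2 * Mask.size() elements.
// Lanes beyond the truncated width must be undef (a simplifying assumption).
bool isTruncationLikeMask(const std::vector<int> &Mask, unsigned Scale,
                          bool IsBinary) {
  assert(Scale >= 2 && "Scale is the wide/narrow element size ratio");
  const std::size_t NumElts = Mask.size();
  const std::size_t NumSrcElts = IsBinary ? 2 * NumElts : NumElts;
  if (NumElts == 0 || NumSrcElts % Scale != 0)
    return false;

  // Number of result lanes actually produced by the truncation.
  const std::size_t NumLow = NumSrcElts / Scale;
  if (NumLow > NumElts)
    return false;

  for (std::size_t I = 0; I != NumElts; ++I) {
    if (Mask[I] < 0)
      continue; // undef matches anything
    if (I >= NumLow)
      return false; // upper lanes must be undef in this sketch
    if (static_cast<std::size_t>(Mask[I]) != I * Scale)
      return false;
  }
  return true;
}
```

Under these assumptions, a binary v16i8 mask <0,2,4,...,30> matches with Scale = 2 and IsBinary = true, while the same mask treated as unary does not, because its upper eight lanes are defined.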
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:11399
+ unsigned EltSizeInBits = VT.getScalarSizeInBits();
+ if (Mask.size() != NumElts)
+ return SDValue();
----------------
craig.topper wrote:
> When does this condition happen? Doesn't Mask always follow VT in shuffle lowering?
This is just copy+paste from lowerShuffleWithVPMOV; neither function actually needs the check.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D86093/new/
https://reviews.llvm.org/D86093