[PATCH] D96609: [X86][AVX] Truncate vectors with PACKSS/PACKUS on AVX2 targets
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 23 18:05:21 PST 2021
craig.topper added inline comments.
================
Comment at: llvm/test/CodeGen/X86/vector-reduce-and-bool.ll:559
+; AVX2-NEXT: vpblendw {{.*#+}} ymm1 = ymm1[0],ymm2[1,2,3],ymm1[4],ymm2[5,6,7],ymm1[8],ymm2[9,10,11],ymm1[12],ymm2[13,14,15]
+; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm2[1,2,3],ymm0[4],ymm2[5,6,7],ymm0[8],ymm2[9,10,11],ymm0[12],ymm2[13,14,15]
+; AVX2-NEXT: vpackusdw %ymm1, %ymm0, %ymm0
----------------
pengfei wrote:
> RKSimon wrote:
> > xbolva00 wrote:
> > > Worse?
> > We remove lane crossing shuffles, a pshufb (so no constant pool mask load) and a domain crossing shufps. Some AVX2 targets won't care but others will (e.g. znver1 will love losing the lane shuffles).
> So does that mean some targets get worse and some get better?
Aren't most lane-crossing shuffles on Intel 3 cycles?
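
For anyone reading the new CHECK lines without the ISA manual open, here is a rough scalar model in C of what the vpblendw + vpackusdw pair computes. It is only a sketch of the instruction semantics: the function names (sat_u16, blend_low_word, packusdw256, demo) are made up for illustration, ymm2 is assumed to be all zeros, and it says nothing about latency or throughput on any target.

```c
#include <assert.h>
#include <stdint.h>

/* PACKUSDW element rule: signed 32-bit -> unsigned 16-bit with saturation. */
static uint16_t sat_u16(int32_t v) {
    if (v < 0) return 0;
    if (v > 0xFFFF) return 0xFFFF;
    return (uint16_t)v;
}

/* Model of the vpblendw against an assumed all-zero ymm2: keep word 0 of
   every 64-bit element and zero the other three words. The result is
   written as 8 dwords: even dwords hold the low 16 bits, odd dwords are 0. */
static void blend_low_word(const uint64_t x[4], int32_t d[8]) {
    for (int i = 0; i < 4; ++i) {
        d[2 * i]     = (int32_t)(x[i] & 0xFFFF);
        d[2 * i + 1] = 0;
    }
}

/* Model of 256-bit VPACKUSDW: each 128-bit lane is packed independently,
   so no data moves between the low and high halves of the register. */
static void packusdw256(const int32_t a[8], const int32_t b[8],
                        uint16_t out[16]) {
    for (int lane = 0; lane < 2; ++lane) {
        for (int i = 0; i < 4; ++i) {
            out[lane * 8 + i]     = sat_u16(a[lane * 4 + i]);
            out[lane * 8 + 4 + i] = sat_u16(b[lane * 4 + i]);
        }
    }
}

/* Small self-check showing why the blend has to come first. */
static void demo(void) {
    const uint64_t a[4] = {0xFFFFFFFFFFFF0001ULL, 0x1234567800000002ULL,
                           3, 0x8000000000000004ULL};
    const uint64_t b[4] = {0xFFFF000000008005ULL, 6, 7, 8};
    int32_t da[8], db[8];
    uint16_t out[16];

    blend_low_word(a, da);
    blend_low_word(b, db);
    packusdw256(da, db, out);

    /* Lane 0 holds a[0], a[1], b[0], b[1]; lane 1 holds a[2], a[3], b[2], b[3]. */
    assert(out[0] == 1 && out[2] == 2 && out[4] == 0x8005 && out[6] == 6);
    assert(out[8] == 3 && out[10] == 4 && out[12] == 7 && out[14] == 8);

    /* Without the mask, high bits would saturate instead of truncating. */
    assert(sat_u16(0x12340002) == 0xFFFF);
    assert(sat_u16(-65535) == 0);
}
```

The odd output words are zero at this stage, so a further in-lane pack would be needed to finish the truncation. The point of the model is that the pack stays within each 128-bit lane, which is why the lane-crossing shuffles are no longer needed.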
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D96609/new/
https://reviews.llvm.org/D96609