[PATCH] D56784: [X86][SSE] Use PSLLDQ/PSRLDQ to mask out zeroable ends of a shuffle
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 24 14:31:30 PST 2019
craig.topper added inline comments.
================
Comment at: test/CodeGen/X86/buildvec-extract.ll:408
+; SSE-NEXT: pslldq {{.*#+}} xmm0 = zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,xmm0[0,1,2,3]
+; SSE-NEXT: psrldq {{.*#+}} xmm0 = xmm0[14,15],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
+; SSE-NEXT: retq
----------------
lebedev.ri wrote:
> I'm having trouble reading this pretty-print.
> Shouldn't it be something more like
> ```
> psrldq {{.*#+}} xmm0 = zero,zero,zero,zero,zero,zero,xmm0[14,15],zero,zero,zero,zero,zero,zero,zero,zero
> ```
> ?
> Otherwise to me it reads as-if it wasn't `zext i16 to 16`, but `zext i16 to i64 + shl (64-16)`
> (i.e. zeros are not in MSB, but LSB)
The pretty printer prints the LSB first. So the pslldq puts bytes 0, 1, 2, and 3 of the original vector in the MSBs. Then the psrldq takes bytes 14 and 15 of that result, which are really bytes 2 and 3 of the input, and moves them to bytes 0 and 1 of the output.
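To make the byte movement concrete, here is a small sketch that models the two shifts on a 16-byte list, with index 0 as the LSB to match the pretty printer's order. The helper names and the marker byte values are made up for illustration, and the shift amounts (12 and 14) are inferred from the printed shuffle masks rather than taken from the actual instruction operands.

```python
WIDTH = 16  # bytes in an XMM register


def pslldq(v, n):
    """Model of a byte-wise left shift: bytes move toward higher indices
    (toward the MSB) and zeros are shifted in at index 0 (the LSB)."""
    n = min(n, WIDTH)  # counts above 15 clear the whole register
    return [0] * n + v[:WIDTH - n]


def psrldq(v, n):
    """Model of a byte-wise right shift: bytes move toward lower indices
    (toward the LSB) and zeros are shifted in at the top."""
    n = min(n, WIDTH)
    return v[n:] + [0] * n


# Hypothetical input: byte i holds 0xA0 + i so it is distinguishable from zero.
src = [0xA0 + i for i in range(WIDTH)]

# Assumed shift of 12 bytes: zero x12, then src[0..3] in the top four bytes.
shifted = pslldq(src, 12)

# Assumed shift of 14 bytes: shifted[14..15] (= src[2..3]) land in bytes 0 and 1.
result = psrldq(shifted, 14)

print([hex(b) for b in shifted])
print([hex(b) for b in result])
```

Reading the lists left to right corresponds to reading the pretty-printed mask left to right: the nonzero bytes of `result` sit at the start of the list, i.e. in the LSBs, with zeros above them.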
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D56784/new/
https://reviews.llvm.org/D56784