[PATCH] D115180: [X86] Enable v16i8/v32i8/v64i8 rotation on AVX512 targets

Mon Dec 13 16:58:43 PST 2021

pengfei added inline comments.

================
Comment at: llvm/test/CodeGen/X86/vector-fshr-rot-128.ll:626
 ; AVX512F-NEXT:    vpmovzxbd {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero,xmm0[8],zero,zero,zero,xmm0[9],zero,zero,zero,xmm0[10],zero,zero,zero,xmm0[11],zero,zero,zero,xmm0[12],zero,zero,zero,xmm0[13],zero,zero,zero,xmm0[14],zero,zero,zero,xmm0[15],zero,zero,zero
-; AVX512F-NEXT:    vpsrlvd %zmm3, %zmm0, %zmm3
-; AVX512F-NEXT:    vpxor %xmm4, %xmm4, %xmm4
-; AVX512F-NEXT:    vpsubb %xmm1, %xmm4, %xmm1
-; AVX512F-NEXT:    vpand %xmm2, %xmm1, %xmm1
+; AVX512F-NEXT:    vpslld $8, %zmm0, %zmm2
+; AVX512F-NEXT:    vpord %zmm2, %zmm0, %zmm0
----------------
RKSimon wrote:
> pengfei wrote:
> > Why do we still widen here rather than use an AVX instruction like the other cases?
> I'm sorry I don't quite understand - do you mean you'd expect the (much longer) vpblendvb sequence?
Sorry, my mistake. I thought the zmm register was being used only because avx512vl is unavailable. Now I understand that we are using the full 512-bit register. Sorry for the noise.
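
For anyone else reading the hunk: the 16 bytes are zero-extended to 16 dwords, which is exactly what fills a zmm register, and the vpslld $8 + vpord pair then duplicates each byte into bits 8..15 of its lane. Below is a rough scalar model of one lane. The quoted hunk stops after vpord, so the per-lane right shift and the truncation back to a byte are my assumptions about the rest of the sequence, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Reference: a plain 8-bit rotate right.
constexpr uint8_t rotr8_reference(uint8_t x, unsigned amt) {
  amt &= 7;
  return static_cast<uint8_t>((x >> amt) | (x << ((8 - amt) & 7)));
}

// Scalar model of a single 32-bit lane (hypothetical helper).
constexpr uint8_t rotr8_widened(uint8_t x, unsigned amt) {
  uint32_t lane = x;                  // vpmovzxbd: zero-extend byte to dword
  lane |= lane << 8;                  // vpslld $8 + vpord: copy byte into bits 8..15
  lane >>= (amt & 7);                 // ASSUMED: variable right shift (not in the hunk)
  return static_cast<uint8_t>(lane);  // ASSUMED: truncate the dword back to a byte
}

// Exhaustively compare the widened model against the reference rotate.
inline void self_check() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned amt = 0; amt < 8; ++amt)
      assert(rotr8_widened(static_cast<uint8_t>(x), amt) ==
             rotr8_reference(static_cast<uint8_t>(x), amt));
}
```

With the byte duplicated above itself, a single right shift pulls the wrapped-around bits into the low byte, so no negated shift amount or second shift is needed in this model.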

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115180/new/

https://reviews.llvm.org/D115180