[PATCH] D17691: [X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG

Sun Feb 28 06:32:22 PST 2016

RKSimon added inline comments.

================
Comment at: test/CodeGen/X86/avx512-ext.ll:116
@@ -115,2 +115,3 @@
 ; SKX-NEXT:    vpmovzxbw {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero
+; SKX-NEXT:    vmovdqu16 %ymm0, %ymm0 {%k1} {z}
 ; SKX-NEXT:    retq
----------------
delena wrote:
> Hi Simon,
> 
> Why do we need an additional instruction here?
> vpmovzxbw       %xmm0, %ymm0 {%k1} {z}        does the work
Hi - that is what I was asking you and Igor. I don't know much about how the masking lowering works in AVX512, but for some reason these VZEXT nodes (which in this case have come via VECTOR_SHUFFLE lowering) don't correctly combine with the masks.

Now, I'm tempted to avoid this issue by just not combining cases where ZERO_EXTEND is legal and extends the whole register, but it looks like that would just be hiding a bigger problem. I can try to create a simplified repro if you wish.
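
For reference, here is a minimal sketch of the lane semantics in question, written as a plain Python model rather than LLVM or hardware code. The function names are made up for illustration, and the model assumes 16 byte lanes and a 16-bit mask standing in for %k1. It only shows that an unmasked vpmovzxbw followed by a zero-masked vmovdqu16 should give the same result as a single zero-masked vpmovzxbw, which is why the extra instruction looks redundant.

```python
# Lane-level model only: these helpers are hypothetical and exist purely to
# compare the two instruction sequences discussed above.

def zext_bytes_to_words(src_bytes):
    """Unmasked vpmovzxbw: zero-extend each 8-bit lane to a 16-bit lane."""
    return [b & 0xFF for b in src_bytes]


def apply_zero_mask(lanes, k):
    """Zero-masking ({%k1} {z}): keep lane i if bit i of k is set, else 0."""
    return [v if (k >> i) & 1 else 0 for i, v in enumerate(lanes)]


def two_instruction_sequence(src_bytes, k):
    """vpmovzxbw xmm0 -> ymm0, then vmovdqu16 %ymm0, %ymm0 {%k1} {z}."""
    return apply_zero_mask(zext_bytes_to_words(src_bytes), k)


def folded_sequence(src_bytes, k):
    """vpmovzxbw %xmm0, %ymm0 {%k1} {z}: extend and mask in one step."""
    return [(b & 0xFF) if (k >> i) & 1 else 0
            for i, b in enumerate(src_bytes)]


if __name__ == "__main__":
    src = list(range(240, 256))   # 16 byte lanes with the top bit set
    mask = 0xA5A5                 # arbitrary 16-bit mask value
    assert two_instruction_sequence(src, mask) == folded_sequence(src, mask)
```

This says nothing about why the combine fails to fold the mask, only that the folded form is the expected result.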


Repository:
  rL LLVM

http://reviews.llvm.org/D17691