[PATCH] D41062: [X86] Legalize v2i32 via widening rather than promoting

Sun Jan 7 23:02:49 PST 2018

zvi added a comment.

There are some regressions that we need to either address or decide to accept, but overall your approach seems right to me.

================
Comment at: test/CodeGen/X86/avx2-masked-gather.ll:770
+; X86-NEXT:    vpxor %xmm2, %xmm2, %xmm2
+; X86-NEXT:    vpcmpgtq %xmm0, %xmm2, %xmm0
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
----------------
This patch does not change the mask argument representation, so this compare is redundant, right?

================
Comment at: test/CodeGen/X86/avx2-masked-gather.ll:772
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    vmovd {{.*#+}} xmm2 = mem[0],zero,zero,zero
+; X86-NEXT:    vpinsrd $1, 4(%eax), %xmm2, %xmm2
----------------
Any way to easily fix vmovd+vpinsrd -> vmovq?
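
To spell out why the fold looks safe, here is a rough Python model of the lane semantics involved. It is only a sketch: the helper names and the sample memory contents are made up for illustration, and it assumes little-endian layout with the two i32 values adjacent in memory, as in the test output above. It says nothing about where in the backend such a fold would belong.

```python
import struct

def vmovd(mem, off):
    """Model vmovd from memory: load 32 bits into lane 0, zero lanes 1..3."""
    (lane0,) = struct.unpack_from("<I", mem, off)
    return [lane0, 0, 0, 0]

def vpinsrd(vec, mem, off, idx):
    """Model vpinsrd: copy vec and insert 32 bits from memory at lane idx."""
    (val,) = struct.unpack_from("<I", mem, off)
    out = list(vec)
    out[idx] = val
    return out

def vmovq(mem, off):
    """Model vmovq from memory: load 64 bits into the low half, zero the
    high half. The result is viewed as 4 x i32 lanes for comparison."""
    (low,) = struct.unpack_from("<Q", mem, off)
    return [low & 0xFFFFFFFF, low >> 32, 0, 0]

# Hypothetical memory: two adjacent little-endian i32 values.
mem = struct.pack("<II", 0x11223344, 0xAABBCCDD)

two_step = vpinsrd(vmovd(mem, 0), mem, 4, 1)  # vmovd + vpinsrd $1, 4(...)
one_step = vmovq(mem, 0)                      # single 64-bit load
```

In this model both sequences leave the same four lanes in the register, which is the equivalence the question relies on.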

================
Comment at: test/CodeGen/X86/avx2-masked-gather.ll:774
+; X86-NEXT:    vpinsrd $1, 4(%eax), %xmm2, %xmm2
+; X86-NEXT:    vmovdqa %xmm0, %xmm0
+; X86-NEXT:    vgatherdpd %ymm0, (,%xmm2), %ymm1
----------------
Is this redundant move a known issue?

================
Comment at: test/CodeGen/X86/shrink_vmul.ll:54
 ; X86-AVX-NEXT:    movl c, %esi
-; X86-AVX-NEXT:    vpmovzxbq {{.*#+}} xmm0 = mem[0],zero,zero,zero,zero,zero,zero,zero,mem[1],zero,zero,zero,zero,zero,zero,zero
-; X86-AVX-NEXT:    vpmovzxbq {{.*#+}} xmm1 = mem[0],zero,zero,zero,zero,zero,zero,zero,mem[1],zero,zero,zero,zero,zero,zero,zero
+; X86-AVX-NEXT:    movzbl 1(%edx,%ecx), %edi
+; X86-AVX-NEXT:    movzbl (%edx,%ecx), %edx
----------------
Two more missed vmovq opportunities

================
Comment at: test/CodeGen/X86/shuffle-vs-trunc-128.ll:251
+; AVX512:       # %bb.0:
+; AVX512-NEXT:    vpermilps {{.*#+}} xmm0 = mem[0,2,2,3]
+; AVX512-NEXT:    vmovlps %xmm0, (%rsi)
----------------
What happened here?

================
Comment at: test/CodeGen/X86/shuffle-vs-trunc-128.ll:276
+; AVX2-SLOW-NEXT:    vmovaps (%rdi), %xmm0
+; AVX2-SLOW-NEXT:    vpermilps {{.*#+}} ymm0 = ymm0[0,2,2,3,4,6,6,7]
+; AVX2-SLOW-NEXT:    vpermpd {{.*#+}} ymm0 = ymm0[0,2,2,3]
----------------
What about this?

https://reviews.llvm.org/D41062