[PATCH] D29690: [AVX512] Improve EXTRACT_VECTOR_ELT with variable index.

Elena Demikhovsky via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 8 23:04:04 PST 2017


delena added inline comments.


================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1505
+; SKX:       ## BB#0:
+; SKX-NEXT:    movslq %edi, %rax
+; SKX-NEXT:    vmovq %rax, %xmm1
----------------
movslq + vmovq can be replaced with a single vmovd.


================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1520
+; KNL-NEXT:    vmovq %rax, %xmm1
+; KNL-NEXT:    vpermps %zmm0, %zmm1, %zmm0
+; KNL-NEXT:    ## kill: %XMM0<def> %XMM0<kill> %ZMM0<kill>
----------------
This should be vpermpd, not vpermps.
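
To illustrate why the element width matters, here is a minimal Python sketch. It assumes the test extracts a double from a vector of eight doubles (the element type is not visible in the quoted lines). The helper name is hypothetical, and the model covers only lane 0 of the permute result.

```python
import struct

def permute_lane0(vec_bytes, index, elem_size):
    """Model lane 0 of a variable permute: pick element `index` of the given width."""
    count = len(vec_bytes) // elem_size
    i = index % count  # only the low bits of the index are used
    return vec_bytes[i * elem_size:(i + 1) * elem_size]

# Eight doubles packed little-endian, standing in for a 512-bit register.
doubles = [float(i) + 0.5 for i in range(8)]
zmm0 = struct.pack("<8d", *doubles)

# 64-bit granularity (vpermpd-style): index 3 selects the whole double 3.5.
as_pd = struct.unpack("<d", permute_lane0(zmm0, 3, 8))[0]

# 32-bit granularity (vpermps-style): index 3 selects a 4-byte fragment,
# here the upper half of the double at position 1, not element 3.
as_ps = permute_lane0(zmm0, 3, 4)
```

With 32-bit granularity the same index lands in the middle of a different double, so the extracted value is not the requested element.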


================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1693
+; KNL-NEXT:    ## kill: %EDI<def> %EDI<kill> %RDI<def>
+; KNL-NEXT:    vmovaps %xmm0, -{{[0-9]+}}(%rsp)
+; KNL-NEXT:    andl $7, %edi
----------------
I propose extending v8i16 to v8i32 and using vpermd.
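
A rough Python model of this proposal, for illustration only: widen each 16-bit lane to 32 bits, permute at dword granularity, read lane 0, and truncate. The function name is hypothetical, and zero extension is an assumption (sign extension would work equally well, since the upper bits are truncated away).

```python
def extract_via_widening(lanes16, index):
    """Model a variable-index extract from an i16 vector via a dword permute."""
    # Step 1: zero-extend every i16 lane to i32.
    widened = [x & 0xFFFF for x in lanes16]
    # Step 2: dword permute; only the low log2(n) bits of the index are used
    # (assumes the lane count is a power of two).
    selected = widened[index & (len(widened) - 1)]
    # Step 3: truncate the selected dword back to 16 bits.
    return selected & 0xFFFF

v8 = list(range(100, 108))
result_in_range = extract_via_widening(v8, 5)
result_wrapped = extract_via_widening(v8, 13)  # 13 & 7 == 5
```

The same model applies to wider inputs, since only the lane count changes. This path needs no stack spill or explicit index masking.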


================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1723
+; KNL-NEXT:    subq $64, %rsp
+; KNL-NEXT:    ## kill: %EDI<def> %EDI<kill> %RDI<def>
+; KNL-NEXT:    vmovaps %ymm0, (%rsp)
----------------
Again, you can extend to v16i32 and use vpermd.


================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1780
+; KNL:       ## BB#0:
+; KNL-NEXT:    movzbl %dil, %eax
+; KNL-NEXT:    vmovd %eax, %xmm1
----------------
The result will be wrong. If the MSB of the index byte is set, the destination will be zeroed. You should mask the index with 0xF:
and $15, %edi
vmovd %edi, %xmm
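
The failure mode can be sketched in Python, assuming the index feeds a byte shuffle such as vpshufb (the shuffle itself is not in the quoted lines). In that instruction, a control byte with bit 7 set writes zero to the destination byte; otherwise the low four bits select the source byte. The names below are hypothetical.

```python
def pshufb_lane0(src16, control_byte):
    """Model destination byte 0 of a 16-byte shuffle with the given control byte."""
    if control_byte & 0x80:
        return 0  # bit 7 set: destination byte is zeroed
    return src16[control_byte & 0x0F]

src = list(range(0x10, 0x20))  # 16 distinct source bytes
index = 0x83                   # out-of-range index with bit 7 set

# movzbl only zero-extends the byte, so bit 7 survives and the result is zeroed.
unmasked = pshufb_lane0(src, index & 0xFF)

# Masking with 0xF first clears bit 7 and wraps the index into range.
masked = pshufb_lane0(src, index & 0x0F)
```

With the mask the index wraps to element 3; without it the extract silently returns zero.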



================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1840
+; SKX_ONLY-NEXT:    .cfi_def_cfa_register %rbp
+; SKX_ONLY-NEXT:    andq $-32, %rsp
+; SKX_ONLY-NEXT:    subq $64, %rsp
----------------
You can extend to v32i16 and use VPERMW.


Repository:
  rL LLVM

https://reviews.llvm.org/D29690




