[PATCH] D29690: [AVX512] Improve EXTRACT_VECTOR_ELT with variable index.
Elena Demikhovsky via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 8 23:04:04 PST 2017
delena added inline comments.
================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1505
+; SKX: ## BB#0:
+; SKX-NEXT: movslq %edi, %rax
+; SKX-NEXT: vmovq %rax, %xmm1
----------------
movslq + vmovq can be replaced with vmovd.
================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1520
+; KNL-NEXT: vmovq %rax, %xmm1
+; KNL-NEXT: vpermps %zmm0, %zmm1, %zmm0
+; KNL-NEXT: ## kill: %XMM0<def> %XMM0<kill> %ZMM0<kill>
----------------
It should be vpermpd.
================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1693
+; KNL-NEXT: ## kill: %EDI<def> %EDI<kill> %RDI<def>
+; KNL-NEXT: vmovaps %xmm0, -{{[0-9]+}}(%rsp)
+; KNL-NEXT: andl $7, %edi
----------------
I propose to extend v8i16 to v8i32 and use vpermd.
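
To illustrate the proposal, here is a minimal Python sketch of the idea, not the actual lowering code. It models a variable-index extract from v8i16 by zero-extending to v8i32 and applying a VPERMD-style permute, then reading lane 0. The helper names (vpermd, extract_v8i16) are hypothetical, and the sketch assumes the 8-lane (256-bit) form of the permute, which uses only the low 3 bits of each index.

```python
def vpermd(src, idx):
    """Emulate a VPERMD-style permute: dst[i] = src[idx[i] mod len(src)].

    Assumes len(src) is a power of two (8 lanes for the 256-bit form),
    so only the low bits of each index are used.
    """
    n = len(src)
    return [src[i & (n - 1)] for i in idx]


def extract_v8i16(vec, index):
    """Hypothetical model of extracting vec[index] without going through the stack."""
    # Zero-extend each 16-bit element to a 32-bit lane (v8i16 -> v8i32).
    wide = [x & 0xFFFF for x in vec]
    # Put the index in lane 0 of the index vector; the other lanes are unused.
    idx = [index] + [0] * 7
    shuffled = vpermd(wide, idx)
    # The requested element is now in lane 0; truncate back to 16 bits.
    return shuffled[0] & 0xFFFF


if __name__ == "__main__":
    v = [10, 11, 12, 13, 14, 15, 16, 17]
    print(extract_v8i16(v, 5))
    print(extract_v8i16(v, 13))  # index wraps modulo 8, like "andl $7, %edi"
```

In this model an out-of-range index wraps modulo 8, which matches the effect of the "andl $7, %edi" in the quoted stack-based sequence.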
================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1723
+; KNL-NEXT: subq $64, %rsp
+; KNL-NEXT: ## kill: %EDI<def> %EDI<kill> %RDI<def>
+; KNL-NEXT: vmovaps %ymm0, (%rsp)
----------------
Again, you can extend to v16i32.
================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1780
+; KNL: ## BB#0:
+; KNL-NEXT: movzbl %dil, %eax
+; KNL-NEXT: vmovd %eax, %xmm1
----------------
The result will be wrong. If the index has '1' in its MSB, the destination will be zeroed. You should mask the index with 0xF:
and $15, %edi
vmovd %edi, %xmm
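
A small Python sketch of why the mask matters, assuming the extract is done with a byte shuffle that has PSHUFB-style semantics (the quoted lines do not show the shuffle instruction, so that is an assumption). In such a shuffle, a control byte with bit 7 set produces zero; otherwise its low 4 bits select the source byte. The function names are hypothetical.

```python
def byte_shuffle(src, ctrl):
    """Emulate a 16-byte PSHUFB-style shuffle.

    For each control byte: if bit 7 is set the result byte is 0,
    otherwise the low 4 bits select a byte from src.
    """
    out = []
    for c in ctrl:
        if c & 0x80:
            out.append(0)
        else:
            out.append(src[c & 0x0F])
    return out


def extract_byte(vec, index, mask_index):
    """Hypothetical model of extracting vec[index] from a 16-byte vector."""
    if mask_index:
        ctrl0 = index & 0x0F   # the suggested masking of the index
    else:
        ctrl0 = index & 0xFF   # the index byte used as-is, like "movzbl %dil, %eax"
    # Only lane 0 of the control vector matters for the extracted element.
    ctrl = [ctrl0] + [0] * 15
    return byte_shuffle(vec, ctrl)[0]


if __name__ == "__main__":
    v = list(range(100, 116))
    print(extract_byte(v, 0x83, mask_index=False))  # bit 7 set: result is zeroed
    print(extract_byte(v, 0x83, mask_index=True))   # masked: selects element 3
```

With the mask applied, an out-of-range index wraps modulo 16 instead of producing zero.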
================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1840
+; SKX_ONLY-NEXT: .cfi_def_cfa_register %rbp
+; SKX_ONLY-NEXT: andq $-32, %rsp
+; SKX_ONLY-NEXT: subq $64, %rsp
----------------
You can extend to v32i16 and use VPERMW
Repository:
rL LLVM
https://reviews.llvm.org/D29690