[PATCH] D29690: [AVX512] Improve EXTRACT_VECTOR_ELT with variable index.

Igor Breger via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 9 04:38:11 PST 2017


igorb added inline comments.


================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26982
     if (UnaryShuffle && (Depth >= 3 || HasVariableMask) && !MaskContainsZeros &&
-        ((Subtarget.hasAVX2() &&
-          (MaskVT == MVT::v8f32 || MaskVT == MVT::v8i32)) ||
-         (Subtarget.hasAVX512() &&
-          (MaskVT == MVT::v8f64 || MaskVT == MVT::v8i64 ||
-           MaskVT == MVT::v16f32 || MaskVT == MVT::v16i32)) ||
-         (Subtarget.hasBWI() && MaskVT == MVT::v32i16) ||
-         (Subtarget.hasBWI() && Subtarget.hasVLX() && MaskVT == MVT::v16i16) ||
-         (Subtarget.hasVBMI() && MaskVT == MVT::v64i8) ||
-         (Subtarget.hasVBMI() && Subtarget.hasVLX() && MaskVT == MVT::v32i8))) {
+        isVPERMVsupported(MaskVT, Subtarget)) {
       MVT VPermMaskSVT = MVT::getIntegerVT(MaskEltSizeInBits);
----------------
RKSimon wrote:
> Haven't you've lost the hasVLX() check by using isVPERMVsupported? 
I added patterns to TD file to handle 128/256 bit vector in case NoVLX using 512bit.
I think the check here only to insure VPERMV can be used. 


================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1520
+; KNL-NEXT:    vmovq %rax, %xmm1
+; KNL-NEXT:    vpermps %zmm0, %zmm1, %zmm0
+; KNL-NEXT:    ## kill: %XMM0<def> %XMM0<kill> %ZMM0<kill>
----------------
delena wrote:
> It should be vpermpd. 
Thanks!


================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1693
+; KNL-NEXT:    ## kill: %EDI<def> %EDI<kill> %RDI<def>
+; KNL-NEXT:    vmovaps %xmm0, -{{[0-9]+}}(%rsp)
+; KNL-NEXT:    andl $7, %edi
----------------
delena wrote:
> I propose to extend v8i16 to v8i32 and use vpermd.
Thanks Elena,
I prefer to handle all cases that can be extend in separate path.


================
Comment at: test/CodeGen/X86/avx512-insert-extract.ll:1780
+; KNL:       ## BB#0:
+; KNL-NEXT:    movzbl %dil, %eax
+; KNL-NEXT:    vmovd %eax, %xmm1
----------------
delena wrote:
> The result will be wrong. If you have '1' in MSB, the destination will be zeroed. You should mask the %index with 0xF:
> and $15, %edi
> vmovd %edi, %xmm
> 
Max 5 bits are used , according to LLVM doc "If idx exceeds the length of val, the results are undefined."


Repository:
  rL LLVM

https://reviews.llvm.org/D29690





More information about the llvm-commits mailing list