[PATCH] try to lowerVectorShuffleAsElementInsertion() for all 256-bit vector sub-types [X86, AVX]

Andrea Di Biagio Andrea_DiBiagio at sn.scee.net
Mon Mar 30 09:07:08 PDT 2015


Hi Sanjay,


================
Comment at: test/CodeGen/X86/vector-shuffle-256-v4.ll:830
@@ -833,5 +829,3 @@
 ; AVX1:       # BB#0:
-; AVX1-NEXT:    vmovq {{.*#+}} xmm0 = mem[0],zero
-; AVX1-NEXT:    vxorpd %ymm1, %ymm1, %ymm1
-; AVX1-NEXT:    vblendpd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3]
+; AVX1-NEXT:    vmovsd {{.*#+}} xmm0 = mem[0],zero
 ; AVX1-NEXT:    retq
----------------
So, this is what you meant when you said that we don't get the correct fp/int domain.
In X86InstrSSE.td we have patterns like this:
```
  def : Pat<(v4i64 (X86vzmovl (insert_subvector undef,
                     (v2i64 (scalar_to_vector (loadi64 addr:$src))),
                     (iPTR 0)))),
            (SUBREG_TO_REG (i32 0), (VMOVSDrm addr:$src), sub_xmm)>;
```
Do you plan to send a follow-up patch to fix the TableGen patterns so that VMOVQI2PQIrm is used instead of VMOVSDrm for the integer domain? If so, it makes sense to commit this patch first and fix the fp/int domain issue in a separate patch.
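For reference, the fix I have in mind would be roughly the following (an untested sketch that only swaps the selected instruction in the pattern quoted above; it assumes VMOVQI2PQIrm accepts the same addressing-mode operand as VMOVSDrm):
```
  // Sketch: keep the v4i64 zero-extending-load pattern, but select the
  // integer-domain vmovq load instead of the fp-domain vmovsd load.
  def : Pat<(v4i64 (X86vzmovl (insert_subvector undef,
                     (v2i64 (scalar_to_vector (loadi64 addr:$src))),
                     (iPTR 0)))),
            (SUBREG_TO_REG (i32 0), (VMOVQI2PQIrm addr:$src), sub_xmm)>;
```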

================
Comment at: test/CodeGen/X86/vector-shuffle-256-v8.ll:134-137
@@ -133,6 +133,6 @@
 ; AVX2:       # BB#0:
 ; AVX2-NEXT:    movl $7, %eax
 ; AVX2-NEXT:    vmovd %eax, %xmm1
-; AVX2-NEXT:    vpxor %ymm2, %ymm2, %ymm2
-; AVX2-NEXT:    vpblendd {{.*#+}} ymm1 = ymm1[0],ymm2[1,2,3,4,5,6,7]
+; AVX2-NEXT:    vxorps %ymm2, %ymm2, %ymm2
+; AVX2-NEXT:    vblendps {{.*#+}} ymm1 = ymm1[0],ymm2[1,2,3,4,5,6,7]
 ; AVX2-NEXT:    vpermps %ymm0, %ymm1, %ymm0
----------------
This has nothing to do with your patch; however, I am surprised that we get this long sequence of instructions on AVX2 instead of just a 'vmovaps' plus a 'vpermps'.
Here, %ymm1 is used to store the 'vpermps' permute mask. That mask is known at compile time (it is the constant vector <7,0,0,0,0,0,0,0>), so we could load it from the constant pool instead of materializing it at runtime. I think we could replace this entire sequence with a constant-pool load followed by a 'vpermps', along the lines of the sketch below.
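Concretely, something like this (hand-written expected output, not compiler-generated; the constant-pool label is illustrative):
```
  vmovaps .LCPI0_0(%rip), %ymm1   # ymm1 = [7,0,0,0,0,0,0,0], mask from constant pool
  vpermps %ymm0, %ymm1, %ymm0
  retq
```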

================
Comment at: test/CodeGen/X86/vector-shuffle-256-v8.ll:963-967
@@ -962,7 +962,7 @@
 ; AVX2:       # BB#0:
 ; AVX2-NEXT:    movl $7, %eax
 ; AVX2-NEXT:    vmovd %eax, %xmm1
-; AVX2-NEXT:    vpxor %ymm2, %ymm2, %ymm2
-; AVX2-NEXT:    vpblendd {{.*#+}} ymm1 = ymm1[0],ymm2[1,2,3,4,5,6,7]
+; AVX2-NEXT:    vxorps %ymm2, %ymm2, %ymm2
+; AVX2-NEXT:    vblendps {{.*#+}} ymm1 = ymm1[0],ymm2[1,2,3,4,5,6,7]
 ; AVX2-NEXT:    vpermd %ymm0, %ymm1, %ymm0
 ; AVX2-NEXT:    retq
----------------
Same here.

http://reviews.llvm.org/D8341
