[PATCH] try to lowerVectorShuffleAsElementInsertion() for all 256-bit vector sub-types [X86, AVX]
Andrea Di Biagio
Andrea_DiBiagio at sn.scee.net
Mon Mar 30 10:41:27 PDT 2015
================
Comment at: test/CodeGen/X86/vector-shuffle-256-v8.ll:134-137
@@ -133,6 +133,6 @@
; AVX2: # BB#0:
; AVX2-NEXT: movl $7, %eax
; AVX2-NEXT: vmovd %eax, %xmm1
-; AVX2-NEXT: vpxor %ymm2, %ymm2, %ymm2
-; AVX2-NEXT: vpblendd {{.*#+}} ymm1 = ymm1[0],ymm2[1,2,3,4,5,6,7]
+; AVX2-NEXT: vxorps %ymm2, %ymm2, %ymm2
+; AVX2-NEXT: vblendps {{.*#+}} ymm1 = ymm1[0],ymm2[1,2,3,4,5,6,7]
; AVX2-NEXT: vpermps %ymm0, %ymm1, %ymm0
----------------
spatel wrote:
> andreadb wrote:
> > This has nothing to do with your patch, however, I am surprised that we get this long sequence of instructions on AVX2 instead of just a single 'vmovaps' plus 'vpermd'.
> > Here, %ymm1 is used to store the 'vpermd' permute mask. That mask is basically known at compile time (it is vector <7,0,0,0,0,0,0,0>) so, we could just have a load from constant pool instead of computing the mask at runtime. I think we could replace this entire sequence with a load from constant pool followed by a 'vpermd'.
> Interesting - it's not entirely unrelated because the permute mask itself could be viewed as a zero-extended vector, right? I've filed this as:
> https://llvm.org/bugs/show_bug.cgi?id=23073
Right,
```
movl $7, %eax
vmovd %eax, %xmm1
vxorps %ymm2, %ymm2, %ymm2
vblendps {{.*#+}} ymm1 = ymm1[0],ymm2[1,2,3,4,5,6,7]
```
is basically equivalent to:
```
movl $7, %eax
vmovd %eax, %xmm1
```
Bits [VLMAX-1:32] would be implicitly zeroed.
http://reviews.llvm.org/D8341
EMAIL PREFERENCES
http://reviews.llvm.org/settings/panel/emailpreferences/
More information about the llvm-commits
mailing list