[PATCH] D62498: [x86] split 256-bit store of concatenated vectors

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 27 15:05:06 PDT 2019


spatel marked 2 inline comments as done.
spatel added inline comments.


================
Comment at: llvm/test/CodeGen/X86/oddsubvector.ll:119-126
+; AVX-NEXT:    vmovaps (%rdi), %xmm0
+; AVX-NEXT:    vmovaps 16(%rdi), %xmm1
+; AVX-NEXT:    vmovaps 32(%rdi), %xmm2
+; AVX-NEXT:    vmovaps 48(%rdi), %xmm3
+; AVX-NEXT:    vmovaps %xmm2, 16(%rsi)
+; AVX-NEXT:    vmovaps %xmm3, (%rsi)
+; AVX-NEXT:    vmovaps %xmm0, 48(%rsi)
----------------
This seems like a failure of load combining? Even so, the split code has fewer uops than before, even though the instruction count increased.
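
For readers unfamiliar with the transform named in the patch title, here is a minimal, portable sketch of the two store shapes. It is an illustration under stated assumptions, not code from the patch: `Vec128`, `Vec256`, `storeConcat`, and `storeSplit` are hypothetical names, and `std::memcpy` stands in for the vector moves. On AVX, the concat step would typically be an insert of the high 128-bit half (for example `vinsertf128`) ahead of a single 256-bit store. The split form skips the concat and issues two 128-bit stores.

```cpp
#include <array>
#include <cstring>

// Hypothetical stand-ins for 128-bit and 256-bit vector values.
using Vec128 = std::array<float, 4>;
using Vec256 = std::array<float, 8>;

// Assumed "before" shape: concatenate the two halves into one wide value,
// then write it with a single 256-bit store.
void storeConcat(float *dst, const Vec128 &lo, const Vec128 &hi) {
  Vec256 wide;
  std::memcpy(wide.data(), lo.data(), sizeof(Vec128));      // low half
  std::memcpy(wide.data() + 4, hi.data(), sizeof(Vec128));  // high half (the concat step)
  std::memcpy(dst, wide.data(), sizeof(Vec256));            // one 32-byte store
}

// Assumed "after" shape: store each 128-bit half directly, with no concat.
void storeSplit(float *dst, const Vec128 &lo, const Vec128 &hi) {
  std::memcpy(dst, lo.data(), sizeof(Vec128));      // bytes 0..15
  std::memcpy(dst + 4, hi.data(), sizeof(Vec128));  // bytes 16..31
}
```

Both functions leave the same 32 bytes in memory. The sketch shows only that the results are equivalent and says nothing about actual uop counts, which depend on the target.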


================
Comment at: llvm/test/CodeGen/X86/vector-gep.ll:211
 ; CHECK-NEXT:    retl $4
   %A = getelementptr i16, i16* %param, <64 x i32> %off
   ret <64 x i16*> %A
----------------
We're obviously spilling here, but I'm not sure what is happening underneath, or whether this test matters for perf rather than just for correctness/crashing.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62498/new/

https://reviews.llvm.org/D62498




