[PATCH] D62498: [x86] split 256-bit store of concatenated vectors
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon May 27 15:05:06 PDT 2019
spatel marked 2 inline comments as done.
spatel added inline comments.
================
Comment at: llvm/test/CodeGen/X86/oddsubvector.ll:119-126
+; AVX-NEXT: vmovaps (%rdi), %xmm0
+; AVX-NEXT: vmovaps 16(%rdi), %xmm1
+; AVX-NEXT: vmovaps 32(%rdi), %xmm2
+; AVX-NEXT: vmovaps 48(%rdi), %xmm3
+; AVX-NEXT: vmovaps %xmm2, 16(%rsi)
+; AVX-NEXT: vmovaps %xmm3, (%rsi)
+; AVX-NEXT: vmovaps %xmm0, 48(%rsi)
----------------
This seems like a failure of load combining? Even so, the split code has fewer uops than before, although the instruction count increased.
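For reference, the memory-level equivalence behind the split can be sketched in portable C. This is only a model under stated assumptions: the function names are hypothetical, `memcpy` stands in for the vector moves, and it says nothing about uop counts. It only shows that one 32-byte store of a concatenation writes the same bytes as two 16-byte stores at offsets 0 and 16.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: build the 256-bit value, then do one 32-byte store. */
void store_concat_wide(uint8_t *dst, const uint8_t lo[16], const uint8_t hi[16]) {
    uint8_t wide[32];
    memcpy(wide, lo, 16);       /* low half of the concatenation */
    memcpy(wide + 16, hi, 16);  /* high half of the concatenation */
    memcpy(dst, wide, 32);      /* single 256-bit store */
}

/* Hypothetical model: the split form, two independent 16-byte stores. */
void store_concat_split(uint8_t *dst, const uint8_t lo[16], const uint8_t hi[16]) {
    memcpy(dst, lo, 16);        /* 128-bit store at offset 0 */
    memcpy(dst + 16, hi, 16);   /* 128-bit store at offset 16 */
}

/* Returns 1 when both forms leave identical bytes in memory. */
int stores_match(void) {
    uint8_t lo[16], hi[16], a[32] = {0}, b[32] = {0};
    for (int i = 0; i < 16; i++) {
        lo[i] = (uint8_t)i;
        hi[i] = (uint8_t)(0xF0 + i);
    }
    store_concat_wide(a, lo, hi);
    store_concat_split(b, lo, hi);
    return memcmp(a, b, 32) == 0;
}
```

The split form never needs the intermediate 256-bit value, which is the part the wide form has to build before it can store.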
================
Comment at: llvm/test/CodeGen/X86/vector-gep.ll:211
; CHECK-NEXT: retl $4
%A = getelementptr i16, i16* %param, <64 x i32> %off
ret <64 x i16*> %A
----------------
We're obviously spilling here, but I'm not sure what is happening underneath, or whether this test matters for perf rather than just for correctness (not crashing).
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D62498/new/
https://reviews.llvm.org/D62498