[PATCH] D16729: [X86][SSE] Add general 32-bit LOAD + VZEXT_MOVL support to EltsFromConsecutiveLoads

Tue Feb 2 15:07:20 PST 2016

RKSimon added inline comments.

================
Comment at: test/CodeGen/X86/merge-consecutive-loads-128.ll:335
@@ -335,4 +334,3 @@
 ; AVX:       # BB#0:
-; AVX-NEXT:    vpinsrw $0, 6(%rdi), %xmm0, %xmm0
-; AVX-NEXT:    vpinsrw $1, 8(%rdi), %xmm0, %xmm0
+; AVX-NEXT:    vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
 ; AVX-NEXT:    retq
----------------
delena wrote:
> I think that on these architectures we pay an additional cycle for switching from the INT to the FP domain. Can we use movd?
This was something I mentioned in the summary - adding domain support for MOVSS/MOVD is straightforward, but it has a knock-on effect on a lot of tests: some would need modifying to stay in their original domain, while others we'd let switch. If you think it's worthwhile, shall I start looking at this more seriously?

================
Comment at: test/CodeGen/X86/merge-consecutive-loads-256.ll:531
@@ -549,5 +530,3 @@
 ; AVX:       # BB#0:
-; AVX-NEXT:    vpinsrb $0, 4(%rdi), %xmm0, %xmm0
-; AVX-NEXT:    vpinsrb $1, 5(%rdi), %xmm0, %xmm0
-; AVX-NEXT:    vpinsrb $3, 7(%rdi), %xmm0, %xmm0
+; AVX-NEXT:    vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
 ; AVX-NEXT:    retq
----------------
delena wrote:
> This instruction (movss) reads 4 bytes from memory. Does it require 4-byte alignment?
Not unless SSE/AVX alignment checks are enabled - AFAICT LLVM assumes they aren't. We use the alignment of the base pointer, so the lowering of the consecutive loads is driven by that.
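
For illustration only, here is a rough portable C model of what the `xmm0 = mem[0],zero,zero,zero` check line describes: 4 bytes are read from an arbitrarily aligned address into the low lane, and the remaining lanes are zeroed. This is a sketch, not LLVM code; `vec128` and `load32_zext` are hypothetical names introduced here, and `memcpy` stands in for the unaligned load.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit register as four 32-bit lanes. */
typedef struct {
    uint32_t lane[4];
} vec128;

/* Models "xmm0 = mem[0],zero,zero,zero": read 4 bytes from base + offset
 * into lane 0 and leave lanes 1..3 zero. memcpy is used so the read has
 * no alignment requirement, matching an unaligned 32-bit load. */
static vec128 load32_zext(const unsigned char *base, size_t offset)
{
    vec128 v = { { 0, 0, 0, 0 } };
    assert(base != NULL);
    memcpy(&v.lane[0], base + offset, sizeof(uint32_t));
    return v;
}
```

Calling `load32_zext(p, 6)` corresponds to the first test above, where two adjacent 16-bit elements at offsets 6 and 8 are covered by a single 32-bit read that is only 2-byte aligned.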


Repository:
  rL LLVM

http://reviews.llvm.org/D16729