[PATCH] D38128: Handle COPYs of physregs better (regalloc hints)

Nirav Dave via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 16 19:53:44 PDT 2018


niravd added a comment.

This looks like a nice improvement modulo a few issues:

- In a number of places (notably for register arguments to shift instructions) we now generate a movq where a movl would be shorter and equivalent. I don't think there's any real performance difference, but we should certainly prefer the movl when compiling for size (see the sketch after this list).
- There's some unnecessary shuffling of register names in SSE4 tests.
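
A quick sketch of the size difference, using the copy from schedule-x86-64-shld.ll below (byte encodings are the standard ModRM forms):

    movq %rdx, %rcx    # 48 89 d1 -- 3 bytes (needs a REX.W prefix)
    movl %edx, %ecx    # 89 d1    -- 2 bytes; the 32-bit write zero-extends into %rcx

Since the shift only reads %cl, either copy works, and the movl form saves a byte per copy.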

Someone more familiar with Debug should double-check those tests, though they look fine to me.



================
Comment at: test/CodeGen/X86/fast-isel-shift.ll:22
 ; CHECK-NEXT:    movl %edi, %eax
+; CHECK-NEXT:    ## kill: def $cx killed $cx killed $ecx
+; CHECK-NEXT:    ## kill: def $cl killed $cx
----------------
Why are we getting two kill comments about cx here? 


================
Comment at: test/CodeGen/X86/schedule-x86-64-shld.ll:129
 ; GENERIC:       # %bb.0: # %entry
-; GENERIC-NEXT:    movl %edx, %ecx # sched: [1:0.33]
-; GENERIC-NEXT:    shldq %cl, %rsi, %rdi # sched: [4:1.50]
+; GENERIC-NEXT:    movq %rdx, %rcx # sched: [1:0.33]
 ; GENERIC-NEXT:    movq %rdi, %rax # sched: [1:0.33]
----------------
This should be movl given optsize.


================
Comment at: test/CodeGen/X86/sret-implicit.ll:13
 ; X64-LABEL: sret_void
-; X64-DAG: movl $0, (%rdi)
+; X64-DAG: movl $0, (%rax)
 ; X64-DAG: movq %rdi, %rax
----------------
These shouldn't be DAG matches anymore; the second line should always come first.
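Something like this, perhaps (an untested sketch of the ordered checks):

  ; X64-LABEL: sret_void
  ; X64: movq %rdi, %rax
  ; X64: movl $0, (%rax)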


================
Comment at: test/CodeGen/X86/sret-implicit.ll:27
 ; X64-LABEL: sret_demoted
-; X64-DAG: movq $0, (%rdi)
+; X64-DAG: movq $0, (%rax)
 ; X64-DAG: movq %rdi, %rax
----------------
Same as above


================
Comment at: test/CodeGen/X86/vector-shift-ashr-128.ll:270
 ; SSE41:       # %bb.0:
-; SSE41-NEXT:    movdqa %xmm0, %xmm2
-; SSE41-NEXT:    movdqa %xmm1, %xmm0
+; SSE41-NEXT:    movdqa %xmm1, %xmm2
+; SSE41-NEXT:    movdqa %xmm0, %xmm1
----------------
All of the SSE4 changes regarding shifts seem to generate unnecessary register shuffling.


================
Comment at: test/CodeGen/X86/vectorcall.ll:26
+; X86: movl %ecx, %eax
+; X64: movq %rcx, %rax
 
----------------
Another case of movq vs. movl.


================
Comment at: test/CodeGen/X86/vectorcall.ll:152
 ; CHECK-LABEL: test_mixed_5
-; CHECK:       movaps	%xmm5, 16(%{{(e|r)}}sp)
-; CHECK:       movaps	%xmm5, %xmm0
+; CHECK-DAG:   movaps	%xmm{{[0,5]}}, 16(%{{(e|r)}}sp)
+; CHECK-DAG:   movaps	%xmm5, %xmm0
----------------
This is probably fine, but confusing. 

Can you split this into separate X86 and X64 checks? In fact, this file should probably be autogenerated with utils/update_llc_test_checks.py.
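E.g. something like this (a sketch, assuming a built llc on PATH, or pointed at with --llc-binary):

  utils/update_llc_test_checks.py test/CodeGen/X86/vectorcall.ll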




https://reviews.llvm.org/D38128




