[PATCH] D32352: Go to eleven

Wed May 3 06:02:43 PDT 2017

lsaba added inline comments.

================
Comment at: test/CodeGen/X86/mul-constant-i16.ll:263
+; X64-NEXT:    addl %edi, %eax
 ; X64-NEXT:    # kill: %AX<def> %AX<kill> %EAX<kill>
 ; X64-NEXT:    retq
----------------
RKSimon wrote:
> lsaba wrote:
> > need to be careful when replacing one mul instruction with 2 lea instructions + other instructions, in the general case this should be faster, but if the RA chooses RBP/R13/EBP as the base register of the lea instruction  (instead of rdi in this case), the lea will be a slow lea so this sequence: 
> > leal (%r13,%r13,2), %eax
> > leal (%r13,%rax,4),%eax 
> > addl %r13d, %eax
> > 
> > will be slower than the mul instruction. 
> > 
> > patch https://reviews.llvm.org/D32277 will attempt to replace this sequence with 
> > leal (,%r13,2), %eax
> > add %r13,%eax
> > leal (,%rax,4),%eax 
> > add %r13,%eax
> > addl %r13d, %eax
> > 
> > which will cost ~the same as the mul instruction but with increase in code size.
> So do you want to wait until D32277 has landed and only replace the 2*lea+other cases if !Subtarget.Slow3OpsLEA() ?
It might be too restrictive to do this only when  !Subtarget.Slow3OpsLEA() since it's only for the specific case where the Register Allocator chooses the bad registers. 
It is better to check the overall performance changes after https://reviews.llvm.org/D32277 before deciding. Maybe we can restrict the RA's choices somehow?

https://reviews.llvm.org/D32352