[PATCH] D32352: Go to eleven

Wed May 3 06:57:12 PDT 2017

RKSimon added inline comments.

================
Comment at: test/CodeGen/X86/mul-constant-i16.ll:263
+; X64-NEXT:    addl %edi, %eax
 ; X64-NEXT:    # kill: %AX<def> %AX<kill> %EAX<kill>
 ; X64-NEXT:    retq
----------------
lsaba wrote:
> RKSimon wrote:
> > lsaba wrote:
> > > need to be careful when replacing one mul instruction with 2 lea instructions + other instructions, in the general case this should be faster, but if the RA chooses RBP/R13/EBP as the base register of the lea instruction  (instead of rdi in this case), the lea will be a slow lea so this sequence: 
> > > leal (%r13,%r13,2), %eax
> > > leal (%r13,%rax,4),%eax 
> > > addl %r13d, %eax
> > > 
> > > will be slower than the mul instruction. 
> > > 
> > > patch https://reviews.llvm.org/D32277 will attempt to replace this sequence with 
> > > leal (,%r13,2), %eax
> > > add %r13,%eax
> > > leal (,%rax,4),%eax 
> > > add %r13,%eax
> > > addl %r13d, %eax
> > > 
> > > which will cost ~the same as the mul instruction but with increase in code size.
> > So do you want to wait until D32277 has landed and only replace the 2*lea+other cases if !Subtarget.Slow3OpsLEA() ?
> It might be too restrictive to do this only when  !Subtarget.Slow3OpsLEA() since it's only for the specific case where the Register Allocator chooses the bad registers. 
> It is better to check the overall performance changes after https://reviews.llvm.org/D32277 before deciding. Maybe we can restrict the RA's choices somehow?
> 
> 
Not as a DAG combine, and even if/when this gets moved to the MC patterns it wouldn't work - the scheduler models can't take into account the register used AFAICT.

I don't think we have any combine that can interact with RA/post-RA in the way that we'd need.

https://reviews.llvm.org/D32352