[LLVMdev] Optimization opportunity

Reid Spencer reid at x10sys.com
Sat Aug 28 22:20:57 PDT 2004


Jeff,

Chris isn't likely to respond to this for a while as he's on vacation. 
I'll take a look at it and will commit it if it looks good. Since code
gen isn't my specialty, could you increase my comfort level a little by
giving me some examples of the test results you got when testing your
patches? Ideally, I'd like to see some of the test/Programs/MultiSource
programs working and generating correct code with this path.

Thanks,

Reid.

On Sat, 2004-08-28 at 21:21, Jeff Cohen wrote:
> On Fri, 27 Aug 2004 02:20:34 -0500 (CDT)
> Chris Lattner <sabre at nondot.org> wrote:
> 
> > On Thu, 26 Aug 2004, Jeff Cohen wrote:
> > 
> > > Also, the store into the arrays generates two x86 machine
> > > instructions:
> > >
> > > 	lea %ECX, DWORD PTR [%ESP + 16]
> > > 	mov DWORD PTR [%ECX + <element offset>], %EAX
> > >
> > > These can be combined into a single instruction.  I'm tempted to
> > > work on this one myself :)
> > 
> > Yup, there are several things that can be improved in the X86
> > instruction selector.  If you submit a patch to implement this, we
> > would be happy to apply it.  Continuing improvements in the code
> > generator should eventually make this kind of thing fall out
> > automatically, but for now they must be implemented manually.
> > 
> > -Chris
> 
> I succumbed to temptation and made the improvement.  Diffs are attached
> for X86ISelSimple.cpp and X86InstrBuilder.h.
> 
> I determined that the reason two instructions are generated in the first
> place, instead of being folded immediately into one, is because locals
> do not have a physical offset assigned at that time.  There is a
> peephole optimization pass after the stack frame is finalized, but the
> problem with folding them there is that the physical registers have also
> been assigned, so register CX (in this case) would be wasted.
> 
> So I generalized the concept of a "full address".  There were four
> variables, base reg, scale, index reg and displacement, that were being
> passed around.  I converted them into a single struct and added a new
> type field that allows the base reg to instead be a frame index.  The
> extra lea instruction is now folded immediately, and the code for
> processing store instructions shrinks quite nicely.
> 
> This should be generalizable even further to handle global variables.  I
> didn't go that far, because it appeared there may be other work
> required.  It wasn't clear a displacement to a global was supported.
> 
> Jeff
> 
> ______________________________________________________________________
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://mail.cs.uiuc.edu/mailman/listinfo/llvmdev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20040828/0ad538e3/attachment.sig>


More information about the llvm-dev mailing list