[llvm-commits] [LLVMdev] BigBlock register allocator

Roman Levenstein romixlev at yahoo.com
Sun Jun 24 14:18:06 PDT 2007


 Hi Duraid,
 
Here is the promised patch. It makes the BigBlock regalloc faster (almost
as fast as Local), removes some unused code derived from the Local
regalloc and also fixes one or two bugs. More precisely:
 
  1) The InsnTimes map was removed completely.
 
  2) For each vector of VReg read occurrences in VRegReadTable, a current
index is maintained. This allows inspecting only those occurrences that
are in the future and avoids looking at old references.
 
  3) PhysRegsUseOrder is not used any more
 
  4) There was a bug in your code when an instruction was changed due
to memory operand folding. The new instruction always had an
incorrect time (i.e. 0), because it was not found in the InsnTimes map.
This led to wrong results in chooseReg(). Fixed.
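
To illustrate point 2, here is a minimal sketch of the per-VReg cursor idea.
It is not the code from the attached file: the struct VRegReads, the
function nextReadAfter and the constant NoRead are hypothetical names, and
it assumes that read times are stored in increasing order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one virtual register's entry in VRegReadTable:
// the instruction times at which the register is read, in increasing order,
// plus a cursor pointing at the first read that may still be in the future.
struct VRegReads {
  std::vector<unsigned> Times; // sorted read times (assumption)
  std::size_t Next = 0;        // index of the first not-yet-passed read
};

// Sentinel meaning "this register is never read again" (assumed name).
constexpr unsigned NoRead = ~0u;

// Returns the time of the first read strictly after CurrTime, or NoRead.
// The cursor only moves forward, so reads that are already in the past are
// skipped once and never inspected again on later calls.
unsigned nextReadAfter(VRegReads &R, unsigned CurrTime) {
  while (R.Next < R.Times.size() && R.Times[R.Next] <= CurrTime)
    ++R.Next;
  return R.Next < R.Times.size() ? R.Times[R.Next] : NoRead;
}
```

Since the cursor never moves backwards, the total work per virtual register
over a whole basic block is linear in its number of reads, as long as the
queries are made with non-decreasing times.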
 
  Duraid, I am sending you the whole file, since there are quite a few
changes.  Please review it and commit if you think it is OK.
 
  I'm looking forward to getting some examples of very big BBs from you
for testing.
 
  -Roman
 
> --- Roman Levenstein <romixlev at yahoo.com> wrote:
> 
> > Hi Duraid,
> > 
> > > Hi everyone,
> > > 
> > > Quick summary:
> > > 
> > > 	LLVM now has a new register allocator particularly suitable for 
> > > compiling (very) large, machine-generated functions.
> > 
> > Congrats! Very good job!
> >  
> > > Longer story:
> > > 
> > > 	I've recently been using LLVM in an application that involves
> > > JITing fairly large functions that have no control flow - they're
> > > just flat sequences of instructions, anywhere from 100 to 10000+
> > > in size. (The control flow is all in the host program, which works
> > > out which monster function to call and when.)
> >  
> > > The default (linearscan) register allocator wasn't doing a good
> > > job, as it doesn't (yet) have live range splitting. It would
> > > quickly use all available registers and then get stuck, using only
> > > a single register (and the stack, a lot!) for hundreds or thousands
> > > of instructions at a time, greatly slowing (+bloating) the code.
> > 
> > True. I'm working on a version of linear scan based on Wimmer's
> > thesis. It supports live range splitting. I'd like to compare it
> > with yours. Do you have any good examples of those fairly large
> > functions that are just flat sequences of instructions, anywhere
> > from 100 to 10000+ in size? It would be nice if you could send me
> > those test cases (as C or ll files). I could then use them as a
> > basis for a comparison and report the results.
> >  
> > > 	The good news is the new "BigBlock" allocator turns out to
> > > produce even better code than the local allocator when blocks are
> > > very large. We're talking a +10~20% speed boost on average. (If
> > > your basic blocks are small, or there's not much register
> > > pressure, you'll actually get the same code out of both local and
> > > BigBlock.)
> > 
> > Do you have numbers comparing it to the current version of LLVM's
> > linear scan? The win of your allocator over linear scan should be
> > even bigger, I guess.
> > 
> >  
> > > 	While BigBlock isn't (and never will be) as fast as the local
> > > allocator, it's not much slower, doesn't use much memory, and is
> > > certainly faster than linearscan. So if you're compiling very
> > > large, (probably) machine-generated blocks of straight-line code,
> > > give the Local and BigBlock allocators a try, especially if you're
> > > JITing things and compile time is important.
> > 
> > I looked at your code, and I see some things that could be
> > significantly sped up, e.g.
> >  - InsnTimes handling. I have the feeling this map can be
> > eliminated completely.
> >  - Use of the VRegReadTable. The vector of read occurrences can be
> > shortened every time you have processed the corresponding
> > instruction. This makes it shorter and makes searches inside this
> > vector faster, thus making chooseReg much faster. Probably some
> > other optimizations can also be applied to the chooseReg function.
> >  - PhysRegsUseOrder - you remove some elements from the middle of
> > this vector in removePhysReg. This is not a very efficient
> > operation on vectors, since it needs to copy the tail of the
> > vector. I think using a list data structure could be much more
> > efficient for this purpose.
> > 
> > I think these changes may significantly improve the performance of
> > your BigBlock register allocator. I'll try to come up with some
> > more concrete proposals or even patches over the weekend or next
> > week.
> > 
> > -Roman


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: RegAllocBigBlock.cpp
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20070624/c3542125/attachment.ksh>

