[llvm-commits] New "BigBlock" local register allocator
duraid at kinoko.c.u-tokyo.ac.jp
Thu Jun 21 08:20:46 PDT 2007
Attached is a new local register allocator tuned for big basic
blocks. It is fast and doesn't use much memory, but it is *slightly* slower
and does use more memory than the existing local allocator. Having said
that, it quickly gets close-to-optimal allocations on very large basic
blocks, and it doesn't give *terrible* results on complex functions, so it
might make sense to use this allocator by default in a JIT context. I
haven't yet been able to narrow down a case where the existing local
allocator produces better code than this one.
Anyway, to play with it, just drop it into lib/CodeGen, but you'll
also need to declare it in:
and for testing, you might want to edit lib/CodeGen/Passes.cpp to make it
the default allocator. I've built llvm-gcc in this way, and the results are
pretty reasonable. The allocator hasn't choked on anything so far.
As you'll quickly notice, the file is basically a copy of
RegAllocLocal.cpp. The only bits changed are the bits implementing the new
algorithm, which is very simple. It's just:
"At every instruction, if you have to spill a register, greedily spill the
one whose value isn't going to be read again for the longest amount of
time."
To do this, two passes over each basic block are performed. The
first pass builds a table which lists the times that each virtual register
is read. The second pass does the allocation proper, spilling registers
according to the "won't be needed longest" rule.
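For readers who want to see the two passes in isolation, here is a minimal, self-contained sketch of the rule described above. It is not the code from the attachment: the Insn struct, buildReadTimes(), nextRead() and allocate() are hypothetical names made up for illustration, instructions are reduced to "reads some vregs, maybe defines one", and physical registers are just a count. It also assumes there are more registers than operands in any single instruction. The loop index plays the role of the current time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <map>
#include <set>
#include <stdexcept>
#include <vector>

// Hypothetical, simplified instruction: the virtual registers it reads and
// the one it defines (-1 means "defines nothing"). Vreg ids are >= 0.
struct Insn {
    std::vector<int> reads;
    int def;
};

using ReadTimes = std::map<int, std::vector<std::size_t>>;
const std::size_t kNever = std::numeric_limits<std::size_t>::max();

// Pass 1: for each virtual register, list the times (instruction indices)
// at which it is read, in increasing order.
ReadTimes buildReadTimes(const std::vector<Insn>& block) {
    ReadTimes times;
    for (std::size_t t = 0; t < block.size(); ++t)
        for (int v : block[t].reads)
            times[v].push_back(t);
    return times;
}

// Time of the first read of `vreg` strictly after `now`, or kNever.
std::size_t nextRead(const ReadTimes& times, int vreg, std::size_t now) {
    auto it = times.find(vreg);
    if (it == times.end()) return kNever;
    auto pos = std::upper_bound(it->second.begin(), it->second.end(), now);
    return pos == it->second.end() ? kNever : *pos;
}

// Pass 2: walk the block with `numRegs` registers. When a register is
// needed and none is free, evict the resident vreg whose next read is
// furthest away. Returns the evicted vregs that are read again later,
// i.e. the ones that genuinely had to be spilled, in eviction order.
std::vector<int> allocate(const std::vector<Insn>& block, std::size_t numRegs) {
    const ReadTimes times = buildReadTimes(block);
    std::set<int> live;        // vregs currently held in registers
    std::vector<int> spilled;

    auto ensure = [&](int v, std::size_t now, const std::set<int>& pinned) {
        if (live.count(v)) return;
        if (live.size() >= numRegs) {
            int victim = -1;
            std::size_t furthest = 0;
            for (int c : live) {
                if (pinned.count(c)) continue;  // operand of this instruction
                std::size_t n = nextRead(times, c, now);
                if (victim == -1 || n > furthest) { victim = c; furthest = n; }
            }
            if (victim == -1)
                throw std::invalid_argument("too few registers for one instruction");
            live.erase(victim);
            if (furthest != kNever) spilled.push_back(victim);  // dead values are free
        }
        live.insert(v);
    };

    for (std::size_t t = 0; t < block.size(); ++t) {
        const std::set<int> operands(block[t].reads.begin(), block[t].reads.end());
        for (int v : block[t].reads) ensure(v, t, operands);  // reload if needed
        if (block[t].def >= 0) ensure(block[t].def, t, {});   // operands already read
    }
    return spilled;
}
```

As a usage example: with two registers and a block that defines v1, v2 and v3, then reads v2 and v3 together, then reads v1, the definition of v3 forces a spill. v1 is the one chosen, because its next read comes later than v2's.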
The only thing I am not happy about is the "InsnTimes" map, which
just maps all (unallocated) instructions in the basic-block to the integers
0,1,2... Surely there is some way I can do without this map? (I guess I just
need to thread a "currentTime" value through to chooseReg() and
reloadVirtReg()?) I tried this but messed it up; I probably just need more
coffee. ;) Seriously though, if anyone can find any problems or suggest any
improvements, I'd be really happy to hear them.
Well, the only other question I have is: can I check this in as a
separate allocator, or should we "upgrade" the existing local allocator to
this one? This allocator will never be as fast as, or use as little memory as,
the "simple" local allocator, and for those reasons alone I'm thinking we
may as well keep the existing local allocator. However, compared to
linearscan, or almost any other imaginable "heavy duty" allocator, any
difference in the efficiency is probably going to be imperceptible. (On my
own JIT workload, BigBlock is >30% slower than Local, but even so, it is
still <1% of the total codegen time.)
OK, enough rambling: please take a look - flames welcome!!
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 34156 bytes
Desc: not available