[LLVMdev] New Register Allocation Algorithm
Fernando Magno Quintao Pereira
fernando at CS.UCLA.EDU
Wed Jul 18 10:57:46 PDT 2007
Dear LLVMers,
we here in the UCLA compiler's lab have an implementation of a very
cool register allocator in LLVM. It finds an optimal register assignment
for the x86 machine, via live range splitting, that is, if it is possible
to have a register assignment without register spilling, our allocator can
find it.
We have the short and long versions of our paper here:
http://compilers.cs.ucla.edu/fernando/projects/puzzles/
Our algorithm is polynomial time, and up to now, it is adding about
15% more to the total compilation time of LLVM. It does not iterate like
the current implementation of RegAllocLinearScan (LN), and does not have
to worry about conflicts between virtuals and physical registers, two
things that bring the running time of RegAllocLinearScan down. I am not an
expert in C++, but I think that with a careful implementation, it would
even be possible to match the compilation time of LN. Indeed, for the very
large benchmarks, the algorithm sometimes is faster.
We have not implemented instruction folding yet. Nothing prevent us of
implementing it, it is just a matter of time (and patience). Currently the
code that our algorithm produces is faster than LN, if I use
-disable-spill-fusing, and sometimes it is faster even when I let LN to
use instruction folding.
Of course, because the algorithm is very new, there are a lot of
things yet to improve, and we would like to hear comments from you guys.
We did not do any global coalescing, and our spilling heuristics is the
simplest possible: when having to decide among register to spill, evict
those that have he longest interval. I had implemented a pass to build the
dominator tree of MachineBasicBlocks, but my pass uses the naive quadratic
algorithm, and I have this in my list of things to improve. Also, our
algorithm only works for x86, because I hard coded the relations between
aliased registers in the target independent part of the implementation.
This is also something to improve. To implement the allocator, I had to
change the implementation of LLVM code here and there: adding a function
to insert swaps in MRegisterInfo, changing the coalescing routine in
LiveIntervalAnalysis, etc.
Anyway, if you guys could take a look into our paper, I think you
would like it. The idea is quite new, and if possible, we would like to
produce a version that could be distributed with LLVM. But in this case,
the work is huge, for the quality of my implementation is way below the
quality of the implementation of the current linear scan algorithm.
All the best,
Fernando
More information about the llvm-dev
mailing list