[LLVMdev] allocating registers less "sparingly"

Mon Nov 5 02:55:22 PST 2007

Hello LLVM people,

Our customizable TTA target [1] is capable of having plenty of registers
and register file ports to improve instruction level parallelism and
reduce spills. It is entirely up to the designer of a particular TTA
processor how many registers and register file resources the processor has,
along with the other TTA components.

We have ported LLVM 2.1 to produce an intermediate TTA program format with
registers allocated (it adapts to the architecture's register files) but without
operations bound to the target's function units or scheduled into instructions.
Operation binding and final scheduling are done as a post pass in
our TTA framework called TCE [2] (not yet released to the public).

The problem is that the register allocators in LLVM seem to reuse registers
too much, that is, allocate them "too sparingly". This leads to false
dependencies which unnecessarily restrict scheduling freedom in our post pass
instruction scheduler. For example, even if the target has 64 registers, LLVM
might use only, say, 13 of them and write to the same register several times
within one basic block. Is there an easy way to tune this behavior so that it
reuses a register only when it really has to (when it runs out of unused ones)?
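
To make the difference concrete, here is a toy model of the two behaviors. It
is only a sketch and not LLVM code: the function names (`allocate`,
`count_reuses`), the policy names, and the live-range representation are all
invented for illustration. Each value is a (definition point, last use point)
pair within one basic block. "first_free" always picks the lowest-numbered free
register, which is roughly the sparing behavior described above, while
"round_robin" picks the register that has been free the longest.

```python
from collections import deque


def allocate(live_ranges, num_regs, policy):
    """Assign one register per live range (toy model, no spilling).

    live_ranges: list of (def_point, last_use_point), sorted by def_point.
    policy: "first_free" or "round_robin" (hypothetical names).
    Returns the list of register numbers, one per live range.
    """
    free = deque(range(num_regs))  # free registers, oldest-freed first
    active = []                    # (last_use_point, reg) of live values
    assignment = []
    for start, end in live_ranges:
        # Release registers whose value died before this definition.
        still_active = []
        for last_use, reg in active:
            if last_use < start:
                free.append(reg)
            else:
                still_active.append((last_use, reg))
        active = still_active
        if not free:
            raise RuntimeError("out of registers; a real allocator would spill")
        if policy == "first_free":
            reg = min(free)        # reuse the lowest-numbered free register
            free.remove(reg)
        else:
            reg = free.popleft()   # take the register that was freed longest ago
        assignment.append(reg)
        active.append((end, reg))
    return assignment


def count_reuses(assignment):
    """Count writes to a register already written earlier in the block."""
    return len(assignment) - len(set(assignment))


# Four short, non-overlapping live ranges on an 8-register machine.
ranges = [(0, 1), (2, 3), (4, 5), (6, 7)]
sparing = allocate(ranges, 8, "first_free")
spread = allocate(ranges, 8, "round_robin")
```

In this model the sparing policy puts all four values in the same register, so
every later write depends falsely on the earlier reads and writes, while the
round-robin policy gives each value its own register and leaves the post-pass
scheduler free to reorder them.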

I can understand that for many architectures, such as those without static
scheduling, allocating registers sparingly is the best approach, as it can
reduce the context data that must be saved across calls. In our case, however,
calls are not usually the bottleneck: our target programs are at the moment
mainly DSP kernels and multimedia codecs, which are usually loop-oriented. In
addition, LLVM seems to do an excellent job of inlining function calls.

I look forward to your replies.

The links:

[1] http://en.wikipedia.org/wiki/Transport_triggered_architecture
[2] http://tce.cs.tut.fi/papers/TCE_Overview.pdf

Best regards,
-- 
Pekka Jääskeläinen,
Researcher at Tampere Univ. of Technology, Finland.