[LLVMdev] allocating registers less "sparingly"
Evan Cheng
evan.cheng at apple.com
Mon Nov 5 09:21:55 PST 2007
On Nov 5, 2007, at 2:55 AM, Pekka Jääskeläinen wrote:
> Hello LLVM people,
>
> Our customizable TTA target [1] can have plenty of registers and
> register file ports to improve instruction-level parallelism and
> reduce spills. It is entirely up to the designer of a particular TTA
> processor how many registers and register file resources the
> processor has, along with other TTA components.
>
> We have ported LLVM 2.1 to produce an intermediate TTA program format
> with registers allocated (it adapts to the architecture's register
> files) but without operations bound to the target's function units or
> scheduled into instructions. The operation binding and the final
> scheduling are done as a post pass in our TTA framework called TCE [2]
> (not yet released to the public).
>
> The problem is that the register allocators in LLVM seem to reuse
> registers too much, that is, allocate them "too sparingly". This leads
> to false dependencies which unnecessarily restrict scheduling freedom
> in our post-pass instruction scheduler. That is, even if the target
> has 64 registers, LLVM might use only, say, 13, and have multiple
> writes to the same register in the same basic block. Is there some
> easy way to tune this behavior so it reuses registers only if it
> really has to (when it runs out of new ones)?
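To make the false dependencies described above concrete, here is a
minimal sketch in Python. It is a toy model, not LLVM code: the
instruction format (a destination register plus a list of source
registers) and the function name false_dependencies are assumptions
made up for illustration. It lists every write-after-read (WAR, i.e.
anti-dependency) and write-after-write (WAW) pair in one basic block.

```python
from typing import List, Tuple

# Hypothetical instruction format: (destination register, [source registers]).
Instr = Tuple[str, List[str]]


def false_dependencies(block: List[Instr]) -> List[Tuple[int, int, str, str]]:
    """Return (earlier, later, kind, register) for every WAR/WAW pair."""
    deps = []
    for j, (dst_j, _) in enumerate(block):
        for i in range(j):
            dst_i, srcs_i = block[i]
            if dst_j == dst_i:
                # Instruction j overwrites a register that i also wrote.
                deps.append((i, j, "WAW", dst_j))
            if dst_j in srcs_i:
                # Instruction j overwrites a register that i still reads.
                deps.append((i, j, "WAR", dst_j))
    return deps


# Two independent computations; "a" and "b" are values live into the block.
block_reused = [
    ("r1", ["a"]),
    ("r2", ["r1"]),
    ("r1", ["b"]),   # reuses r1, so it must wait for instructions 0 and 1
    ("r3", ["r1"]),
]

# The same computations with a fresh register for the second one.
block_fresh = [
    ("r1", ["a"]),
    ("r2", ["r1"]),
    ("r4", ["b"]),
    ("r3", ["r4"]),
]

reused_deps = false_dependencies(block_reused)
fresh_deps = false_dependencies(block_fresh)
```

In this model the reused block reports one WAW and one WAR pair on r1,
while the block with a fresh register reports none, so a post-pass
scheduler would be free to overlap the two computations.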
What you want is to remove anti-dependencies before post-allocation
scheduling. Alas, there is nothing in the current LLVM implementation
that deals with this. There are quite a few papers on this topic,
e.g. http://citeseer.ist.psu.edu/calland97removal.html
I am also very interested in this. Do you plan to contribute your
work back to the community? :-)
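As a rough illustration of what removing anti-dependencies within a
basic block could look like, here is a small Python sketch. It is not
an existing LLVM pass and not the method of the paper linked above; the
instruction format, the function name rename_block and the free
register pool are all assumptions for illustration. A write to a
register that was already read or written earlier in the block gets a
fresh register while any remain, and later reads are redirected to it.

```python
from typing import Dict, List, Tuple

# Hypothetical instruction format: (destination register, [source registers]).
Instr = Tuple[str, List[str]]


def rename_block(block: List[Instr],
                 free_regs: List[str]) -> Tuple[List[Instr], Dict[str, str]]:
    """Rename redefinitions to unused registers while the pool lasts.

    Assumes free_regs are not referenced anywhere in the block. Returns the
    renamed block and a map from each original register to the register
    that holds its final value; values live out of the block would need a
    copy back (or the mapping propagated), which this sketch leaves out.
    """
    free = list(free_regs)
    current: Dict[str, str] = {}   # original name -> register with latest value
    seen = set()                   # original names read or written so far
    renamed: List[Instr] = []
    for dst, srcs in block:
        new_srcs = [current.get(s, s) for s in srcs]
        if dst in seen and free:
            new_dst = free.pop(0)  # fresh register: no WAR/WAW with earlier code
        else:
            new_dst = dst          # first use, or pool exhausted: reuse
        seen.update(srcs)
        seen.add(dst)
        current[dst] = new_dst
        renamed.append((new_dst, new_srcs))
    return renamed, current


block = [
    ("r1", ["a"]),
    ("r2", ["r1"]),
    ("r1", ["b"]),   # second, unrelated use of r1
    ("r3", ["r1"]),
]
renamed, final = rename_block(block, ["r4", "r5"])
unchanged, _ = rename_block(block, [])
```

With r4 and r5 free, the second write to r1 moves to r4 and the last
instruction reads r4. With an empty pool the block comes back
unchanged, which matches the "reuse only when out of new registers"
behavior asked for in the original question.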
Evan
>
> I can understand that for many architectures without static
> scheduling etc. the "allocate registers sparingly" approach is the
> best one, as it might lead to less context data to be saved in calls,
> etc., but in our case the calls are usually not the bottleneck, as our
> target programs are at the moment mainly DSP kernels and multimedia
> codecs, which are usually loop-oriented. In addition, LLVM seems to do
> an excellent job of inlining function calls.
>
> I look forward to your replies.
>
> The links:
>
> [1] http://en.wikipedia.org/wiki/Transport_triggered_architecture
> [2] http://tce.cs.tut.fi/papers/TCE_Overview.pdf
>
> Best regards,
> --
> Pekka Jääskeläinen,
> Researcher at Tampere Univ. of Technology, Finland.