[LLVMdev] TargetRegisterInfo and "infinite" register files

Justin Holewinski justin.holewinski at gmail.com
Mon May 16 11:30:48 PDT 2011


On Mon, May 16, 2011 at 12:40 PM, Villmow, Micah <Micah.Villmow at amd.com> wrote:

> Justin,
>
> We have the same issue with the AMDIL code generator. We tried #1, but
> there are passes after the register allocator that don’t like virtual
> registers. #3 could be done by having the two spill functions
> [load|store]Reg[From|To]StackSlot keep track of the FrameIndex-to-register
> mapping internally, but again, that is more of a hack than a proper solution.
>

After reading Jakob's comments, I think (3) may end up being the best in the
long term.  I'll definitely post any results to the list!
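
To make the FrameIndex-to-register idea behind (3) concrete, here is a rough
Python sketch. It is only a model of the bookkeeping, not LLVM code: the class
name, the "%spill" prefix, and the fixed .u32 type are all assumptions made for
illustration, and the two methods merely stand in for the store/load spill
hooks Micah mentions.

```python
class SpillRegisterMap:
    """Hypothetical model: each frame index gets its own extra PTX register."""

    def __init__(self, prefix="%spill"):
        self.prefix = prefix          # assumed naming scheme for extra registers
        self._by_frame_index = {}     # frame index -> register name

    def register_for(self, frame_index):
        # Create a fresh register the first time a frame index is seen.
        if frame_index not in self._by_frame_index:
            name = f"{self.prefix}{len(self._by_frame_index)}"
            self._by_frame_index[frame_index] = name
        return self._by_frame_index[frame_index]

    def store_to_slot(self, src_reg, frame_index):
        # A "spill" becomes a plain register copy (type fixed to u32 here).
        return f"mov.u32 {self.register_for(frame_index)}, {src_reg};"

    def load_from_slot(self, dst_reg, frame_index):
        # A "restore" is the copy in the opposite direction.
        return f"mov.u32 {dst_reg}, {self.register_for(frame_index)};"

    def declarations(self):
        # Register declarations to emit at the top of the PTX function.
        return [f".reg .u32 {name};" for name in self._by_frame_index.values()]
```

With this model, spilling %r3 to frame index 7 and later restoring it into %r5
yields two mov instructions through the same extra register, and
declarations() lists every register that was created along the way.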


>
>
> My solution was to create a very large register file (768 registers) that
> no sane kernel would ever exhaust, and then do register allocation within
> it. All that is needed is a simple script, run at compile time, that
> generates the tables into a separate .td file, which is then included where
> necessary, so it doesn’t bloat the code.
>

That is essentially what happens now; the only difference is that the register
description file is generated at development time rather than at compile time.
I just feel there should be a more "scalable" approach.
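
For reference, a generator script of the kind described could look roughly
like the Python sketch below. The function name, the register and class names
(R0..R767, RRegs32), and the "PTX" namespace are assumptions for illustration,
and the RegisterClass line in particular should be treated as a placeholder,
since its exact TableGen syntax depends on the LLVM version in use.

```python
def generate_register_td(count, namespace="PTX", prefix="R"):
    """Return TableGen text defining `count` registers and one register class."""
    if count < 1:
        raise ValueError("count must be at least 1")
    lines = [f"// Auto-generated ({count} registers); do not edit by hand."]
    for i in range(count):
        asm_name = f"{prefix.lower()}{i}"   # assembly name, e.g. "r0"
        lines.append(
            f'def {prefix}{i} : Register<"{asm_name}"> '
            f'{{ let Namespace = "{namespace}"; }}'
        )
    # Illustrative only: RegisterClass syntax differs between LLVM versions.
    lines.append(
        f'def {prefix}Regs32 : RegisterClass<"{namespace}", [i32], 32, '
        f'(sequence "{prefix}%u", 0, {count - 1})>;'
    )
    return "\n".join(lines) + "\n"
```

Writing the result of generate_register_td(768) to a separate .td file and
including it from the main register description gives the 768-register setup,
whether the script runs at development time or as a build step.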


>
>
> Micah
>
> *From:* llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] *On
> Behalf Of *Justin Holewinski
> *Sent:* Monday, May 16, 2011 6:52 AM
> *To:* LLVM Developers Mailing List
> *Subject:* [LLVMdev] TargetRegisterInfo and "infinite" register files
>
>
>
> Currently, the TableGen register info files for all of the back-ends define
> concrete registers and divide them into logical register classes.  I would
> like to get some input from the LLVM experts around here on how best to map
> this model to an architecture that does *not* have a concrete, pre-defined
> register file.  The architecture is PTX, which is more of an intermediate
> form than a final assembly language.  The format is essentially
> three-address code, with "virtual" registers instead of "physical"
> registers.  After PTX code generation, the PTX assembly is compiled to a
> device binary with a proprietary tool (ptxas) that does final register
> allocation (based on device and user constraints).  However, exploiting
> register re-use at the LLVM/PTX level has shown performance improvement over
> blindly using a new "physical" register for each def and letting ptxas
> figure out all of the register allocation details, so I would like to take
> advantage of the LLVM register allocation infrastructure if at all possible.
>
>
>
> Generally stated, I would like to solve the register allocation problem as
> "allocate the minimum number of registers from an arbitrary set without
> spill code" instead of the more traditional "allocate the minimum number of
> registers from a fixed set."
>
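
As a small illustration of that formulation, the Python sketch below assigns
registers greedily and grows the register file whenever no register is free,
so it never spills. It assumes each value has a single half-open live interval
[start, end), which is a simplification (real live ranges can have holes), and
it is not how LLVM's allocators work; the function name is hypothetical. Under
that assumption, the number of registers used equals the maximum number of
simultaneously live values.

```python
import heapq


def allocate_without_spills(intervals):
    """Assign registers to live intervals, extending the file instead of spilling.

    intervals: dict mapping a value name to (start, end), live on [start, end).
    Returns (assignment, register_count).
    """
    assignment = {}
    free = []      # min-heap of register numbers that are free again
    active = []    # min-heap of (end, register) for currently live values
    next_reg = 0   # size of the register file so far

    # Visit values in order of increasing start point.
    for name, (start, end) in sorted(intervals.items(),
                                     key=lambda kv: (kv[1], kv[0])):
        # Release registers whose values are dead before this one starts.
        while active and active[0][0] <= start:
            _, reg = heapq.heappop(active)
            heapq.heappush(free, reg)
        if free:
            reg = heapq.heappop(free)   # reuse an existing register
        else:
            reg = next_reg              # no spill: extend the register file
            next_reg += 1
        assignment[name] = reg
        heapq.heappush(active, (end, reg))
    return assignment, next_reg
```

For example, the intervals a=[0,4), b=[1,3), c=[3,6), d=[4,7) fit into two
registers, because at most two of them are live at any point.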
>
>
> The current implementation defines an arbitrary set of registers that the
> register allocator can use during code-gen.  This works, but is not
> scalable.  If the register allocator runs out of registers, spill code must
> be generated.  However, the "optimal" solution in this case would be to
> extend the register file.  A few alternatives I have come up with are:
>
>    1. Bypass register allocation completely and just emit virtual
>    registers,
>    2. Remove register definitions from the TableGen files and create them
>    at run-time using the virtual register counts as an upper bound on the
>    number of registers needed, or
>    3. Keep a small set of pre-defined physical registers, and craft spill
>    code that really just puts a new register definition in the final PTX and
>    copies to/from this register when spilling/restoring is needed
>
> I hesitate to use (1) or (3) as they rely too heavily on the final ptxas
> tool to perform reasonable register allocation, which may not lead to
> optimal code.  Option (2) seems promising, though I worry about the
> feasibility of the approach.  Specifically, I am not yet sure if generating
> TargetRegisterInfo and TargetRegisterClass instances on-the-fly will fit
> into the existing architecture.
>
>
>
> Any thoughts from the experts out there?  Specifically, I am interested in
> any non-trivial pros/cons for any of these approaches, or any new approaches
> I have not considered.
>
>
>
> Thanks!
>
> --
> Thanks,
> Justin Holewinski
>



-- 

Thanks,

Justin Holewinski