[LLVMdev] TargetRegisterInfo and "infinite" register files

Mon May 16 06:52:25 PDT 2011

Currently, the TableGen register info files for all of the back-ends define
concrete registers and divide them into logical register classes.  I would
like to get some input from the LLVM experts around here on how best to map
this model to an architecture that does *not* have a concrete, pre-defined
register file.  The architecture is PTX, which is more of an intermediate
form than a final assembly language.  The format is essentially
three-address code, with "virtual" registers instead of "physical"
registers.  After PTX code generation, the PTX assembly is compiled to a
device binary with a proprietary tool (ptxas) that does final register
allocation (based on device and user constraints).  However, exploiting
register re-use at the LLVM/PTX level has shown performance improvement over
blindly using a new "physical" register for each def and letting ptxas
figure out all of the register allocation details, so I would like to take
advantage of the LLVM register allocation infrastructure if at all possible.

Generally stated, I would like to solve the register allocation problem as
"allocate the minimum number of registers from an arbitrary set without
spill code" instead of the more traditional "allocate the minimum number of
registers from a fixed set."

The current implementation defines an arbitrary set of registers that the
register allocator can use during code-gen.  This works, but is not
scalable.  If the register allocator runs out of registers, spill code must
be generated.  However, the "optimal" solution in this case would be to
extend the register file.  A few alternatives I have come up with are:

   1. Bypass register allocation completely and just emit virtual registers,
   2. Remove register definitions from the TableGen files and create them at
   run-time using the virtual register counts as an upper bound on the number
   of registers needed, or
   3. Keep a small set of pre-defined physical registers, and craft spill
   code that really just puts a new register definition in the final PTX and
   copies to/from this register when spilling/restoring is needed

I hesitate to use (1) or (3) as they rely too heavily on the final ptxas
tool to perform reasonable register allocation, which may not lead to
optimal code.  Option (2) seems promising, though I worry about the
feasibility of the approach.  Specifically, I am not yet sure if generating
TargetRegisterInfo and TargetRegisterClass instances on-the-fly will fit
into the existing architecture.

Any thoughts from the experts out there?  Specifically, I am interested in
any non-trivial pros/cons for any of these approaches, or any new approaches
I have not considered.

Thanks!

-- 

Thanks,

Justin Holewinski
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110516/f0434fef/attachment.html>