[llvm-dev] Using constraint solving to support complex unspillable register classes?

Wed Jan 6 11:41:34 PST 2016

In the backend that I'm making for my custom architecture, I have some
registers that I want to be allocatable but not spillable. I know for a
fact that the registers can always be allocated without spilling this
class if the scheduling is done correctly - they're only used by code
inserted by my backend, and each insertion defines and kills them.

My architecture has several ALUs, which I've exposed using a few special
instructions along with one super-register per ALU. For example, in my
architecture, something like an 'add' is 4 completely separate
instructions:
set_alu_in1 aluA, R1
set_alu_in2, aluA, R2
add aluA
get_alu_out R3, aluA

Unfortunately, once scheduling is enabled, these instructions may be
interleaved such that spilling parts of the ALU registers is required
(which causes a failure when attempting to store it to the stack).

Currently, I have a custom machine scheduler strategy that schedules
bottom-up and picks the 'best' instruction that keeps the register
pressure of the unspillable class from rising past the limit. This is
simple enough right now, because I have each alu super-register as a
completely distinct entity.

I'd like to make my alu registers overlap, so that aluA.out is the same
register as aluC.in1 and aluB.out is the same register as aluC.in2, for
example. However, my current simple machine scheduler wouldn't handle this
properly, because it only deals with register classes and doesn't
generalize easily.

The real goal of my scheduler is to pick the "best" instruction that still
allows for register allocation to succeed, so an obvious direction to go
is towards doing some of the work that the register allocator does. After
some reading up on register allocation theory and practice, it seems like
PBQP is the best approach for "irregular" architectures like I'm trying to
make. In fact, switching to the PBQP allocator allowed me to remove a
"register pre-allocation" pass that added allocation hints to all ALU
instructions with my current non-overlapping ALU architecture.

For the scheduling phase, I'm only interested in whether or not a solution
(with non-infinite cost) exists, so it seems like I could drop a lot of
data from the full PBQP solver - I only need binary weights (zero or
infinite), and I can leave out all nodes that don't include any
unspillable registers.

Is using a PBQP model and solver the best way to solve this? It seems like
it should be a simpler problem than normal register allocation if the
choice of spilling is removed. After reading up on constraint solving a
little, it seems to me that without a special spill class, the assignments
could be expressed using a horn formula, and horn satisfiability can be
determined in linear time (to the number of symbols, which would be equal
to the sum of valid choices for each register). Would that be a better
way?

Any feedback on solving this constraint problem to allow scheduling with
complicated unspillable register is appreciated.

Thanks,
James Byerly