[LLVMdev] Paired register allocation problem
martinwguy at gmail.com
Mon Feb 22 08:28:54 PST 2010
A related question: GCC does the same for 64-bit args in two 32-bit
registers, but always uses (Rn, Rn+1) pairs and is incapable of
scheduling the two 32-bit move instructions independently, since the
two are output at the very last minute by the last part of the
This is slow when moving such a 2x32-bit value to a 64-bit register in
one of the ARM FPUs, since two 32-bit moves to the low and high halves
of the same FPU register incur a 7-cycle delay - a fairly common
occurrence when double args are moved into FPU regs for processing. If
it could schedule the two 32-bit moves separately, up to 6 other
instructions could be executed in the meantime.
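To put numbers on that, here is a toy cycle count in Python. It is only a sketch, not a model of any real ARM core: it assumes a single-issue, in-order pipeline where every instruction takes one cycle, and the only timing rule is the one described above, that the second half-register move cannot issue until 7 cycles after the first. The names `PAIR_DELAY` and `total_cycles` are made up for illustration.

```python
PAIR_DELAY = 7  # assumed: issue-to-issue delay between the two half moves

def total_cycles(schedule):
    """Count cycles for a toy single-issue, in-order pipeline.

    `schedule` is a list of strings. "lo" and "hi" are the two 32-bit
    moves into the same FPU register; anything else is treated as an
    independent one-cycle instruction.
    """
    cycle = 0          # cycle at which the next instruction may issue
    first_move = None  # issue cycle of the first of the two moves
    for insn in schedule:
        if insn in ("lo", "hi"):
            if first_move is None:
                first_move = cycle
            else:
                # The second move stalls until the delay has elapsed.
                cycle = max(cycle, first_move + PAIR_DELAY)
        cycle += 1
    return cycle

others = ["other"] * 6
adjacent = total_cycles(["lo", "hi"] + others)        # moves back to back
interleaved = total_cycles(["lo"] + others + ["hi"])  # moves spread apart
```

Under these assumptions the back-to-back order takes 14 cycles and the spread-out order takes 8: the six independent instructions fill the stall completely.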
The only answer in GCC seems to be a further optimization pass in the
back end performed after the assembly generation pass has been done,
to shuffle the adjacent insns when they don't supply or depend on the
moved values - horrid.
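For what it's worth, the kind of late shuffle I mean would look roughly like the Python sketch below. This is not GCC code or any existing pass. The `Insn` record, the `independent` test and `spread_pair` are all hypothetical, the instruction text is made-up pseudo-assembly, and dependencies are tracked through registers only (no memory, flags or other side effects), so treat it as an illustration of the idea rather than something safe to ship.

```python
from collections import namedtuple

# Hypothetical instruction record: registers written and registers read.
Insn = namedtuple("Insn", ["text", "defs", "uses"])

def independent(a, b):
    """True if a and b may be swapped: no register written by one is
    read or written by the other."""
    return not (a.defs & (b.defs | b.uses)) and not (b.defs & a.uses)

def spread_pair(insns, first, max_fill=6):
    """Hoist up to `max_fill` instructions that follow the adjacent move
    pair insns[first], insns[first + 1] to sit between the two moves.

    Only a contiguous run of instructions that are independent of the
    second move is hoisted, so their relative order never changes.
    """
    second = insns[first + 1]
    end = first + 2
    while (end < len(insns) and end - (first + 2) < max_fill
           and independent(insns[end], second)):
        end += 1
    fill = insns[first + 2:end]
    return insns[:first + 1] + fill + [second] + insns[end:]

# Made-up pseudo-assembly; f0lo/f0hi stand for the two halves of f0.
code = [
    Insn("mov_lo f0, r0", {"f0lo"}, {"r0"}),
    Insn("mov_hi f0, r1", {"f0hi"}, {"r1"}),
    Insn("add r2, r2, 1", {"r2"}, {"r2"}),
    Insn("add r3, r3, r4", {"r3"}, {"r3", "r4"}),
    Insn("fadd f1, f0, f0", {"f1"}, {"f0lo", "f0hi"}),
    Insn("sub r5, r5, 1", {"r5"}, {"r5"}),
]
result = [i.text for i in spread_pair(code, 0)]
```

Here the two integer adds move in between the pair, and the scan stops at `fadd` because it reads the moved value. Stopping at the first dependent instruction keeps the sketch simple, at the cost of missing later candidates such as the `sub`.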
I was wondering if this would be more practical in LLVM (of which I am
ignorant but curious) or whether the illusion of a single 64-bit
register also persists there until it is too late.