[LLVMdev] "Ran out of registers during register allocation" bug affecting ffmpeg

Eli Friedman eli.friedman at gmail.com
Mon Aug 30 19:38:07 PDT 2010

On Mon, Aug 30, 2010 at 5:52 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
> On Aug 21, 2010, at 5:27 PM, Eli Friedman wrote:
>> See http://llvm.org/bugs/show_bug.cgi?id=4668 and
>> http://llvm.org/bugs/show_bug.cgi?id=5010.  The basic description of
>> the issue (from http://llvm.org/bugs/show_bug.cgi?id=4668#c5): "The
>> fundamental problem is we can't spill a register once it's fixed to a
>> physical register."
>> From discussion on IRC:
>> [17:14]       <_sabre_>       efriedma: sounds like a RA bug in linscan
>> [17:14]       <_sabre_>       unfortunately, it is a fairly well known bug that is
>> difficult to fix
>> [17:14]       <_sabre_>       it would be worth bringing up on llvmdev, there may
>> be a good approach since I stopped looking at the regalloc
>> [17:14]       <_sabre_>       in any case, its worth surfacing the issue so that
>> Jakob can keep it in mind as he's building the new RA infrastructure
> I am not sure I completely understand the problem here since the PRs don't reproduce on TOT, but it sounds like a physical register is live across an inline asm that needs all the registers.

I've uploaded a re-reduced testcase to PR4668; your guess appears to
be correct, or at least close.
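
To make the failure mode concrete, here is a rough sketch of the situation
described above. It is a toy model, not LLVM code: the function name, the
register pool, and the (register, spillable) encoding are all assumptions
made up for illustration. A value coalesced with a physical register is
modeled as unspillable, so an inline asm that needs every register in the
pool has nowhere to go.

```python
# Toy model only; this is not LLVM's allocator, and all names are made up.
# Assumed register pool available to the inline asm on x86-32.
X86_32_GPRS = ["eax", "ebx", "ecx", "edx", "esi", "edi"]


def regs_for_asm(pool, operands_needed, live_across):
    """Pick registers for an inline asm needing `operands_needed` registers.

    `live_across` maps a value name to (register, spillable). A value that
    was coalesced with a physical register is modeled as spillable=False.
    Returns (chosen registers, values that had to be spilled).
    """
    # Registers held by unspillable values cannot be handed to the asm.
    pinned = {reg for reg, spillable in live_across.values() if not spillable}
    usable = [r for r in pool if r not in pinned]
    if len(usable) < operands_needed:
        raise RuntimeError("ran out of registers during register allocation")
    chosen = usable[:operands_needed]
    # Spillable values sitting in a chosen register are spilled around the asm.
    spilled = sorted(
        name for name, (reg, spillable) in live_across.items()
        if spillable and reg in chosen
    )
    return chosen, spilled
```

With {"v": ("eax", True)} and an asm needing all six registers, the model
spills v and succeeds; with {"v": ("eax", False)} it raises the error.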

> This can happen if the register coalescer decides to coalesce a physreg with a virtreg that is live across the inline asm. This is not easy to detect without a bad compile time regression.
> I suppose something similar could happen for calls. For instance, some calls clobber all XMM registers, so a physical XMM register coalesced to be live across a call would be unspillable.
> What can I say? Physreg coalescing is evil ;-)
> I want to remove physreg coalescing entirely, but it requires the register allocator to be really good at taking hints. We are not quite there yet.

I measure a 1.7% increase in code-size compiling gcc with
-disable-physical-join vs. normal compilation on x86-64.  That's
pretty substantial.  Looking over the generated code, it looks like
we're missing a lot of cases which seem like they should be easy, like
not putting an immediately returned PHI node into eax, or calculating
a value and immediately moving it into another register for a call.
The difference isn't so bad on x86-32, only 0.25%.  I think that means
the primary issue with using -disable-physical-join is that we're
doing a really bad job of putting the arguments to calls in the right
registers.

Is the issue that the correct hints aren't there, or that the
allocation algorithm isn't using them well?
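
The two possibilities can be told apart in a toy model like the one below.
This is only a sketch under stated assumptions: the function, its
parameters, and the "leftover copies" metric are hypothetical and do not
correspond to anything in LLVM. `copy_targets` is the ground truth (which
physical register each value ends up being copied to, such as an argument
register), while `hints` is what the allocator is actually told.

```python
# Toy model; names and the cost metric are hypothetical, not LLVM's.
def assign(vregs, copy_targets, hints, free_regs, use_hints):
    """Assign each virtual register a physical register.

    Returns (assignment, copies), where `copies` counts values that did not
    land in the register they are copied to and so still need a move.
    Assumes there are at least as many free registers as vregs.
    """
    free = list(free_regs)
    assignment = {}
    for v in vregs:
        hint = hints.get(v)
        if use_hints and hint in free:
            reg = hint      # honor the hint when the register is free
        else:
            reg = free[0]   # otherwise take the first free register
        free.remove(reg)
        assignment[v] = reg
    copies = sum(1 for v, t in copy_targets.items() if assignment[v] != t)
    return assignment, copies
```

For two values headed for rdi and rsi with free registers rax, rdi, rsi:
complete hints that are honored leave no copies, while either an empty
`hints` map or use_hints=False leaves both copies in place. The generated
code looks the same in both failing cases, which is why the question has
to be answered by looking at the allocator rather than at its output.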

> A quick fix would be to disable physreg coalescing for functions containing inline asm.

On x86-32, that probably wouldn't be such a big deal, but the effect
looks really bad for x86-64, and probably other architectures that
pass arguments in registers.


