[PATCH][RegAlloc] Add a last chance recoloring mechanism when everything else failed to find a register

Wed Feb 5 13:18:15 PST 2014

On Feb 5, 2014, at 1:09 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
>> From: "Hal Finkel" <hfinkel at anl.gov>
>> To: "Quentin Colombet" <qcolombet at apple.com>
>> Cc: "Commit Messages and Patches for LLVM" <llvm-commits at cs.uiuc.edu>
>> Sent: Wednesday, February 5, 2014 2:14:35 PM
>> Subject: Re: [PATCH][RegAlloc] Add a last chance recoloring mechanism when everything else failed to find a register
>> 
>> ----- Original Message -----
>>> From: "Quentin Colombet" <qcolombet at apple.com>
>>> To: "Hal Finkel" <hfinkel at anl.gov>
>>> Cc: "Commit Messages and Patches for LLVM"
>>> <llvm-commits at cs.uiuc.edu>, "Andy Trick" <atrick at apple.com>,
>>> "Jakob Stoklund Olesen" <stoklund at 2pi.dk>
>>> Sent: Wednesday, February 5, 2014 1:49:53 PM
>>> Subject: Re: [PATCH][RegAlloc] Add a last chance recoloring
>>> mechanism when everything else failed to find a register
>>> 
>>> Hi Hal,
>>> 
>>> 
>>> 
>>> On Feb 5, 2014, at 11:18 AM, Hal Finkel < hfinkel at anl.gov > wrote:
>>> 
>>> 
>>> 
>>> ----- Original Message -----
>>> 
>>> 
>>> From: "Quentin Colombet" < qcolombet at apple.com >
>>> To: "Jakob Stoklund Olesen" < stoklund at 2pi.dk >, "Hal Finkel" <
>>> hfinkel at anl.gov >
>>> Cc: "Commit Messages and Patches for LLVM" <
>>> llvm-commits at cs.uiuc.edu >, "Andy Trick" < atrick at apple.com >
>>> Sent: Wednesday, February 5, 2014 12:48:55 PM
>>> Subject: Re: [PATCH][RegAlloc] Add a last chance recoloring
>>> mechanism
>>> when everything else failed to find a register
>>> 
>>> Hi Hal, Jakob,
>>> 
>>> Thanks for the reviews.
>>> Attached is the updated patch.
>>> 
>>> Hal, just to clarify, this patch does not tackle any spilling
>>> problem. This patch “just” introduces a reshuffling (or recoloring)
>>> of the registers involved in coloring decisions once we have
>>> exhausted all of the other options (spilling included). This means
>>> we may have already done more spilling than necessary, but at least
>>> we may be able to allocate the current instance instead of crashing
>>> the compiler.
>>> 
>>> The two problems are orthogonal. Although I am interested in better
>>> spilling heuristics, I do not plan to work on them in the near term.
>>> 
>>> Okay, well I agree that not crashing is the higher-priority goal ;)
>>> -- but in this context, I don't understand what is going on. Unless
>>> we have a single instruction that uses and defines more registers
>>> than the entire set, how can we exhaust all other options including
>>> spilling? It seems like the limiting trivial case is where we spill
>>> and reload all relevant registers around every instruction, and
>>> that
>>> cannot fail.
>>> 
>>> Unfortunately, it is not that easy. Let me go back to my running
>>> example, with more details:
>> 
>> Okay, thanks! I can see, with constraints, how this might happen.
>> 
>>> Assume the following register class constraints:
>>> 
>>> vA can use {R1, R2}
>>> vB can use {R2, R3}
>>> vC can use {R1}
>>> 
>>> 
>>> Input code:
>>> vA = reload
>>> vB = reload
>>> vC = reload
>>> inst vA, vB, vC (three uses, like for a store value, base + offset)
>>> 
>>> 
>>> You cannot spill or split vA, vB, and vC because they are already
>>> as short as possible.
>>> A coloring exists for this instance, but the regular heuristic may
>>> not find it.
>>> For instance, let us say the heuristic assigns the live-ranges
>>> top-down in the program order and uses the first available color:
>>> vA => R1
>>> vB => R2
>>> vC => stuck
>>> vA and vB were produced by spilling, so you cannot do anything here.
>>> 
>>> 
>>> If you had used a graph coloring approach, you would have ended up
>>> with the same problem. Because of the register class constraints,
>>> none of the nodes has a degree less than 3, thus you cannot
>>> derive a simplification order that is guaranteed to complete.
>>> 
>>> 
>>> The recoloring approach, on the other hand, explores every
>>> possibility.
>>> 
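
To make the running example above concrete, here is a minimal Python sketch. It is only a toy model, not the code in the patch: the names `ALLOWED`, `ORDER`, `first_fit`, and `recolor` are hypothetical, and all three live ranges are assumed to interfere with each other because they are all live at `inst vA, vB, vC`.

```python
# Toy model of the running example. All names here are illustrative
# assumptions; none of them come from the patch or from LLVM.
ALLOWED = {"vA": ["R1", "R2"], "vB": ["R2", "R3"], "vC": ["R1"]}
ORDER = ["vA", "vB", "vC"]  # program order; all three ranges interfere


def first_fit(order, allowed):
    """Assign each range the first free register and never revisit a choice.

    Returns (assignment, stuck), where stuck is the first range that could
    not be assigned, or None if every range received a register.
    """
    assignment = {}
    for v in order:
        free = [r for r in allowed[v] if r not in assignment.values()]
        if not free:
            return assignment, v
        assignment[v] = free[0]
    return assignment, None


def recolor(order, allowed, assignment=None):
    """Backtracking search: try every allowed register, undo on failure.

    Returns a complete assignment, or None if no valid coloring exists.
    """
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(order):
        return dict(assignment)
    v = order[len(assignment)]
    for r in allowed[v]:
        if r in assignment.values():
            continue  # register already taken by an interfering range
        assignment[v] = r
        result = recolor(order, allowed, assignment)
        if result is not None:
            return result
        del assignment[v]  # undo the choice and try the next register
    return None


print(first_fit(ORDER, ALLOWED))  # ({'vA': 'R1', 'vB': 'R2'}, 'vC')
print(recolor(ORDER, ALLOWED))    # {'vA': 'R2', 'vB': 'R3', 'vC': 'R1'}
```

The first-fit pass gets stuck on vC exactly as described, while the exhaustive search finds vA => R2, vB => R3, vC => R1. The price is a worst case that is exponential in the number of ranges, which is the reason for cutting branches.
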
>>> 
>>> In this particular case, like I said, we end up in that situation
>>> because of a combination of factors.
>>> 
>>> 
>>> 
>>> 
>>> With regard to the depth cutoffs: if this is a correctness
>>> issue, then it seems like instead of skipping/aborting the hard
>>> cases, we should queue them to be revisited if no easier solution
>>> is found.
>>> I prefer not to go in that direction because like I said, this is a
>>> backtracking algorithm. Thus if a solution exists we will find it
>>> but maybe not in a reasonable amount of time. Thus, we cut the
>>> branches that are likely to fail. Also, like Eric mentioned, there
>>> may not be a feasible solution with inline asm for instance.
>> 
>> Okay, I'm not saying that you should not cut the branches on the
>> first pass, but it seems like you should queue them to be visited
>> later. Why do you not want to do this?
> 
> Thinking about it, I suppose the question is what do we want the worst-case behavior to be: crashing, or taking a *long* time (which the user might see as hanging). Perhaps the best solution is something like this:
> 
> - If recoloring fails, we report an error like this (but not crash): Register allocation failed, please retry with -fexhaustive-register-search
> - We add support for such an option, and this disables the depth check and other limits
I like this alternative.
Would it be acceptable that I commit the recoloring approach as it is, file a PR and fix it later?

If we want to explore cheap solutions first, this will require a bit of refactoring, and I’d like this to land sooner rather than later if possible.

> 
> This will allow the compiler to "work" for all users (even if it takes a long time), but will also allow knowledgeable developers, who will rightfully think this odd, to submit test cases (should this ever happen in practice).
> 
> Alternatively, we could stick the recoloring procedure in a loop with ever-increasing depth bounds.
> 
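
The idea of running the recoloring procedure in a loop with ever-increasing depth bounds can be sketched as follows. This is a toy model under stated assumptions, not the patch: `try_assign` and `recolor_with_growing_bound` are hypothetical names, every live range is assumed to interfere with every other one, and "depth" counts how many nested evictions a single attempt may perform.

```python
# Toy sketch of depth-bounded recoloring with a growing bound.
# All names are illustrative assumptions; none come from the patch or LLVM.
ALLOWED = {"vA": ["R1", "R2"], "vB": ["R2", "R3"], "vC": ["R1"]}


def try_assign(v, assignment, allowed, max_depth, depth=0, locked=frozenset()):
    """Try to give v a register, evicting holders at most max_depth levels deep.

    Mutates assignment on success and leaves it unchanged on failure.
    Ranges in locked are part of the current eviction chain and stay put.
    """
    for reg in allowed[v]:
        holder = next((u for u, r in assignment.items() if r == reg), None)
        if holder is None:
            assignment[v] = reg  # register is free, take it
            return True
        if holder in locked or depth >= max_depth:
            continue  # cut this branch
        saved = dict(assignment)
        del assignment[holder]  # evict the holder and take its register
        assignment[v] = reg
        if try_assign(holder, assignment, allowed, max_depth,
                      depth + 1, locked | {v}):
            return True
        assignment.clear()  # roll back the tentative eviction
        assignment.update(saved)
    return False


def recolor_with_growing_bound(v, assignment, allowed, limit):
    """Retry with bounds 0, 1, ..., limit; return the bound that worked."""
    for max_depth in range(limit + 1):
        if try_assign(v, assignment, allowed, max_depth):
            return max_depth
    return None


state = {"vA": "R1", "vB": "R2"}  # where the first-fit heuristic got stuck
print(recolor_with_growing_bound("vC", state, ALLOWED, limit=4))  # 2
print(state)
```

On the running example the search fails with bounds 0 and 1 and succeeds with bound 2, because vC must evict vA, which in turn must evict vB. Cheap bounds are tried first, so easy cases stay fast, and only the hard cases pay for the deeper search.
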
> What do you think?
Nice cross reply :).

-Quentin

> -Hal
> 
>> 
>> -Hal
>> 
>>> 
>>> 
>>> Cheers,
>>> -Quentin
>>> 
>>> 
>>> 
>>> 
>>> Thanks again,
>>> Hal
>>> 
>>> 
>>> 
>>> 
>>> On Feb 5, 2014, at 9:17 AM, Jakob Stoklund Olesen < stoklund at 2pi.dk >
>>> wrote:
>>> 
>>> 
>>> 
>>> 
>>> On Feb 4, 2014, at 3:29 PM, Quentin Colombet < qcolombet at apple.com >
>>> wrote:
>>> 
>>> 
>>> 
>>> Hi Jakob,
>>> 
>>> The attached patch introduces a last chance recoloring mechanism
>>> when the current allocation scheme fails to assign a register.
>>> Thanks for your review.
>>> 
>>> ** Context **
>>> 
>>> In some extreme conditions the current allocation heuristic may
>>> fail to find a valid allocation even though one exists.
>>> This is demonstrated with the (big) test case that is contained in
>>> this patch.
>>> Basically, in that test case, the greedy register allocator runs
>>> out of registers because of a combination of:
>>> - The way the machine scheduler extends some physical register
>>> live-ranges, which ends up putting a lot of constraints on the
>>> available registers.
>>> - The relocation model, which consumes one register.
>>> - The function attributes, which force a register to be kept for
>>> the frame pointer.
>>> - The weights of the different variables, which affect the
>>> allocation order.
>>> 
>>> Hi Quentin,
>>> 
>>> The patch looks good to me, but please address Hal’s concerns.
>>> 
>>> I can see how this last-chance recoloring can be necessary,
>>> particularly when dealing with constrained register classes and
>>> inline assembly. However, I am not sure that the machine scheduler
>>> is doing the right thing if it is extending physical register live
>>> ranges. As I see it, physreg live ranges should always have a COPY
>>> in one end, and the copies should be placed to make the live
>>> ranges as short as possible.
>>> I agree and I was surprised to see such extended live-ranges for
>>> physical registers.
>>> Like you said and like I showed in my motivating example, the
>>> recoloring may still be needed for constrained register classes.
>>> Thus, I will go ahead with the last chance approach.
>>> 
>>> I’ll look into the extended live-ranges problem at some point,
>>> because I believe this can improve the quality of the allocation in
>>> some cases (here for example :)).
>>> 
>>> 
>>> 
>>> Even when the scheduler is tracking register pressure, it is
>>> extremely difficult to guarantee that a valid register allocation
>>> exists if physical live ranges are extended.
>>> Agreed.
>>> 
>>> 
>>> 
>>> We had this same problem back when the coalescer was extending
>>> physreg live ranges, and it was a constant source of obscure “ran
>>> out of registers” bugs like your test case.
>>> 
>>> Thanks,
>>> /jakob
>>> 
>>> Thanks again,
>>> -Quentin
>>> 
>>> 
>>> --
>>> Hal Finkel
>>> Assistant Computational Scientist
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>> 
>> 
> 
