[PATCH][RegAlloc] Add a last chance recoloring mechanism when everything else failed to find a register

Hal Finkel hfinkel at anl.gov
Wed Feb 5 13:18:45 PST 2014


----- Original Message -----
> From: "Quentin Colombet" <qcolombet at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Commit Messages and Patches for LLVM" <llvm-commits at cs.uiuc.edu>, "Andy Trick" <atrick at apple.com>, "Jakob
> Stoklund Olesen" <stoklund at 2pi.dk>
> Sent: Wednesday, February 5, 2014 3:08:54 PM
> Subject: Re: [PATCH][RegAlloc] Add a last chance recoloring mechanism when everything else failed to find a register
> 
> 
> 
> 
> On Feb 5, 2014, at 12:14 PM, Hal Finkel < hfinkel at anl.gov > wrote:
> 
> 
> 
> ----- Original Message -----
> 
> 
> From: "Quentin Colombet" < qcolombet at apple.com >
> To: "Hal Finkel" < hfinkel at anl.gov >
> Cc: "Commit Messages and Patches for LLVM" < llvm-commits at cs.uiuc.edu >,
> "Andy Trick" < atrick at apple.com >, "Jakob Stoklund Olesen" < stoklund at 2pi.dk >
> Sent: Wednesday, February 5, 2014 1:49:53 PM
> Subject: Re: [PATCH][RegAlloc] Add a last chance recoloring mechanism
> when everything else failed to find a register
> 
> Hi Hal,
> 
> 
> 
> On Feb 5, 2014, at 11:18 AM, Hal Finkel < hfinkel at anl.gov > wrote:
> 
> 
> 
> ----- Original Message -----
> 
> 
> From: "Quentin Colombet" < qcolombet at apple.com >
> To: "Jakob Stoklund Olesen" < stoklund at 2pi.dk >, "Hal Finkel" <
> hfinkel at anl.gov >
> Cc: "Commit Messages and Patches for LLVM" < llvm-commits at cs.uiuc.edu >,
> "Andy Trick" < atrick at apple.com >
> Sent: Wednesday, February 5, 2014 12:48:55 PM
> Subject: Re: [PATCH][RegAlloc] Add a last chance recoloring mechanism
> when everything else failed to find a register
> 
> Hi Hal, Jakob,
> 
> Thanks for the reviews.
> Attached is the updated patch.
> 
> Hal, just to clarify, this patch does not tackle any spilling
> problem. This patch “just” introduces a reshuffling (or recoloring)
> of the registers involved in coloring decisions when we have
> exhausted all of the other options (spilling included). This means
> we may have already done more spilling than necessary, but at least
> we may be able to allocate the current instance instead of crashing
> the compiler.
> 
> The two problems are orthogonal. Although I am interested in better
> spilling heuristics, I do not plan to work on that in the short term.
> 
> Okay, well I agree that not crashing is the higher-priority goal ;)
> -- but in this context, I don't understand what is going on. Unless
> we have a single instruction that uses and defines more registers
> than the entire set, how can we exhaust all other options including
> spilling? It seems like the limiting trivial case is where we spill
> and reload all relevant registers around every instruction, and that
> cannot fail.
> 
> Unfortunately, this is not that easy. Let me go back to my running
> example with more details:
> 
> Okay, thanks! I can see, with constraints, how this might happen.
> 
> 
> 
> Assume the following register class constraints:
> 
> vA can use {R1, R2 }
> 
> vB can use { R2, R3}
> vC can use {R1 }
> 
> 
> Input code:
> vA = reload
> vB = reload
> vC = reload
> inst vA, vB, vC (three uses, like for a store value, base + offset)
> 
> 
> You cannot spill or split vA, vB, and vC; they are already as short
> as possible.
> A coloring exists for this instance, but the regular heuristic may
> not find it.
> For instance, let us say the heuristic assigns the live-ranges
> top-down in the program order and uses the first available color:
> vA => R1
> vB => R2
> vC => stuck
> vA and vB were produced by spilling, so you cannot do anything here.
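The failure described above can be reproduced with a small model. This is only a sketch under stated assumptions, not the greedy allocator's code: the `ALLOWED` table and the `first_fit` helper are hypothetical names, and all three live ranges are assumed to interfere because they are all live at `inst`.

```python
# Hypothetical model of the example: allowed registers per virtual register.
ALLOWED = {
    "vA": ["R1", "R2"],
    "vB": ["R2", "R3"],
    "vC": ["R1"],
}


def first_fit(order, allowed):
    """Assign each live range the first allowed register that is still free.

    Every live range is assumed to interfere with every other one.
    Returns (assignment, stuck), where `stuck` is the first live range
    that could not be assigned, or None if all of them were assigned.
    """
    assignment = {}
    for vreg in order:
        taken = set(assignment.values())
        free = [reg for reg in allowed[vreg] if reg not in taken]
        if not free:
            return assignment, vreg
        assignment[vreg] = free[0]
    return assignment, None


# Program order, as in the example: vA, vB, then vC.
assignment, stuck = first_fit(["vA", "vB", "vC"], ALLOWED)
print(assignment, "stuck on", stuck)
```

In this model, visiting the live ranges in a different order (vC first) happens to succeed, which is why the outcome depends on the heuristic rather than on whether a coloring exists.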
> 
> 
> If you had used a graph coloring approach, you would have ended up
> with the same problem. Because of the constraints of the register
> classes, none of the nodes has a degree less than 3, thus you cannot
> derive a simplification order that is guaranteed to complete.
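One way to read the degree argument is with a per-node simplification test: a node can be removed safely only if it has fewer interfering neighbours than allowed registers. That per-node test is an interpretation for illustration, not something spelled out in the thread, and `ALLOWED`, `INTERFERES` and `simplification_order` are hypothetical names. Under that assumption, no node in the example qualifies:

```python
# Hypothetical model: allowed registers and the interference graph.
ALLOWED = {
    "vA": ["R1", "R2"],
    "vB": ["R2", "R3"],
    "vC": ["R1"],
}
# All three values are live at `inst`, so every pair interferes.
INTERFERES = {
    "vA": {"vB", "vC"},
    "vB": {"vA", "vC"},
    "vC": {"vA", "vB"},
}


def simplification_order(allowed, interferes):
    """Repeatedly remove a node whose remaining degree is smaller than
    its number of allowed registers (assumed per-node criterion).

    Returns (order, blocked): the nodes removed so far and the nodes
    left over when no node satisfies the criterion.
    """
    remaining = set(allowed)
    order = []
    while remaining:
        candidates = sorted(
            v for v in remaining
            if len(interferes[v] & remaining) < len(allowed[v])
        )
        if not candidates:
            return order, sorted(remaining)
        order.append(candidates[0])
        remaining.remove(candidates[0])
    return order, []


order, blocked = simplification_order(ALLOWED, INTERFERES)
print("removed:", order, "blocked:", blocked)
```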
> 
> 
> The recoloring approach, on the other hand, explores every possibility.
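As a rough illustration of exploring every possibility, the example can be solved by plain backtracking. This is a toy sketch, not the patch's implementation: the `recolor` function and the `ALLOWED` table are hypothetical, and all live ranges are assumed to interfere.

```python
# Hypothetical model of the example: allowed registers per virtual register.
ALLOWED = {
    "vA": ["R1", "R2"],
    "vB": ["R2", "R3"],
    "vC": ["R1"],
}


def recolor(vregs, allowed, assignment=None):
    """Exhaustive backtracking over register choices.

    Returns a complete assignment (dict) or None if none exists.
    Every live range is assumed to interfere with every other one.
    """
    assignment = dict(assignment or {})
    if len(assignment) == len(vregs):
        return assignment
    vreg = next(v for v in vregs if v not in assignment)
    taken = set(assignment.values())
    for reg in allowed[vreg]:
        if reg in taken:
            continue
        # Tentatively pick `reg`, then try to color the rest.
        result = recolor(vregs, allowed, {**assignment, vreg: reg})
        if result is not None:
            return result
    # No register works for `vreg`: the caller must undo its own choice.
    return None


solution = recolor(["vA", "vB", "vC"], ALLOWED)
print(solution)
```

The search first tries vA => R1, fails on vC under both choices for vB, and only then revisits the decision for vA.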
> 
> 
> In this particular case, like I said, we end up in that situation
> because of a lot of factors.
> 
> 
> 
> 
> With regard to the depth cutoffs: if this is a correctness issue,
> then it seems like instead of skipping/aborting the hard cases, we
> should queue them to be revisited if no easier solution is found.
> I prefer not to go in that direction because, like I said, this is a
> backtracking algorithm. Thus, if a solution exists we will find it,
> but maybe not in a reasonable amount of time. So we cut the
> branches that are likely to fail. Also, like Eric mentioned, there
> may not be a feasible solution with inline asm, for instance.
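The trade-off under discussion can be sketched with a bounded search. The step budget below is a stand-in chosen for illustration; it is not the cutoff the patch uses, and `recolor_with_budget`, `max_steps` and `ALLOWED` are hypothetical names.

```python
# Hypothetical model of the example: allowed registers per virtual register.
ALLOWED = {
    "vA": ["R1", "R2"],
    "vB": ["R2", "R3"],
    "vC": ["R1"],
}


class OutOfBudget(Exception):
    """Raised when the search exceeds its step budget."""


def recolor_with_budget(vregs, allowed, max_steps):
    """Backtracking search that gives up after `max_steps` tentative
    assignments. Returns (status, assignment) where status is one of
    "found", "no solution" or "gave up"."""
    steps = 0

    def search(assignment):
        nonlocal steps
        if len(assignment) == len(vregs):
            return assignment
        vreg = vregs[len(assignment)]
        taken = set(assignment.values())
        for reg in allowed[vreg]:
            if reg in taken:
                continue
            steps += 1
            if steps > max_steps:
                raise OutOfBudget
            result = search({**assignment, vreg: reg})
            if result is not None:
                return result
        return None

    try:
        result = search({})
    except OutOfBudget:
        return "gave up", None
    return ("found", result) if result is not None else ("no solution", None)


tight = recolor_with_budget(["vA", "vB", "vC"], ALLOWED, max_steps=3)
loose = recolor_with_budget(["vA", "vB", "vC"], ALLOWED, max_steps=10)
print(tight, loose)
```

With the tight budget the search reports failure even though a coloring exists, which is the missed-solution risk raised in the thread; the loose budget finds it at the cost of more work.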
> 
> Okay, I'm not saying that you should not cut the branches on the
> first pass, but it seems like you should queue them to be visited
> later. Why do you not want to do this?
> 
> 
> I am concerned with the compile time. Exploring all possibilities may
> take a (very) long time, and I’d rather have a compiler that fails
> with “ran out of registers” than a user killing the process because
> he believes it is stuck in some infinite loop. I believe the former
> gives the user a chance to report the problem with a nice
> clang-generated script and, moreover, he may rewrite his program to
> avoid that in the meantime.

I have mixed feelings about crashing in this case. Yes, it gives us a higher probability of getting test cases, but it seems like we could do better somehow. On the other hand (as I mentioned in the e-mail that I must have been writing while you wrote this), seeming to hang is also not a good solution.

> 
> 
> I see your concern: we may miss viable solutions that were dismissed
> because of some arbitrary cut decisions. I think the cl::opt you
> suggested gives us a good way to check whether these cut decisions
> were bad or not.
> 
> 
> At the moment, it is hard to judge how good or bad they are. The same
> goes for what to do with the dismissed solutions. My point is:
> always queueing them may not be the best approach, but I have no hard
> evidence. It is hard to trick the register allocator into running out
> of registers when a solution exists!
> 
> 
> What do you think?

I think that committing this with some cl::opts will be fine for now; it is certainly an improvement over the present situation, and gives the user the ability to work around any problem with the cutoffs if necessary. There may be a better solution, but we'll probably need to await more data.

 -Hal

> 
> 
> -Quentin
> 
> 
> 
> 
> 
> 
> 
> 
> -Hal
> 
> 
> 
> 
> 
> Cheers,
> -Quentin
> 
> 
> 
> 
> Thanks again,
> Hal
> 
> 
> 
> 
> On Feb 5, 2014, at 9:17 AM, Jakob Stoklund Olesen < stoklund at 2pi.dk >
> wrote:
> 
> 
> 
> 
> On Feb 4, 2014, at 3:29 PM, Quentin Colombet < qcolombet at apple.com >
> wrote:
> 
> 
> 
> Hi Jakob,
> 
> The attached patch introduces a last chance recoloring mechanism
> when the current allocation scheme fails to assign a register.
> Thanks for your review.
> 
> ** Context **
> 
> In some extreme conditions the current allocation heuristic may
> fail to find a valid allocation solution whereas one exists.
> This is demonstrated with the (big) test case that is contained in
> that patch.
> Basically, in that test case, the greedy register allocator runs
> out of registers because of a combination of:
> - The way the machine scheduler extends some physical register
> live-ranges, which ends up putting a lot of constraints on the
> available registers.
> - The relocation model, which consumes one register.
> - The function attributes, which force a register to be reserved
> for the frame pointer.
> - The weights of the different variables, which affect the
> allocation order.
> 
> Hi Quentin,
> 
> The patch looks good to me, but please address Hal’s concerns.
> 
> I can see how this last-chance recoloring can be necessary,
> particularly when dealing with constrained register classes and
> inline assembly. However, I am not sure that the machine scheduler
> is doing the right thing if it is extending physical register live
> ranges. As I see it, physreg live ranges should always have a COPY
> in one end, and the copies should be placed to make the live
> ranges as short as possible.
> I agree and I was surprised to see such extended live-ranges for
> physical registers.
> Like you said and like I showed in my motivating example, the
> recoloring may still be needed for constrained register classes.
> Thus, I will pursue that last-chance approach.
> 
> I’ll look into the extended live-ranges problem at some point,
> because I believe this can improve the quality of the allocation in
> some cases (here for example :)).
> 
> 
> 
> Even when the scheduler is tracking register pressure, it is
> extremely difficult to guarantee that a valid register allocation
> exists if physical live ranges are extended.
> Agreed.
> 
> 
> 
> We had this same problem back when the coalescer was extending
> physreg live ranges, and it was a constant source of obscure “ran
> out of registers” bugs like your test case.
> 
> Thanks,
> /jakob
> 
> Thanks again,
> -Quentin
> 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory



