[PATCH][RegAlloc] Make tryInstructionSplit less aggressive.
qcolombet at apple.com
Fri Dec 20 13:20:07 PST 2013
Thanks Jakob for the quick reply.
On Dec 20, 2013, at 12:41 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
> Hi Quentin,
> You also need to handle MI operands that don’t have RC constraints, and operands with sub-register indexes. Look at MRI::recomputeRegClass().
When you say MI operands that don’t have RC constraints, you mean the ones that have NULL RC constraints, right?
Currently the code splits those, but we can indeed consider this too aggressive.
Anyway, thanks for the pointer, I will have a look.
> I also think you should look at how many more *allocatable* registers the new register class has (see RegClassInfo), and perhaps even require that the new class has 2-4 more allocatable registers before you consider the split.
I am not sure I want to go in that direction.
Indeed, what we currently have is just a check that splitting everywhere would give us more registers (one more is enough):
// There is no point to this if there are no larger sub-classes.
I am a bit reluctant to change that as one register may be enough for our purpose.
I guess you are suggesting that to avoid even more useless splitting. However, I do not think requiring that the new class has 2-4 more allocatable registers would be a good criterion.
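To make the two criteria concrete, here is a minimal sketch of a gain threshold on allocatable registers. It is only a toy model and not the code in the patch: register classes are represented as bitmasks, and the names RegMask, countAllocatable, worthSplitting and MinGain are hypothetical. With MinGain set to 1 it mirrors the existing "at least one more register" check; a value between 2 and 4 would correspond to the stricter criterion suggested above.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical model: bit N set means physical register N is in the class.
using RegMask = std::uint32_t;

// Number of registers of Class that are actually allocatable
// (i.e., not reserved), given the mask of allocatable registers.
unsigned countAllocatable(RegMask Class, RegMask Allocatable) {
  return static_cast<unsigned>(std::bitset<32>(Class & Allocatable).count());
}

// Returns true if moving from CurClass to NewClass gains at least
// MinGain allocatable registers.
bool worthSplitting(RegMask CurClass, RegMask NewClass, RegMask Allocatable,
                    unsigned MinGain) {
  unsigned Cur = countAllocatable(CurClass, Allocatable);
  unsigned New = countAllocatable(NewClass, Allocatable);
  return New >= Cur + MinGain;
}
```

In this model, a new class that adds a single allocatable register passes with MinGain = 1 and is rejected with MinGain = 2, which is exactly the case where I think one register may still be enough.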
> Could the doesSplitRelax* functions be formulated in a way where they may be generally useful as MI functions? Something like answering “How does this MI constrain the RC of VirtReg?”. It seems that MRI::recomputeRegClass() could also use such a function.
Good point, I think so.
I will look into that.
> On Dec 20, 2013, at 11:52 AM, Quentin Colombet <qcolombet at apple.com> wrote:
>> Hi Jakob,
>> Here is a patch proposal for the aggressive splitting problem we talked about.
>> If there is a simpler way to get the operand constraints from the live-range, I would be glad to update the patch accordingly.
>> ** Context **
>> The greedy register allocator tries to split a live-range around each instruction where it is used or defined to relax the constraints on the entire live-range (this is a last chance split before falling back to spill).
>> The goal is to have a big live-range that is unconstrained (i.e., that can use the largest legal register class) and several small local live-ranges that carry the constraints implied by each instruction.
>> Let csti be the constraints on operation i.
>> op1 V1(cst1)
>> op2 V1(cst2)
>> V1 live-range is constrained on the intersection of cst1 and cst2.
>> tryInstructionSplit relaxes those constraints by aggressively splitting each def/use point:
>> V2 = V1
>> V3 = V2
>> op1 V3(cst1)
>> V4 = V2
>> op2 V4(cst2)
>> Because of how the coalescer infrastructure works, each new variable (V3, V4) that is alive at the same time as V1 (or its copy, here V2) interferes with V1. Thus, we end up with an uncoalescable copy for each split point.
>> The added test case demonstrates this problem.
>> ** Proposed Solution **
>> Make tryInstructionSplit less aggressive.
>> To do that, we check if the split point actually relaxes the constraints on the whole live-range. If it does not, we do not insert it.
>> Indeed, it will not help the global allocation problem:
>> - V1 will have the same constraints.
>> - V1 will have the same interference + possibly the newly added split variable VS.
>> - VS will produce an uncoalescable copy if alive at the same time as V1.
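One possible reading of the check described above, as a toy model rather than the patch itself: treat each register class as a bitmask and keep a split point only when the instruction constrains the register below the largest legal class. The names RegMask, instrConstrains and usefulSplitPoints are hypothetical and do not correspond to LLVM functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: bit N set means physical register N is allowed.
using RegMask = std::uint32_t;

// An instruction constrains the live-range if, within the largest legal
// class Super, it allows strictly fewer registers than Super itself.
bool instrConstrains(RegMask Cst, RegMask Super) {
  return (Cst & Super) != Super;
}

// Returns the indices of the instructions worth splitting around.
// Splitting around any other instruction would only add a copy without
// relaxing the constraints on the main live-range.
std::vector<std::size_t> usefulSplitPoints(const std::vector<RegMask> &Csts,
                                           RegMask Super) {
  assert(Super != 0 && "largest legal class must not be empty");
  std::vector<std::size_t> Points;
  for (std::size_t I = 0; I < Csts.size(); ++I)
    if (instrConstrains(Csts[I], Super))
      Points.push_back(I);
  return Points;
}
```

With constraints {0x0F, 0xFF, 0x1FF} and a largest legal class of 0xFF, only the first instruction narrows the class, so it is the only split point this model keeps.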
>> Note: During my measurements, I did not see any compile time or runtime regressions or improvements, although several split points were no longer inserted. Measurements were made on armv7s and x86_64 with the LLVM test-suite + external.
>> Thanks for your feedback.