[PATCH] Speed up creation of live ranges for physical registers by using a segment set

Gasiunas, Vaidas vaidas.gasiunas at sap.com
Fri Dec 19 00:56:40 PST 2014


It is likely that the patch does not provide an ideal overall design for computation of live ranges. My interest is to resolve the compile-time regression in some of our large examples, which we experience since LLVM 3.3. I basically contribute a patch, which we currently apply locally, and as such it is designed to be a minimal change to resolve the regression. It preserves the original design and the algorithms, just reapplying them on a collection with a better sorted insert performance. On request of reviewers, I tried to further minimize the change by using templates to share the algorithm among set and vector-based implementations.

I think, however, that better design could be achieved by changing the overall design of LiveRangeCalc. Maybe it would be better to manipulate the segment set directly in in the LiveRangeCalc and create the LiveRange only as the last step of the computation. Also, it can be that the use of the LiveRangeUpdater in LiveRangeCalc is now obsolete, because it useless in combination with the segment set, but am not sure I am the right person to answer such questions.

Regards,
Vaidas


From: Matthias Braun [mailto:mbraun at apple.com]
Sent: Freitag, 19. Dezember 2014 00:31
To: Quentin Colombet
Cc: reviews+D6013+public+9ac3b9c10aa200c1 at reviews.llvm.org; Gasiunas, Vaidas; ghoflehner at apple.com; atrick at apple.com; llvm-commits at cs.uiuc.edu
Subject: Re: [PATCH] Speed up creation of live ranges for physical registers by using a segment set


On Dec 18, 2014, at 3:18 PM, Quentin Colombet <qcolombet at apple.com<mailto:qcolombet at apple.com>> wrote:


On Dec 18, 2014, at 3:05 PM, Matthias Braun <mbraun at apple.com<mailto:mbraun at apple.com>> wrote:


As I said: Don't let my comments stop you from committing, fixing the performance regressions right now is important, cleaning the code up/avoiding the extra pointer can be dealt with later I guess.


On Dec 18, 2014, at 9:32 AM, Quentin Colombet <qcolombet at apple.com<mailto:qcolombet at apple.com>> wrote:

Hi all,

I agree with Vaidas that moving the set outside of the LiveRange may not be that easy.
Nonetheless we should revisit the code later, I'm still quiet sure that LiveRangeCalc is not used with multiple LiveRanges at the same time as it will break down if you don't reset() it before using it on a new LiveRange...

But the point is that LiveRangeCalc is not the only user of that API, isn't it?
>From what I see in the patch you must call flushLiveRangeSet() eventually after using the set. And the only point where that happens is right after LiveRangeCalc was used to calculate new live ranges for the register units.






Regarding the complexity improvements:


for the createDeadDefs() part it may be interesting to simply append the new segments to the vector in any order and later run a sorting algorithm which also eliminates duplicates on the vector. This should reduce the O(n**2) complexity to O(n log n) and I'd imagine it to perform well in the case where only a few segments are used.


I am not so sure that would be beneficial, indeed, we would have to do this sorting for each live range, i.e., O(m n log n) and I suspect that the (log n) factor will be bigger than the cost of the set for the cases we try to optimize.
So currently we add segments in a random order. As the LiveRange segments sector is kept sorted, this results in an insertion-sort style O(n**2) algorithm for createDeadDefs(), that's why using a set instead of the vector brings this down to O(n log n). However this complicates the code as you suddenly have two datastructures and switch forward and backward between them. I'd guess that is also the reason why using this style for vregs is not beneficial.
Adding the segments in a random fashion to the vector and sorting the complete vector later would give you the same complexity without having to maintain a second datastructure on the side. However this is probably a bigger rewrite as the current LiveRange datastructure does not allow this, I'm also not sure if we have a sorting algorithm that eliminates duplicates at the same time. So this is more of an add to the TODO list proposal.

I am not sure I quite follow the arguments here.
Yes, we do have two data structures, but, unless I am mistaken, we do not switch between them and in particular do not maintain them at the same time.
1. We use the set for the initial creation.
2. We flush the set into the vector.
3. We use the vector.

After #2 the set is not used anymore.

That being said, there is certainly something we can do to improve the complexity of the current code and you may have a better opinion than me on this, since you dug into that recently. Feel free to file PRs with the proposal so that we do not forget about them.
Just having two different datastructures in LiveRange already makes you wonder constantly which of the two is relevant at a certain point in the code.

And yes the current code (before the patch) already feels too complicated for what it actually does, part of the problem is that bits and pieces are reused for live range splitting logic... I'll file a PR once the patch is in.

Greetings
            Matthias

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141219/f2bd88d4/attachment.html>


More information about the llvm-commits mailing list