[PATCH] D32563: Add LiveRangeShrink pass to shrink live range within BB.

Tue May 2 17:12:32 PDT 2017

wmi added a comment.

In https://reviews.llvm.org/D32563#743962, @atrick wrote:

> Ideally there should be a separate pass that runs on the SSA machine code (before register coalescing) to minimize register pressure and hide latency for chains of loads or FP ops. It should work across calls and loop boundaries. You could even do a kind of poor-man's cyclic scheduling this way.
>
> MISched works at the lower level of instruction groups and CPU pipeline hazards. It would be nice if MISched worked at the level of extended basic blocks (it would be easy to implement and has been done out of tree). I don't think it makes as much sense for it to work across call sites though. That is not hard to implement, but it seems it would generate large DAGs and be a bad compile-time tradeoff.
>
> MISched is not a scheduling algorithm, it's a scheduling framework. The generic scheduler is a pile of heuristics that exercise most of the functionality and seems to be working ok for several popular targets. The strategy that it takes is:
>
> - Make a single scheduling pass handling all heuristics at once. Don't reorder the instructions at all unless the heuristics identify a register pressure or latency problem.
> - Try to determine, before scheduling a block, whether register pressure or latency is likely to become a problem. This avoids the scheduler backing itself into a corner (we don't want the scheduler to backtrack).
>
>   You'll notice that this is very conservative with respect to managing compile time and preserving the decisions made by earlier passes.
>
>   You could follow that basic strategy and simply adjust the priority of heuristics for your target. You can add better up-front analysis to detect code patterns for each block before prioritizing heuristics. Or you could implement a completely different strategy. For example, schedule for register pressure first, then reschedule for ILP.
>
>   I probably won't be able to help you much more than that. Matthias has taken over maintenance of MISched. I think it would help if you give more background on your situation (sorry if I haven't paid attention or have forgotten). Is this PowerPC? Inorder/out-of-order?

Thank you so much for introducing the design philosophy of MISched. It really helps.

As for the background on our situation: the problem we are trying to solve is that the reassociation pass unnecessarily introduces a lot of live range interference because of the way it inserts intermediate results. The testcase exposing the problem is quite special in that the instructions we care about are all in the same BB. Most of the defs have only one use and most of the uses have only one def. So Dehao proposed a limited live range shrinking pass in the LLVM IR phase to solve it. We may extend the live range shrinking in the future, because we have seen other cases that would benefit from more general live range shrinking.
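
To make the interference problem concrete, here is a toy sketch in Python. It is not the code from this patch: the block model, the names `max_live` and `shrink`, and the hoisting heuristic are all assumptions made for illustration, and it ignores calls, side effects and memory dependencies. It counts how many values are live at once when all the loads come first, and again after each instruction is moved up next to the definition of its last operand.

```python
def max_live(order, uses):
    """Peak number of simultaneously live values for an instruction order.

    order: list of instruction names; each instruction defines one value
           with the same name (hypothetical toy model).
    uses:  dict mapping an instruction name to the operands it reads.
    A value with no use inside the block is treated as live-out.
    """
    pos = {name: i for i, name in enumerate(order)}
    last_use = {name: len(order) for name in order}  # default: live-out
    seen_use = set()
    for name in order:
        for op in uses.get(name, []):
            if op not in pos:
                continue  # operand is a live-in, not tracked here
            if op in seen_use:
                last_use[op] = max(last_use[op], pos[name])
            else:
                last_use[op] = pos[name]
                seen_use.add(op)
    peak = 0
    for i in range(len(order)):
        # A value is live after instruction i if it is already defined
        # and its last use comes later.
        live = sum(1 for d in order if pos[d] <= i and last_use[d] > i)
        peak = max(peak, live)
    return peak


def shrink(order, uses):
    """Move each instruction right after the latest definition of its operands.

    A simple assumed heuristic for the single-def/single-use case; it is
    only legal here because the toy model has no side effects.
    """
    result = []
    for name in order:
        ops = [op for op in uses.get(name, []) if op in result]
        if not ops:
            result.append(name)
            continue
        idx = max(result.index(op) for op in ops) + 1
        result.insert(idx, name)
    return result


# t3 = ((a + b) + c) + d, with every load emitted before any add.
uses = {"t1": ["a", "b"], "t2": ["t1", "c"], "t3": ["t2", "d"]}
before = ["a", "b", "c", "d", "t1", "t2", "t3"]
after = shrink(before, uses)

print(before, max_live(before, uses))
print(after, max_live(after, uses))
```

In this example the peak drops from 4 live values to 2 once the adds are interleaved with the loads. A heuristic this simple will not help every block, which is part of why a view of the whole basic block matters.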

Sanjay and Andrea pointed out that scheduling may already achieve the same result as the shrinking pass we proposed, at least for the motivating testcase, so we looked into MISched. Dehao found that even when we enable the existing generic scheduler to schedule across calls, the more complex DAG leads to a better but still suboptimal solution, and the suboptimality is not easy to fix without a global picture of the whole basic block. This inclines us to implement it as a separate pass.

The problem was found on x86, which is the target we care about the most right now, but I think the problem is platform independent.

https://reviews.llvm.org/D32563