[llvm-dev] loop transforms and function analyses; especially divergence analysis

Wed Feb 17 21:02:46 PST 2021

---- On Thu, 18 Feb 2021 01:10:51 +0530 Alina Sbirlea <alina.sbirlea at gmail.com> wrote ----

 > IMO there are only two options that make sense as far as the cost involved.
 > The first, most straight-forward and cheap but a fairly big hammer, is to skip non-trivial unswitching for targets with divergence, as Arthur suggested.
 > The second, more expensive, is to compute DA inside SimpleLoopUnswitch inside the `if (TTI.hasBranchDivergence())` clause, before defaulting to `return false`.

Yes, those are the only options available under the current state of the new pass manager, but neither is very attractive. We end up either not doing anything or recomputing a function analysis on every loop. Both are just workarounds for functionality that was not ported over from the old PM to the new PM.

I am not sure if you had a chance to consider my comments on the review, but perhaps this is a better place to continue that discussion:

In the legacy pass manager, the function LoopPass::preparePassManager() actually makes sure that if a loop transform T running inside a loop pass manager M invalidates analyses used by other passes in M, then T is split out into a separate loop pass manager M'. Thus every instance of loop pass manager is responsible for making sure that the required function analyses are recomputed before starting any loop passes. There doesn't seem to be a way to do something similar in the new PM. A loop pass should be able to isolate itself from side-effects of other loop passes on function analyses. The list of standard analyses serves as a good contract for heavily used analyses, but divergence analysis is not used frequently enough to justify adding it there.

The missing functionality can be reimplemented as follows: 

1. Allow a loop pass to declare required function analyses that are not part of the standard list, by returning a list in some function, say "getRequiredOuterAnalyses()". For divergence analysis, this function needs to check TTI because the requirement is target dependent. We can do that by passing the standard analyses as an argument.

2. Rather than adding passes to an LPM directly, an outer manager should add them through a proxy. This proxy can own more than one LPM and has the ability to enqueue function analyses between these LPMs.

3. Whenever addPass() is called on this proxy, it should check the loop pass to see if it requires any non-standard analyses. If yes, it should start a new LPM and enqueue the required analyses before the new LPM.

This is pretty much what the old PM does to handle such a dependency.
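
To make the three steps concrete, here is a rough standalone sketch of the splitting logic. It is a toy model and not real LLVM code: StandardAnalyses, LoopPass, LoopPassManager, Stage, LoopPassProxy and unswitchRequirements are all hypothetical stand-ins, analyses are represented by plain strings, and the only name taken from the proposal is getRequiredOuterAnalyses(). It only records which analyses would be enqueued before which LPM; it does not run anything.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for the standard analyses handed to loop passes.
// Only the TTI divergence bit is modelled here.
struct StandardAnalyses {
  bool HasBranchDivergence;
};

// Toy loop pass: a name plus the hook proposed in step 1.
// An empty hook means "standard analyses only".
struct LoopPass {
  std::string Name;
  std::function<std::vector<std::string>(const StandardAnalyses &)>
      getRequiredOuterAnalyses;
};

// Toy LPM: just an ordered list of pass names.
struct LoopPassManager {
  std::vector<std::string> Passes;
};

// One segment of the pipeline: function analyses to compute first,
// then the LPM that runs with those results available.
struct Stage {
  std::vector<std::string> AnalysesBefore;
  LoopPassManager LPM;
};

// The proxy from steps 2 and 3. It owns several LPMs and starts a new
// one whenever an added pass asks for a non-standard function analysis.
class LoopPassProxy {
  StandardAnalyses AR;
  std::vector<Stage> Stages;

public:
  explicit LoopPassProxy(StandardAnalyses AR) : AR(AR) {}

  void addPass(const LoopPass &P) {
    assert(!P.Name.empty() && "pass needs a name");
    std::vector<std::string> Required;
    if (P.getRequiredOuterAnalyses)
      Required = P.getRequiredOuterAnalyses(AR);
    // Split here so the requested analyses are computed fresh, after any
    // earlier loop passes have finished transforming the function.
    if (Stages.empty() || !Required.empty()) {
      Stage S;
      S.AnalysesBefore = Required;
      Stages.push_back(S);
    }
    Stages.back().LPM.Passes.push_back(P.Name);
  }

  const std::vector<Stage> &stages() const { return Stages; }
};

// Hypothetical requirement hook for an unswitching pass: divergence
// analysis is requested only on targets that report branch divergence.
std::vector<std::string> unswitchRequirements(const StandardAnalyses &AR) {
  if (AR.HasBranchDivergence)
    return {"divergence"};
  return {};
}
```

With this model, adding "licm", then an unswitch pass that uses unswitchRequirements, then "loop-deletion" gives two stages on a divergent target (the second one preceded by "divergence") and a single stage on a target without divergence. The sketch deliberately covers only the "requires" side from step 3; splitting because an earlier pass invalidates an analysis, as the legacy PM also does, would need preserved-analysis information that is not modelled here.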

The class FunctionToLoopPassAdaptor is a good candidate ... it already does non-trivial things like running a canonicalization pipeline and is responsible for iterating over loops. It seems like the right place to split the flow into multiple LPMs.

Sameer.