[LLVMdev] RFC: Pass Manager Redux

Wed Jul 11 03:41:50 PDT 2012

Hi Chandler, this seems sound to me.  For example, consider running function
passes.  Currently it works like this: if you schedule two function passes in
succession, FP1 and FP2, then for each function F, FP1 is run on F then FP2 is
run on F.

In your new scheme, if you schedule FP1 followed by FP2, then each will act as
a module pass and thus: for each function F, FP1 is run on F.  Once this is
done, then for each function F, FP2 is run on F.

To get the previous scheduling, you would create a function pass manager FPM,
which would itself be a function pass, and add FP1 and FP2 to FPM.  When FPM
is run on a function, what it would do is: run FP1 on the function, then run
FP2 on the function.  Scheduling FPM, which would act as a module pass, would
thus do: for each function F, run FPM on it.  I.e. this would do: for each
function F, run FP1 on F then run FP2 on F.  In short, you can easily implement
the current behaviour in a simple, natural and explicit way.
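
To make the two schedules concrete, here is a minimal sketch in C++.  None of
this is real LLVM code or the interface you are proposing: Function, Module,
Trace, NamedPass and both manager classes are hypothetical stand-ins I made up
for illustration, and the passes only log "pass:function" so the order can be
inspected.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for llvm::Function and llvm::Module.
struct Function { std::string Name; };
struct Module { std::vector<Function> Functions; };

// Records "pass:function" entries so the execution order is visible.
using Trace = std::vector<std::string>;

// Assumed interface: every pass can be run over a whole module.
struct ModulePass {
  virtual ~ModulePass() = default;
  virtual void runOnModule(Module &M, Trace &T) = 0;
};

// A function pass acts as a module pass by visiting every function.
struct FunctionPass : ModulePass {
  virtual void runOnFunction(Function &F, Trace &T) = 0;
  void runOnModule(Module &M, Trace &T) override {
    for (Function &F : M.Functions)
      runOnFunction(F, T);
  }
};

// Toy pass that only logs its own name and the function it visited.
struct NamedPass : FunctionPass {
  std::string Name;
  explicit NamedPass(std::string N) : Name(std::move(N)) {}
  void runOnFunction(Function &F, Trace &T) override {
    T.push_back(Name + ":" + F.Name);
  }
};

// The function pass manager is itself a function pass: it runs all of its
// passes on one function before the next function is visited.
struct FunctionPassManager : FunctionPass {
  std::vector<std::unique_ptr<FunctionPass>> Passes;
  void add(std::unique_ptr<FunctionPass> P) {
    assert(P && "cannot add a null pass");
    Passes.push_back(std::move(P));
  }
  void runOnFunction(Function &F, Trace &T) override {
    for (auto &P : Passes)
      P->runOnFunction(F, T);
  }
};

// Top-level manager: runs each scheduled pass as a module pass, in order.
struct ModulePassManager {
  std::vector<std::unique_ptr<ModulePass>> Passes;
  void add(std::unique_ptr<ModulePass> P) {
    assert(P && "cannot add a null pass");
    Passes.push_back(std::move(P));
  }
  Trace run(Module &M) {
    Trace T;
    for (auto &P : Passes)
      P->runOnModule(M, T);
    return T;
  }
};

Module makeModule() {
  Module M;
  M.Functions = {Function{"f"}, Function{"g"}};
  return M;
}

// New scheme, FP1 then FP2 scheduled directly: each acts as a module pass.
Trace runSeparately() {
  Module M = makeModule();
  ModulePassManager MPM;
  MPM.add(std::make_unique<NamedPass>("FP1"));
  MPM.add(std::make_unique<NamedPass>("FP2"));
  return MPM.run(M);
}

// Previous scheduling recovered by nesting FP1 and FP2 inside an FPM.
Trace runNested() {
  Module M = makeModule();
  auto FPM = std::make_unique<FunctionPassManager>();
  FPM->add(std::make_unique<NamedPass>("FP1"));
  FPM->add(std::make_unique<NamedPass>("FP2"));
  ModulePassManager MPM;
  MPM.add(std::move(FPM));
  return MPM.run(M);
}
```

With functions f and g, runSeparately() visits them in the order FP1:f, FP1:g,
FP2:f, FP2:g, while runNested() gives FP1:f, FP2:f, FP1:g, FP2:g, which is the
current behaviour.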

Ciao, Duncan.

On 11/07/12 11:50, Chandler Carruth wrote:
> Greetings folks!
>
> In working on a new optimization pass (splitting cold regions into separate
> functions based on branch probabilities) I've run into some limitations of the
> current pass manager infrastructure. After chatting about this with Nick, it
> seems that there are some pretty systematic weaknesses of the current design and
> implementation (but not with the fundamental concepts or behavior).
>
> Current issues:
>
> - Arbitrary limitations on how passes can depend on an analysis: module passes
> have a gross hack to depend on function pass analyses, and CGSCC passes are just
> out of luck.
>
> - Poor caching of analysis runs across pass manager boundaries. Consider the N
> iterations on the SCC and the function passes run by the CGSCC pass manager.
> Even if every CG and function pass in the manager preserves a function pass
> analysis X, that pass will be re-run on each iteration of the SCC because it is
> scheduled in the function pass manager, not the CG pass manager. If we solve the
> previous item, that will make this one a *serious* problem suddenly.
>
> - The structure of the pass management tree is very non-obvious from code. The
> pass manager builder simply adds a linear sequence of passes that happens to
> build the appropriate stacks of passes at the appropriate times.
>
> - Currently the interfaces and implementation of passes and pass managers are
> overly complex: multiple inheritance, lots of virtual dispatch, etc. They don't
> use the canonical LLVM 'isa' and 'dyn_cast' techniques.
>
>
> I'd like to fix these issues and build a new pass manager implementation
> designed around the following core concepts:
>
> - We should have clear tracking and statistics for pass run count, analysis
> invalidation, etc.
>
> - Analysis scheduling and re-use is fundamentally a *caching* and dependency
> problem, and we should structure it as such.
>    - Non-preserving passes should invalidate the cache
>    - The cache should be capable of spanning any particular pass management
> boundary when needed.
>    - We should be able to trade memory for speed and cache more analyses when
> beneficial.
>    - The infrastructure should at least *support* a lazier approach to analyses,
> so that we can do more to avoid computing them at all.
>
> - PassManagerBuilder should use an explicit nested syntax for building up the
> structure of the passes so it is clear when a pass is part of a CGSCC pass
> manager, or when it is a normal function pass.
>
> - Clear hierarchy of "Pass" interfaces. Every pass should be capable of
> acting-as-if it is a higher level pass. That is, a function pass should be
> capable of acting-as-if it is a module pass just by running over all functions
> in the module. That doesn't mean this should regularly be *used*, but it makes
> conceptual reasoning about passes and testing of passes much clearer.
>
> - PassManagers should *be* passes, and serve as pass-aggregation strategies and
> analysis caching strategies. Where a function pass *can* act as a module pass,
> you usually instead want a function pass manager, which will collect a sequence
> of function passes and run all of them over each function all at once. This
> aggregation strategy increases locality. Similarly, a CGSCC pass manager
> aggregates the pass runs over an SCC of the call graph. Each pass manager could
> influence the caching strategy as well, for example the CGSCC pass manager might
> cache a function analysis pass over an entire SCC, rather than just over one
> function.
>
> - Single, LLVM-style inheritance model for the whole thing.
>
> - Users should be able to add new pass managers, and compose them cleanly with
> the LLVM-provided pass managers. Currently, many implementation details are
> buried in a way that makes this hard.
>
> - What else did I miss?
>
>
> Things that are totally out of scope: pass registration, the current pass order
> and structure, the fundamental interface of mapping from a pass to an analysis,
> etc. This is really about pass management and scheduling.
>
>
> If folks generally like where this is going, I'll get to work writing up initial
> code. The first thing I would do is provide an example collection of interfaces
> for the passes and pass managers to make sure we're all on the same page. By
> then I should have a decent idea about the best strategy for cutting this into
> the actual codebase.