[LLVMdev] PM: High-level review of the new Pass Manager (so far)

Tue Jun 17 17:54:37 PDT 2014

Hi Chandler,

This is a high-level review of the new WIP `PassManager` infrastructure.

For those that haven't dug into Chandler's commits, here's a very
high-level overview (assuming IIUC):

  - The driver supports simple declarative syntax for specifying passes
    to run.  E.g., `module(a,b,function(c,d),e)` runs module passes `a`
    and `b`, then function passes `c` and `d` for each function, and
    then module pass `e`.

  - There is a new concept of an AnalysisManager (AM), which runs
    analyses on-demand and caches the results.

  - Analysis passes are split conceptually from transformation passes.
        - Analysis:       run: (IRUnit*, AM*) -> Result
        - Transformation: run: (IRUnit*, AM*) -> PreservedAnalyses

    Note that neither of these matches the old interface, which was
    runOnIRUnit: IRUnit* -> bool.

  - PassManagers interoperate via adaptors.  E.g.,
    ModuleToFunctionPassAdaptor is a module transformation pass that
    contains a FunctionPassManager (with some set of function passes).

  - AnalysisManagers interoperate via proxies.  E.g.,
    FunctionAnalysisManagerModuleProxy is a module analysis pass that
    forwards to a FunctionAnalysisManager.

  - LazyCallGraph and ModuleToPostOrderCGSCCPassAdaptor collude to visit
    SCCs in post-order, including API for updating the SCC-graph
    on-the-fly without invalidating the traversal.

I think what's done generally looks great.  In particular, the
declarative syntax is awesome and it's a huge step to have an
AnalysisManager that can reason about analyses separately from the pass
pipeline.  I haven't analysed the algorithms in LazyCallGraph carefully,
but I think it's great that they keep the traversal valid amid a
changing call graph.

Questions and concerns:

 1. PassConcept requires a `name()` for passes.  Why don't
    AnalysisPassConcept and AnalysisResultConcept?

 2. AnalysisManagerBase defines two versions of `invalidate()` -- one of
    which takes a `Module*` (instead of `IRUnitT`).  It's not clear to
    me why the analysis managers (other than ModuleAnalysisManager) need
    this.  What am I missing?

 3. The number of lines of code to enable transformation pass managers
    (and analysis managers) to interoperate seems to scale quadratically
    in the number of types of IRUnit.  Am I reading this wrong?  Is
    there a plan to manage that, or is it an open problem?  (So far the
    infrastructure only supports Module, SCC, and Function.  What
    happens (e.g.) when we add BasicBlock passes?)

 4. Previously, (e.g.) function analysis passes would typically hold
    results only for one function -- when the next function was
    analyzed, the previous function's results would be invalidated.
    Now, the FunctionAnalysisManager will happily store results for
    every function.  (Looking forward, BasicBlockAnalysisManager might be
    able to store results for every BasicBlock.)

    This is powerful, but costs memory.  What's the plan for keeping
    memory usage in check (especially for large modules)?

 5. In the old infrastructure, `CGPassManager::runOnModule()` (in
    `CallGraphSCCPass.cpp`) has a do-while loop that re-runs all passes
    on a given SCC as long something new gets devirtualized, up to a
    maximum number of iterations.  I don't see the equivalent loop in
    CGSCCPassManager, and I don't see an obvious way to accomplish the
    same optimization cycle.

      - How will we re-implement this loop?  (Or did I miss it?)
      - There's a potential infinite loop between devirtualization and
        inlining on recursive virtual functions.  How will we avoid it?

 6. There aren't any machine-level pass managers yet.  Will they look
    about the same?  (Are there open problems to solve there?)

 7. Andy asked in a separate thread [1] (my words): "Does this
    infrastructure allow for invalidation of classes of analyses?"
    (E.g., invalidate all analyses whose results depend on the
    control-flow graph.)  Your response there [2] was (my words): "Not
    directly, but `AnalysisManager` will be a good place for that API."

    I thought a little about this, and it's not clear to me how to fit
    this API into AnalysisManager cleanly.

    One side seems straightforward.  Assuming you have a transformation
    pass that modifies the CFG, tell the AnalysisManager that you've
    modified the CFG, and return that all analyses have been preserved
    (since everything else has).  This side is easy to make opt-in.

    The other side is not opt-in, though.  *All* analyses that depend on
    the CFG must respond, or the invalidation mechanism doesn't work.

 8. `CGPassManager` was one place where the old return value for a
    transformation pass (`bool` representing "did-change") was useful.
    Are there any others?  Is the API using `PreservedAnalyses::all()`
    as a stand-in for "did-not-change"?  If so, does that conflict with
    adding more nuanced API for on-the-fly invalidation of analyses?

-- dpnes

[1]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-June/073767.html
[2]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-June/073785.html