[llvm-dev] [RFC] Introducing classes for the codegen driven by new pass manager

Fri Jul 10 19:44:05 PDT 2020

(NPM: new pass manager;  LPM: legacy pass manager)

Hello, community

While we're still working towards using NPM for optimizer pipeline by default, we still don't have a machine pass interface and the corresponding machine pass manager using NPM. The potential benefits using NPM aside, this inhibits us from making any progress on deprecating LPM for the codegen pipeline which blocks removing LPM altogether. The purpose of this series of patches is to (1) let pass developers write or port machine passes to a new machine pass interface and be able to test it with `llc`. (2) let a target have the choice of implementing the codegen pipeline using NPM (Work-in-Progress). Maybe it is obvious already, but I also want to mention that these patches do not intend to force a target to migrate to NPM right way.

It would be also great to know how many people want this to happen and any comments/suggestions are greatly appreciated. Thanks a lot in advance!

** Goal **
(1) a machine pass interface for NPM that existing machine passes could be ported to.
(2) a machine pass manager
(3) a codegen pipeline pass builder with NPM
(4) llc-driven codegen pipeline with NPM (`llc -enable-new-pm`). Existing command-line flags supported by `llc` should continue to work when `-enable-new-pm` is specified.

** Non-Goal (community effort) **
 - port existing codegen passes to NPM. This needs community effort and domain expertise to do correctly and efficiently. This includes target-independent codegen IR/machine passes and target-specific codegen IR/machine passes.
 - 100% codegen feature parity between LPM and NPM. It's very likely that I did not port some features in this work. A few I'm aware of are `MachineFunctionProperties`, `-debugify-and-strip-all`, and "machine pass plugins". However, if the community is convinced this work is the right design, these missing features could be added by interested parties.

** Implementation / Design Choices **

* Goal-1 *
https://reviews.llvm.org/D67687

Four member methods of a machine pass are recognized by the machine pass manager:
(1) `PreservedAnalyses run(MachineFunction &, MachineFunctionAnalysisManager &)`. Majority of the machine passes use this.
(2) `Error doInitialization(Module &, MachineFunctionAnalysisManager &)`. Passes like AsmPrinter needs a hook to lower/transform global constructs. (invoked before all passes `run` method)
(3) `Error doFinalization(Module &, MachineFunctionAnalysisManager &)`. Client: PrintMIRPass.  This is also for completeness. (invoked after all passes `run` method)
(4) `Error run(Module &, MachineFunctionAnalysisManager &)`. Client: MachineOutliner, DebugifyMachineModule. I would call this machine module pass which needs a global scope. It is like (1) but subject to pass ordering. Currently, a pass either define this or (1), but not both.

(doInitialization/doFinalization is currently not supported by the NPM optimizer pipeline because there is no real need for it.
http://lists.llvm.org/pipermail/llvm-dev/2018-September/126504.html)

* Goal-2 *
https://reviews.llvm.org/D67687

Unlike IR where `has-a` relationship exists among module/function/loop/cgscc etc., the MIR does not have `has-a` relationship with any kind of IR. It does have a one-on-one relationship with IR function. So, transforming MIR does not change any IR unit at all. Invalidating a MIR analysis result also does not invalidate any IR analysis result.

Based on the above observation, the machine pass manager runs standalone, i.e. combining it with other types of pass managers using adaptor passes are not supported. There is also no proxy defined for machine analysis manager with any other types of analysis managers. The machine analysis manager does provide API to register and query IR analysis because machine passes need IR analysis result in general.

* Goal-3 *
https://reviews.llvm.org/D83608
https://reviews.llvm.org/D83613  (X86 part)

The codegen pipeline is different from optimizer pipeline in that it is customizable by a target in the granularity of (1) portions of a pipeline, (2) insert/replace/disable individual pass. Currently, the main class for this job is `class TargetPassConfig`.

`class TargetPassConfig` could be broken down into three parts:
(1) a polymorphic class for a target to customize codegen pipeline
(2) a bunch of member functions and variables to support debugging features (MIR print/verify, llc -start-before/-stop-before/-start-after/-stop-after, etc.)
(3) a legacy immutable pass to declare hooks for targets to customized some target-independent codegen layer behavior.

A new mixin called `class CodeGenPassBuilder` is introduced to replace `class TargetPassConfig`(1), the naming is intended to be similar to its IR counterpart: `class PassBuilder`. `class CodeGenPassBuilder` is CRTP mixin instead of polymorphic class because it brings less layering, flexibility for targets to customized, inlining opportunity and fits the overall NPM value semantics design. There are potential compile-time issues, but I don't think it is a major concern at this moment. The design is not intrusive to make a change inhibitive.

`class TargetPassConfig`(2) is implemented using PassInstrumentation as much as possible because PassInstrumentation is commonly used for debugging features already. This makes `class CodeGenPassBuilder` only do what it is supposed to do. PassInstrumentation is also more flexible than inserting debugging passes into the pipeline.

`class TargetPassConfig`(3) is partially ported to TargetMachine::options. The rest, such as `createMachineScheduler/createPostMachineScheduler`, are left out for now. They should be implemented in LLVMTargetMachine in the future.

Since the machine pass manager runs standalone and codegen process includes both IR pipeline(codegen prepare, EH, etc.) and MIR pipeline, the new codegen pipeline uses a pair of module pass manager and machine pass manager.

Like `PassRegistry.def`, there is a `MachinePassRegistry.def` which includes target-independent codegen IR/machine passes. Passes that not yet ported not NPM have a dummy no-op implementation (to help testing). Target would have their own pass registry file. For X86, it is `X86PassRegistry.def`.

* Goal-4 *
https://reviews.llvm.org/D83610 (TargetMachine API)
https://reviews.llvm.org/D83612 (llc)

`llc` option compatibility is very important for LIT tests migration. So `llc -enable-new-pm` accepts almost all options `llc` accept. 

** Testing **
- Since the `llc` options are compatible, as passes are ported to NPM and various issues got resolved, we should see more tests passing when `llc -enable-new-pm` is turned on implicitly via an (maybe) knob similar to `cmake -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER`.
- A buildbot to make sure no regression when `llc -enable-new-pm` is implicitly on?
- Any idea on this regard is much appreciated.

** Work in progress **
Have a minimum codegen pipeline to produce a valid assembly file by porting target-independent passes such as AsmPrinter.cpp, SelectionDAGISel.cpp etc. Currently, target-independent codegen layer (SelectionDAGISel, AsmPrinter etc..) are coupled with LPM tightly.

** Miscellaneous **
- There is a need to force codegen to run according to the callgraph. Currently, this is achieved by adding a dummy CGSCC pass as the container pass of codegen passes. In NPM, this could be implemented by pre-sorting IR functions by callgraph order in the machine pass manager.
- MachinePassRegistry/RegisterRegAlloc/RegisterScheduler. These are for dynamic registering machine passes and also used for in-tree machine passes. Personlly I don't think this is needed and similar opinions were expressed before (https://reviews.llvm.org/D54203#1291969). We should probably just register them as normal passes.
- Timing machine passes. (StandardInstrumentation handle this already)