[PATCH] D67687: [NewPM][CodeGen] Introduce machine pass and machine pass manager

Mon Jul 20 21:16:40 PDT 2020

ychen added a comment.

Thanks for chiming in @aeubanks.

In D67687#2162962 <https://reviews.llvm.org/D67687#2162962>, @aeubanks wrote:

> After thinking about this some more, I still don't understand why `MachineFunction`s aren't modeled as contained within a `Module` and don't really see the benefit of all this custom logic. Looking at some legacy passes like `MachineOutliner`, it's a `ModulePass` that finds all machine functions itself through the module.

`MachineOutliner` doesn't really need to be a module pass i.e. working on a Module. It just needs a view of all machine functions. Considering it is a rare need and the legacy pass manager does not support this directly. Using `ModulePass` is a workaround.

> The passes that you gave as examples that have some sort of `doInitialization`/`doFinalization` could just become a module pass (I'm thinking about doing that for `objc-arc` which has a `doInitialization`)?

I think very likely yes. But as I mentioned in the thread, they have some differences in when they are got called. I'm just being defensive here that defining `doInitialization`/`doFinalization` in the initial porting phase. If afterward, we have a lot of things working and `doInitialization`/`doFinalization` turns out to be not needed, we could remove them easily. But still, I'm open to not keeping it if that's the consensus.

> Maybe I'm missing something, but going toward the route of modeling modules as containing machine functions and reusing more of the existing infra and less custom stuff specifically for codegen seems better. Then the only thing you'd have to do is add some pass manager/analysis manager typedefs, create a `ModuleToMachineFunctionPassAdaptor`, and add some proxies to `PassBuilder::crossRegisterProxies`.
>  That would make it less confusing and more consistent than having two different `run()` methods here.
>  And it would make it easier to share analyses between the different pipelines, by adding the machine function pass manager as part of the overall module pass manager (e.g. in PassBuilder.cpp).
> 
> You mentioned
> 
>> Because their data structures are separated. Logically they are representations of the same/similar thing at different abstraction level.
> 
> in the email thread. Do you mean that there should be some sort of `MachineModule`, instead of `Module` corresponding to both IR and MIR?

By saying "their data structures are separated" I meant that modifying/transforming MIR does not cause any change to IR. MIR does not belong to IR class hierarchy. It is not like changing an IR loop also meaning changing its containing IR function or module.

Also, in MIR, there is basically one kind of IR unit (in NPM terminology): machine function. IR has many kinds of IR units: basic block(relatively small scope not so useful for optimization), loop, function, cgscc, module.

In the MIR pipeline (what this patch is about), the IR is not changed and not ran. So all IR analyses results are preserved in MIR pipeline. Modifying MIR could only invalidate MIR analysis results but not IR analyses results.

Based on the above, the requirement for MIR pipeline about IR is that 1) we don't run any IR transformation pass; 2) we ran machine pass initiated IR analyses; 3) we could not invalidate IR analyses.

Defining a proxy class does not do any actual work(the invalidation part is not needed, the rest are just querying the delegated pass manager) although it is definitely conceptually appealing. Likewise, a potential adaptor class (maybe only `ModuleToMachineFunctionPassAdaptor` as you mentioned) is possible but other than iterating MIR functions from an IR module, it could not do anything else that an adaptor class is supposed to do like aggregating passes preserved analyses and report it back to containing pass manager. The MachineFunctionPassManager is roughly `ModuleToMachineFunctionPassAdaptor` + `PassManager<MachineFunction, MachineFunctionAnalysisManager>` combined.

In conclusion, it is possible to use proxy and adaptor classes here and make things work. But IMHO it is unnecessary coupling into IR pass manager infra. Although it is appealing conceptually, it could also cause confusion. Actually, without the `doInitialization`/`doFinalization` stuff here, all existing code is necessary even if we changed to use proxy and adaptor classes.

Please let me know if this makes any sense to you. :-)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67687/new/

https://reviews.llvm.org/D67687