[llvm-dev] Understanding and Cleaning Up Machine Instruction Bundles

Thu Oct 27 17:05:38 PDT 2016

> On Oct 27, 2016, at 2:33 PM, Matthias Braun <mbraun at apple.com> wrote:
> 
> == BUNDLE instruction / operands ==
> For many backend passes a bundle can appear as a single unit. However one important tool
> here is having an iterator over all operands of this unit.
> 
> The original RFC indicates that to achieve this without changing a big number
> of passes an additional bundle instruction is added in front of the bundle. A
> copy of all register/regmask machine operands is added to this header
> instruction. Internal reads inside the bundle are marked as such. This step is
> called finalization of a bundle.

That RFC was written before instructions had their own bundling flags.

http://lists.llvm.org/pipermail/llvm-dev/2011-December/045906.html

> The system works because the default basic block iterator moves from bundle to
> bundle skipping the instructions inside the bundle. Iterating over the operands
> will only give us the operands of the BUNDLE instruction but that is fine,
> because it basically has a copy of everything inside the bundle.

The BUNDLE instruction simply isn’t necessary to do anything you just described.
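
To illustrate the point above, here is a minimal sketch of how per-instruction bundling flags alone can support both bundle-to-bundle iteration and iteration over all operands of a bundle. It is a toy model, not LLVM code: ToyInstr, bundleEnd, bundleHeads and bundleRegs are hypothetical names invented for this example, and registers are plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction; NOT LLVM's MachineInstr.
struct ToyInstr {
  std::string Name;
  std::vector<int> Regs; // register operands, as plain numbers
  bool BundledWithPred;  // true if glued to the previous instruction
};

// Index one past the last instruction of the bundle that starts at Head.
std::size_t bundleEnd(const std::vector<ToyInstr> &Block, std::size_t Head) {
  assert(Head < Block.size() && "Head must be a valid index");
  std::size_t I = Head + 1;
  while (I < Block.size() && Block[I].BundledWithPred)
    ++I;
  return I;
}

// Walk the block one unit at a time, skipping instructions inside bundles.
// Returns the index of the first instruction of every unit.
std::vector<std::size_t> bundleHeads(const std::vector<ToyInstr> &Block) {
  std::vector<std::size_t> Heads;
  for (std::size_t I = 0; I < Block.size(); I = bundleEnd(Block, I))
    Heads.push_back(I);
  return Heads;
}

// Collect the register operands of the whole bundle starting at Head by
// visiting the bundled instructions themselves. No header copy is consulted.
std::vector<int> bundleRegs(const std::vector<ToyInstr> &Block,
                            std::size_t Head) {
  std::vector<int> Regs;
  for (std::size_t I = Head, E = bundleEnd(Block, Head); I != E; ++I)
    Regs.insert(Regs.end(), Block[I].Regs.begin(), Block[I].Regs.end());
  return Regs;
}
```

In this model a lone instruction is simply a bundle of size one, so the same two helpers cover bundled and unbundled code.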

> == When to finalize bundles; Remove the FinalizeMachineBundles pass? ==
> 
> However there is a number of remaining questions/confusion: The RFC indicates
> that the finalization step is done as a separate pass at the end of the
> register allocation pipeline. In fact a FinalizeMachineBundles pass exists but
> is not used by anyone. There is no in-tree target doing bundling before
> register allocation; the one out-of-tree target I am aware of finalizes bundles
> immediately after constructing them and is not using the separate pass.

That is a 4-5 year old bootstrapping pass, added to defer updating post-RA passes to use the newer bundle operand iterator.

> In fact I am not sure why you would even wait with the finalization and do it
> in a separate pass rather than doing it immediately after forming the bundle.
> Using the pass today does not even work as the MachineVerifier will reject the
> intermediate unfinalized state (missing internal read markers). I'd suggest getting
> rid of the pass and the idea of delegating finalization to its own pass. Any objections?

Adding a BUNDLE instruction and duplicating operands doesn’t make sense in the presence of virtual registers and live intervals.

The question is not “why do we wait to insert BUNDLEs?”

The question is “Why do we ever insert BUNDLEs?”

> == Too many different iterators ==
> 
> Another source of confusion even for experienced register allocation developers
> is that we have 3 kinds of iterators for MachineOperands:
> 
>  - There is MachineInstr::iterator which is used by the majority of passes and
>    gives you the operands of a single instruction.
>  - There is (Const)MIOperands which appears to be equivalent to
>    MachineInstr::iterator. I think we do not need a 2nd iterator and should get
>    rid of this one (the only real reason to use it today is
>    analyze{Virt|Phys}Reg() but that can be changed).
>  - There is (Const)MIBundleOperands which iterates all machine operands of all
>    instructions inside a bundle.

A pass needs to know whether it cares about bundles or instructions.
I don’t understand how adding an extra BUNDLE instruction does anything to solve this problem or make the MIR more robust.

A pass that cares about liveness, dependencies, instruction insertion or reordering needs to work on bundles.
Machine-independent passes should probably work on bundles.

By default, passes now use the bundle iterator for instructions and the non-bundle iterator for operands. That allows passes to limp along in the presence of bundles without actually handling the bundles. I think the bundles will just silently defeat optimizations. It’s not safe, but it’s not too badly broken either.

The MIBundleOperands iterator simply makes more sense to me than the BUNDLE instruction. It seems straightforward to migrate passes to the new iterator, but there are a lot of places that need updating.
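
As a rough illustration of the "limp along" case, the sketch below models a pass that steps from bundle to bundle but only inspects the operands of the first instruction, next to a bundle-aware query. Everything here is a toy with made-up names (ToyOperand, ToyInstr, headDefinesReg, bundleDefinesReg); it is not the real MIBundleOperands interface, and no BUNDLE header is assumed to exist.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy operand and instruction types; NOT LLVM's classes.
struct ToyOperand {
  int Reg;
  bool IsDef;
};
struct ToyInstr {
  std::vector<ToyOperand> Ops;
  bool BundledWithPred; // true if glued to the previous instruction
};

// Does this single instruction define Reg?
bool instrDefinesReg(const ToyInstr &MI, int Reg) {
  for (const ToyOperand &Op : MI.Ops)
    if (Op.IsDef && Op.Reg == Reg)
      return true;
  return false;
}

// What a bundle-unaware pass effectively asks: it lands on the first
// instruction of the unit and looks at that instruction's operands only.
bool headDefinesReg(const std::vector<ToyInstr> &Block, std::size_t Head,
                    int Reg) {
  assert(Head < Block.size() && "Head must be a valid index");
  return instrDefinesReg(Block[Head], Reg);
}

// What a bundle-aware pass asks: look at every instruction in the unit.
bool bundleDefinesReg(const std::vector<ToyInstr> &Block, std::size_t Head,
                      int Reg) {
  assert(Head < Block.size() && "Head must be a valid index");
  std::size_t I = Head;
  do {
    if (instrDefinesReg(Block[I], Reg))
      return true;
    ++I;
  } while (I < Block.size() && Block[I].BundledWithPred);
  return false;
}
```

For a two-instruction bundle whose second instruction defines a register, the head-only query reports no definition while the bundle-wide query finds it. That gap is the kind of information a pass loses when it does not ask for bundle operands explicitly.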

> The last one appears to be necessary in a world without the initial BUNDLE
> instruction repeating all the operands inside the bundle. In a setting where
> finalization happens as a separate pass at the end of register allocation this
> would be necessary for earlier register allocation passes.
> 
> However given that delaying finalization to a pass appears broken/unused it
> seems we could just as well use MachineInstr::iterator instead and remove
> MIBundleOperands. Any objections?

IIUC, live intervals, the register allocator, and the scheduler already handle bundles.

I’m fairly sure that adding new vreg uses is not what we want to do.
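
A small sketch of why copied header operands show up as additional register uses. It follows the description quoted earlier (a header that repeats the register operands of the bundle) in a deliberately simplified way: the toy copies every operand, whereas the real finalization step is more selective and marks internal reads. ToyInstr, countOperandsOf and addHeaderWithCopies are hypothetical names, and registers are plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction; NOT LLVM's MachineInstr.
struct ToyInstr {
  bool IsHeader;         // true for the artificial bundle header
  std::vector<int> Regs; // register operands, as plain numbers
  bool BundledWithPred;  // true if glued to the previous instruction
};

// Number of operand slots in the block that mention Reg. In this toy model
// that is the length of the register's use/def list.
std::size_t countOperandsOf(const std::vector<ToyInstr> &Block, int Reg) {
  std::size_t N = 0;
  for (const ToyInstr &MI : Block)
    for (int R : MI.Regs)
      if (R == Reg)
        ++N;
  return N;
}

// Prepend a header to the bundle starting at Head and copy every register
// operand of the bundled instructions onto it (simplified on purpose).
void addHeaderWithCopies(std::vector<ToyInstr> &Block, std::size_t Head) {
  assert(Head < Block.size() && !Block[Head].BundledWithPred);
  ToyInstr Header{true, {}, false};
  std::size_t I = Head;
  do {
    Header.Regs.insert(Header.Regs.end(), Block[I].Regs.begin(),
                       Block[I].Regs.end());
    ++I;
  } while (I < Block.size() && Block[I].BundledWithPred);
  // The old first instruction now follows the header inside the bundle.
  Block[Head].BundledWithPred = true;
  Block.insert(Block.begin() + static_cast<std::ptrdiff_t>(Head), Header);
}
```

In the toy, a register mentioned twice inside a bundle is mentioned four times once the header exists, so anything that walks a register's operand list sees entries that correspond to no real instruction.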

> == Moving to a scheme without repeating the operands in the bundle header ==
> 
> I've heard some comments that the long term plan was to move to a scheme where
> the operands inside the bundle are not repeated in a bundle header and instead
> everyone uses an iterator like MIBundleOperands. I could not find any mails
> documenting this, so it would be nice if some people could chime in here if
> that was indeed the plan.
> 
> Even with this long term plan in mind I would suggest to remove
> MIBundleOperands. If we implement this plan we should rather change
> MachineInstr::iterator later instead of being in the confusing in-between state
> that we have today.
> 
> - Matthias

I’m not sure what you mean by changing MachineInstr::iterator. Do you mean mop_iterator?

You can’t replace an instr iterator with a bundle iterator without breaking some basic invariants:
MI == MI->operands_begin()->getParent()

That’s why passes should explicitly ask for the bundle operands.
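
The invariant can be checked in a toy model where each operand keeps a pointer to its owning instruction. The sketch shows that operands reached through one instruction all point back at it, while operands reached by walking a whole bundle do not all point at the first instruction. The types and function names (ToyOperand, ToyInstr, linkParents, instrOperandsPointTo, bundleOperandsPointToHead) are invented for the example and are not LLVM's.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ToyInstr;

// Toy operand that remembers which instruction owns it.
struct ToyOperand {
  int Reg;
  const ToyInstr *Parent;
};

// Toy instruction; NOT LLVM's MachineInstr.
struct ToyInstr {
  std::vector<ToyOperand> Ops;
  bool BundledWithPred; // true if glued to the previous instruction
};

// Point every operand back at its owner. Call this only after the block has
// reached its final size, so the instruction addresses stay valid.
void linkParents(std::vector<ToyInstr> &Block) {
  for (ToyInstr &MI : Block)
    for (ToyOperand &Op : MI.Ops)
      Op.Parent = &MI;
}

// Operands reached through a single instruction: do they all point at it?
bool instrOperandsPointTo(const ToyInstr &MI) {
  for (const ToyOperand &Op : MI.Ops)
    if (Op.Parent != &MI)
      return false;
  return true;
}

// Operands reached by walking the whole bundle: do they all point at the
// first instruction of the bundle?
bool bundleOperandsPointToHead(const std::vector<ToyInstr> &Block,
                               std::size_t Head) {
  assert(Head < Block.size() && "Head must be a valid index");
  std::size_t I = Head;
  do {
    for (const ToyOperand &Op : Block[I].Ops)
      if (Op.Parent != &Block[Head])
        return false;
    ++I;
  } while (I < Block.size() && Block[I].BundledWithPred);
  return true;
}
```

Code that takes an operand from a per-instruction iterator and assumes its parent is the instruction it started from would be wrong if that iterator silently began yielding operands of other bundled instructions. An explicitly named bundle-operand iterator keeps that assumption visible at the call site.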

-Andy