[llvm-dev] [GlobalISel] A Proposal for global instruction selection

Mon Nov 30 12:57:53 PST 2015

----- Original Message -----

> From: "Quentin Colombet" <qcolombet at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Monday, November 30, 2015 1:34:20 PM
> Subject: Re: [llvm-dev] [GlobalISel] A Proposal for global
> instruction selection

> Hi Hal,

> The alias information is a good example of the MI to IR back links,
> thanks for pointing that out.

> > On Nov 26, 2015, at 12:58 PM, Hal Finkel < hfinkel at anl.gov > wrote:

> > Hi Quentin,

> > First, thanks a lot for working on this! This is obviously a
> > really important problem.

> > One thought:

> > + /// *** This is to support:
> > + /// *** Self contained machine representation, no back links to LLVM IR.
> > + /// Import the attribute from the IR Function.
> > + AttributeSet AttributeSets; ///< Parameter attributes

> > I fully support better modeling of functions without ties to IR-level
> > functions. This will allow very late outlining, multiversioning,
> > etc., and there are good use cases for these things. That having
> > been said, I think we should have a narrower scope for severing MI
> > <-> IR ties, because at least one important link will continue to
> > exist: MMOs used to provide access to IR-level alias analysis. This
> > is critical to good instruction scheduling, memory-access merging,
> > etc., and replicating AA at the MI level is not feasible.

> Honestly, although I understand why we have the MMOs right now, I
> don’t think this is a clean design and I would rather have an AA
> working at the MI level or, better, a different way of passing the
> information to MI.
> I don’t have anything in mind for how to pass the information if we
> choose that path, but I think it would be important to get rid of
> the MI -> IR link for aliasing purposes, because we end up with,
> IMHO, ugly code where a Machine pass patches the IR to fix the alias
> information. E.g., in the stack coloring pass:

> // AA might be used later for instruction scheduling, and we need it to be
> // able to deduce the correct aliasing relationships between pointers
> // derived from the alloca being remapped and the target of that remapping.
> // The only safe way, without directly informing AA about the remapping
> // somehow, is to directly update the IR to reflect the change being made
> // here.
> Instruction *Inst = const_cast<AllocaInst *>(To);
> if (From->getType() != To->getType()) {
>   BitCastInst *Cast = new BitCastInst(Inst, From->getType());
>   Cast->insertAfter(Inst);
>   Inst = Cast;
> }

> Therefore, I would prefer having the alias information expressed as
> something decoupled from the IR and that could be updated.

> What do you think?

I certainly agree that it is ugly, and as the author of the comment you've highlighted above, I would love to have a better solution. The only problem is that I don't have a good idea of how such a solution might work; prerecording all N^2 possible aliasing queries is impractical. I don't think that rewriting our current AA to work on MI, or even refactoring the current AA logic to work in terms of abstractions over both IR and MI, is really possible, because we need it to function even after MI has dropped out of SSA form and PHI elimination has happened.

That having been said, prerecording query results still seems like the best solution, but it can't be the naive N^2 approach. We have to exploit the fact that the query results will only be used to disambiguate otherwise-undecidable aliasing queries for the purpose of merging, scheduling, etc.; within those use cases, we may be able to constrain the problem enough to make prerecording the query results practical.
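
To make the constrained-prerecording idea a bit more concrete, here is a minimal sketch in C++. It assumes (this is an illustration, not something established in this thread) that MI-level clients only ever ask about pairs of memory accesses within the same basic block. Every name in it (MemAccess, PrerecordedAliasTable, the oracle callback) is hypothetical and is not an existing LLVM API. Only NoAlias answers are stored; anything not recorded falls back to the conservative MayAlias.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <utility>
#include <vector>

// All names below are hypothetical; this is not an existing LLVM API.
enum class AliasResult { NoAlias, MayAlias };

// A memory access, identified by a stable ID and the basic block it is in.
struct MemAccess {
  std::uint32_t Id;
  std::uint32_t Block;
};

class PrerecordedAliasTable {
  // Only proven NoAlias pairs are stored; anything else is MayAlias.
  std::unordered_set<std::uint64_t> NoAliasPairs;

  // Order-independent key for an unordered pair of access IDs.
  static std::uint64_t key(std::uint32_t A, std::uint32_t B) {
    if (A > B)
      std::swap(A, B);
    return (static_cast<std::uint64_t>(A) << 32) | B;
  }

public:
  // Ask the IR-level oracle once per same-block pair, while the IR is
  // still available.
  // Assumption: cross-block pairs are never queried by MI-level clients.
  template <typename Oracle>
  void record(const std::vector<MemAccess> &Accesses, Oracle IRAlias) {
    for (std::size_t I = 0; I < Accesses.size(); ++I)
      for (std::size_t J = I + 1; J < Accesses.size(); ++J) {
        if (Accesses[I].Block != Accesses[J].Block)
          continue;
        if (IRAlias(Accesses[I].Id, Accesses[J].Id) == AliasResult::NoAlias)
          NoAliasPairs.insert(key(Accesses[I].Id, Accesses[J].Id));
      }
  }

  // MI-level query: conservative unless a NoAlias answer was recorded.
  AliasResult query(std::uint32_t A, std::uint32_t B) const {
    return NoAliasPairs.count(key(A, B)) ? AliasResult::NoAlias
                                         : AliasResult::MayAlias;
  }

  std::size_t size() const { return NoAliasPairs.size(); }
};
```

Under that assumption the table holds at most one entry per same-block pair, so its size is bounded by the sum of the squared per-block access counts rather than by N^2 over the whole function. Whether that is small enough in practice is exactly the open question.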

The bad news, however, is that prerecording the query results does not remove the comment in stack coloring, but just changes it to discuss operating on some MI-level data structure that is not the IR. 

Thanks again, 
Hal 

> Cheers,
> -Quentin

> > -Hal
> 

> > ----- Original Message -----
> 

> > > From: "Quentin Colombet via llvm-dev" < llvm-dev at lists.llvm.org >
> > > To: "llvm-dev" < llvm-dev at lists.llvm.org >
> > > Sent: Wednesday, November 18, 2015 1:26:37 PM
> > > Subject: [llvm-dev] [GlobalISel] A Proposal for global instruction
> > > selection

> > > Hi,

> > > With this email, I would like to kick off the development of the
> > > next instruction selector that I described during the last LLVM Dev’
> > > Meeting.
> > > For the motivations, see Jakob’s proposal (
> > > http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-August/064727.html
> > > ) and for the proposal, see the slides (Keynote:
> > > http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2015-10/slides/Colombet-GlobalInstructionSelection.key?view=co
> > > or PDF:
> > > http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2015-10/slides/Colombet-GlobalInstructionSelection.pdf?revision=252430&view=co
> > > ) or the talk (
> > > https://www.youtube.com/watch?v=F6GGbYtae3g&list=PL_R5A0lGi1AA4Lv2bBFSwhgDaHvvpVU21&index=2
> > > ).

> > > TL;DR This is happening now, feedback invited!

> > > *** Context ***

> > > During the last LLVM Dev’ Meeting, I presented a proposal for
> > > the next instruction selector, GlobalISel. The proposal is basically
> > > summarized in “High Level Prototype Design” and “Roadmap”. (If you
> > > want further details, feel free to reach out to me.)

> > > The first step of the development plan is to prototype the new
> > > framework in open source. The idea is to start prototyping now(!)
> > > and have the discussion ongoing in parallel. The reason for this
> > > approach is to have code that can be used to inform those
> > > discussions, e.g., by collecting data and trying different design
> > > approaches. Regarding the discussion, I have listed a few points
> > > where your feedback would be particularly appreciated (see Feedback
> > > Invite).

> > > Also, as I mentioned in my talk, some issues are controversial
> > > but I expect them to be resolved during prototype development.
> > > Specifically, these concern aspects of legalization (should parts of
> > > it be done at the LLVM IR level or all at the MI level?) and code
> > > re-use for the instruction combiner. Please feel free to bring up your
> > > specific concerns as I move along with the development plan.

> > > I expect the design to evolve with our experimental findings and your
> > > feedback and contributions.
> > > Nonetheless, we expect to nail down some design decisions once and
> > > for all as the prototype progresses. I have highlighted them with
> > > the following pattern: [final].

> > > *** Feedback Invite ***

> > > If you follow and support this work, you need to be aware of three
> > > things, and I am eager to hear your feedback and thoughts about them:
> > > the overall goals of Global ISel, the goals of the prototype, and
> > > the impact of the prototype work on backend design.

> > > In the section “Goals”, I defined (repeated for people who saw the
> > > talk) the goals for the Global ISel design.
> > > - Do you see anything missing?
> > > - Do you see something that should not be there?

> > > The prototype will answer critical design questions (see “Design
> > > Questions the Prototype Addresses at the End of M1” for examples)
> > > before the actual design of Global ISel is finalized, but it cannot
> > > cover everything.
> > > Specifically, we will *not* look into improving TableGen or reusing
> > > InstCombine (see “Proposed Approach” for the rationale). Please let
> > > me know if you see any issue with that.

> > > There is also basic groundwork needed to prepare for Global ISel, and
> > > I need to extend the core MachineInstr-level APIs as explained
> > > during the talk. For this, I prepared sketches of patches to
> > > illustrate them and describe the details in the “Implications”
> > > section below. Please have a look at the patches to get a better
> > > idea of the expected impact.

> > > If there is anything else you want to discuss related to Global ISel,
> > > feel free to reach out to me. In particular, several people expressed
> > > interest during the LLVM Dev Meeting in contributing to the
> > > project. Let me know what your area of interest is, so that we can
> > > coordinate our efforts.
> > > Anyhow, please add [GlobalISel] in the subject line to help
> > > categorize the emails.

> > > *** Goals ***

> > > The high-level goals of the new instruction selector are:
> > > - Global instruction selector.
> > > - Fast instruction selector.
> > > - Shared code path for fast and good instruction selection.
> > > - IR that represents ISA concepts better.
> > > - More flexible instruction selector.
> > > - Easier to maintain/understand framework, in particular
> > > legalization.
> > > - Self contained machine representation, no back links to LLVM IR.
> > > - No change to LLVM IR.

> > > Note: The goals are common to all targets. In particular, we do not
> > > intend to work on target-specific features for the prototype.
> > > The bottom line is: please make sure those goals are compatible with
> > > what you want to achieve for your target, even if your requirement
> > > is not listed here.

> > > *** Proposed Approach ***

> > > In this section, I describe the approach I plan to pursue in the
> > > prototype and the roadmap to get there. The final design will flow
> > > out of it.

> > > For this prototype, we purposely exclude any work to improve or use
> > > TableGen or InstCombine [final]. We will keep in mind, however, that
> > > some of the C++ code we write will be table-generated at some point.
> > > The rationale is that we do not want to lay down a new
> > > TableGen/InstCombine infrastructure before being able to work on the
> > > ISel framework itself.

> > > The prototype vehicle will be AArch64. None of the changes for
> > > GlobalISel will negatively impact the existing ISel.

> > > ** High Level Prototype Design **

> > > As shown in the talk, the expected pipeline for the prototype is:
> > > LLVM IR -> IRTranslator -> Generic (G) MachineInstr -> Legalizer ->
> > > RegBankSelect -> Select -> MachineInstr

> > > Where:
> > > - Terms in bold are intermediate representations.
> > > - Generic MachineInstrs are machine instructions with a generic
> > > opcode, e.g., ADD, COPY.
> > > - IRTranslator: Translate LLVM IR to (G) MachineInstr.
> > > - Legalizer: Legalize illegal (G) MachineInstr to legal (G)
> > > MachineInstr.
> > > - RegBankSelect: Assign a register bank to each virtual register
> > > that so far only has a size.
> > > - Select: Translate the remaining (G) MachineInstr to MachineInstr.
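
As a rough illustration of the pipeline order described above, the following toy C++ model runs one instruction through the four stages. It is only a sketch under stated assumptions: all types and functions (IROp, Inst, translateIR, legalizeInsts, assignRegBanks, selectInsts), the "G_" and "TGT_" opcode prefixes, the rule that anything narrower than 32 bits is widened to 32 bits, and the single "GPR" bank are hypothetical and are not taken from the proposal or the patches.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: every type and function here is hypothetical.
struct IROp {
  std::string Name; // e.g. "ADD"
  unsigned Bits;    // width of the IR type
};

struct Inst {
  std::string Opcode; // generic ("G_ADD") or target ("TGT_ADD32") opcode
  unsigned Bits;      // size of the value in bits
  std::string Bank;   // register bank; empty until RegBankSelect runs
};

// IRTranslator: LLVM IR -> generic machine instructions.
std::vector<Inst> translateIR(const std::vector<IROp> &IR) {
  std::vector<Inst> Out;
  for (const IROp &Op : IR)
    Out.push_back({"G_" + Op.Name, Op.Bits, ""});
  return Out;
}

// Legalizer: rewrite illegal generic instructions into legal ones.
// Assumption: the toy target has no operations narrower than 32 bits.
void legalizeInsts(std::vector<Inst> &MIs) {
  for (Inst &MI : MIs)
    if (MI.Bits < 32)
      MI.Bits = 32;
}

// RegBankSelect: give every sized value a register bank.
void assignRegBanks(std::vector<Inst> &MIs) {
  for (Inst &MI : MIs)
    MI.Bank = "GPR";
}

// Select: replace the remaining generic opcodes with target opcodes.
void selectInsts(std::vector<Inst> &MIs) {
  for (Inst &MI : MIs)
    MI.Opcode = "TGT_" + MI.Opcode.substr(2) + std::to_string(MI.Bits);
}

// The stages run strictly in the order given in the proposal.
std::vector<Inst> runPipeline(const std::vector<IROp> &IR) {
  std::vector<Inst> MIs = translateIR(IR);
  legalizeInsts(MIs);
  assignRegBanks(MIs);
  selectInsts(MIs);
  return MIs;
}
```

The point of the sketch is only that each stage consumes and produces the same container of instructions, so the representation stays the same while the opcodes and register constraints are refined step by step.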

> > > ** Implications **

> > > As part of the bring-up of the prototype, we need to extend some of
> > > the core MachineInstr-level APIs:
> > > - Need to remember FastMath flags for each MachineInstr.
> > > - Need to know the type of each MachineInstr. We don’t want ADD8,
> > > ADD16, etc.
> > > - Extend the MachineRegisterInfo to support size as well as register
> > > classes for virtual registers.
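
One possible shape for these three extensions, as a minimal C++ sketch. It is not the content of the attached patches: GenericInst, VRegInfo, and VRegTable are hypothetical names, the type is reduced to a bit width, and the register class is modeled as a plain string that stays empty until something constrains the register.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are not the actual LLVM classes.

// A virtual register starts with only a size and may later be
// constrained to a register class.
struct VRegInfo {
  unsigned SizeInBits;
  std::string RegClass; // empty means "no class assigned yet"
};

class VRegTable {
  std::vector<VRegInfo> Regs;

public:
  // Create a register that is described by its size alone.
  unsigned createGeneric(unsigned SizeInBits) {
    Regs.push_back({SizeInBits, ""});
    return static_cast<unsigned>(Regs.size() - 1);
  }

  // Narrow a sized register down to a concrete register class.
  void constrain(unsigned Reg, std::string RegClass) {
    Regs.at(Reg).RegClass = std::move(RegClass);
  }

  const VRegInfo &get(unsigned Reg) const { return Regs.at(Reg); }
};

// One generic opcode plus a type, instead of ADD8, ADD16, etc.
struct GenericInst {
  std::string Opcode;             // e.g. "ADD"
  unsigned TypeBits;              // type carried by the instruction
  bool FastMath;                  // fast-math flags, reduced to one bit here
  std::vector<unsigned> Operands; // virtual register numbers
};
```

With this layout a single "ADD" opcode covers every width, and the same virtual register can be queried for its size before register bank or class assignment and for its class afterwards.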

> > > I have sketched the changes in the attached patches to help picture
> > > how the changes would impact the existing APIs.

> > > Note: I do not intend to commit those changes as they are. They will
> > > go through the usual review process in due time.

> > > The patches contain “// ***”-like comments that give a rough
> > > explanation of why those changes are needed w.r.t. the goals.
> > > The order of the patches could be modified since the dependencies
> > > between them are not sequential. Anyhow, here are the patches:
> > > 1. Introduce (some of) the generic opcodes.
> > > 2. Make MachineFunction more independent of LLVM IR to eventually be
> > > able to delete the LLVM IR instance from memory.
> > > 3. Extend MachineInstr to represent additional information attached
> > > to generic opcodes.
> > > 4. Teach MachineRegisterInfo about size for virtual registers.
> > > 5. Introduce a helper class to build MachineInstr related objects.
> > > 6. Add new target hooks to lower the ABI directly to MachineInstr.
> > > 7. Introduce the IRTranslator pass.

> > > ** Roadmap for the Prototype **

> > > We plan to split the prototype into three main milestones:
> > > 1. Translation: LLVM IR to (G) MachineInstr translation.
> > > 2. Basic selector: Legal LLVM IR to target specific MachineInstr.
> > > 3. Simple legalization: Support scalar type legalization and some
> > > vector instructions.

> > > Notes:
> > > - For #1, we will not support any fancy instructions like landing pad
> > > or switch.
> > > - Each milestone should take about 3-4 months.
> > > - At the end of #2, we would have a FastISel-like selector.

> > > Each milestone will be detailed right before starting it. The
> > > rationale is that we want to accommodate what we discovered with the
> > > prototype for the next milestone. In other words, in this email, I
> > > only describe the first milestone in detail and I will give more
> > > details on the next milestone shortly before we start it, and so on.
> > > For your information, here is the remainder of the intended roadmap
> > > for the full project:
> > > 4. Productization: Clean up implementation, stabilize the APIs.
> > > 5. Complex legalization: Extend legalization support to everything
> > > missing.
> > > 6. Completeness: Fill the blanks, e.g., landing pad.
> > > 7. Clean-up and performance: Add the necessary bits to be at parity
> > > with or beat SelectionDAG generated code.
> > > 8. Transition: Document how to switch, provide tools to help.

> > > ** Milestone 1 **

> > > The first phase is focused on the IRTranslator pass.

> > > The IRTranslator is responsible for translating the LLVM IR into
> > > Generic MachineInstr. The IRTranslator pass uses some target hooks
> > > to perform the ABI lowering. We can either define a new API for
> > > them, e.g., ABILoweringInfo, or extend the existing TargetLowering.
> > > Moreover, the prototype will focus on simple instructions, i.e., we
> > > will not support switch or landing pad for this iteration.

> > > At the end of M1, the prototype will not be able to produce code,
> > > since we would only have the beginning of the Global ISel pipeline.
> > > Instead, we will test the IRTranslator on the generic output that is
> > > produced from the tested IR.

> > > * Design Decisions *

> > > - The IRTranslator is a final class. Its purpose is to move away from
> > > LLVM IR to MachineInstr world [final].
> > > - Lower the ABI as part of the translation process [final].

> > > * Design Questions the Prototype Addresses at the End of M1 *

> > > - Handling of aggregate types during the translation.
> > > - Lowering of switches.
> > > - What about Module pass for Machine pass?
> > > - Introduce new APIs to have a clearer separation between:
> > >   - Legalization (setOperationAction, etc.)
> > >   - Cost/Combine related (isXXXFree, etc.)
> > >   - Lowering related (LowerFormal, etc.)
> > > - What is the contract with the backends? Is it still “should be able
> > > to select any valid LLVM IR”?

> > > Thanks,

> > > -Quentin

> > > --

> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 