[LLVMdev] Scheduler Roadmap

Fri May 11 11:28:25 PDT 2012

Dave, 

  Thank you for your interest. Please see my replies below. Sorry that my
terminology is not as crisp as Andy's, but I think you can see what I mean.

Sergei

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum.

> -----Original Message-----
> From: dag at cray.com [mailto:dag at cray.com]
> Sent: Friday, May 11, 2012 12:14 PM
> To: Sergei Larin
> Cc: 'Andrew Trick'; 'Hal Finkel'; wrf at cray.com; 'LLVM Developers
> Mailing List'
> Subject: Re: [LLVMdev] Scheduler Roadmap
> 
> Sergei Larin <slarin at codeaurora.org> writes:
> 
> >   - We do need to have a way to assign bundles much earlier than we
> > do now.
> 
> Yeah, I can imagine why this would be useful.
> 
> > And it needs to be intertwined with scheduling (Bundler currently
> > reuses a good chunk of scheduler infrastructure).
> 
> Just to clarify, is the need due to the current bundling implementation
> of reusing scheduler infrastructure or is there a more fundamental
> reason the two should be tied together?  I can imagine some advantages
> of fusing the two but I'm no VLIW expert.

[Larin, Sergei] A little bit of both. The current bundler uses the DAG
dependency builder to facilitate its analysis; for that it actually
instantiates a full MI scheduler... without scheduling.  Ideally the
scheduler itself should be able to produce bundled code (since it has the
best picture of machine resources and the instruction stream), but a
standalone bundler might still be needed to re-bundle incrementally (which
it does not do right now). In short - I see the bundler as a utility, not as
a pass.

> 
> > It is also obvious that it will have adverse effect on all the
> > downstream passes.
> 
> How so?  Isn't the bundle representation supposed to be fairly
> transparent to passes that don't care about it?

[Larin, Sergei] Kind of. Once bundles are finalized, bundle headers become
new "super instructions", and if a pass does not need to look at individual
(MI) instructions, there will not be any difference for it. But if a pass
needs to deal with individual MIs, things get interesting. For one, we lack
an API for moving/adding/removing individual MIs to/from finalized bundles.
We also lack an API to move MIs between BBs in the presence of bundles. Live
Intervals obviously do not work with bundles...
  Two, semantics (dependencies) within a bundle are parallel (think { r0 =
r1; r1 = r0 } in serial vs. parallel semantics), and if a pass needs to
"understand" that, it will need to be "taught" how to do it. This is where
incremental rebundling might come in handy. Fortunately we currently do
bundling fairly late, so it is not an issue yet.
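
To make the { r0 = r1; r1 = r0 } case concrete, here is a minimal sketch in
Python. It is a toy model, not LLVM code: registers are dictionary entries, a
bundle is a list of (dest, src) copies, and the names run_serial and
run_parallel are made up for illustration.

```python
# Toy model (an assumption for illustration): a bundle is a list of
# (dest, src) register copies and the register file is a plain dict.

def run_serial(regs, bundle):
    """Execute the copies one after another; later copies see earlier writes."""
    regs = dict(regs)
    for dest, src in bundle:
        regs[dest] = regs[src]
    return regs


def run_parallel(regs, bundle):
    """Execute the copies as one bundle: read every source, then write."""
    values = [regs[src] for _, src in bundle]
    out = dict(regs)
    for (dest, _), value in zip(bundle, values):
        out[dest] = value
    return out


initial = {"r0": 1, "r1": 2}
bundle = [("r0", "r1"), ("r1", "r0")]     # { r0 = r1; r1 = r0 }

serial = run_serial(initial, bundle)      # {'r0': 2, 'r1': 2}
parallel = run_parallel(initial, bundle)  # {'r0': 2, 'r1': 1}
```

Under serial semantics the second copy reads the value just written, so both
registers end up equal; under parallel semantics the same two copies swap the
registers. A pass that reorders or splits instructions inside a finalized
bundle has to know which of the two models applies.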

> 
> > It is further insulting due to the fact that bundling is trivial to do
> > during scheduling, but it goes hard against the original assumptions
> > made elsewhere.
> 
> Can you explain more about this?

[Larin, Sergei] The core of the bundler is the DFA state machine, which also
_must_ be a part of any VLIW scheduler, so in my Hexagon VLIW "custom"
scheduler (I actually have two - SDNode and MI based) I virtually __create__
bundles, but discard them at the end of the pass, only to recreate them
again later in the standalone bundler. The second attempt is riddled with
additional false dependencies (anti, output etc.) introduced by register
allocation, so bundling quality is affected.
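
The effect of those false dependencies can be sketched with a toy in-order
bundler in Python. Everything here is an assumption made for illustration: a
plain slot counter (width=2) stands in for the resource DFA, any register
conflict (true, anti or output) is treated as blocking, and the instruction
names and registers are invented. It is not how the Hexagon bundler is
implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    defs: frozenset
    uses: frozenset


def instr(name, defs=(), uses=()):
    return Instr(name, frozenset(defs), frozenset(uses))


def conflicts(earlier, later):
    """True if `later` may not share a bundle with `earlier` in this toy model."""
    raw = earlier.defs & later.uses   # true dependency
    war = earlier.uses & later.defs   # anti dependency
    waw = earlier.defs & later.defs   # output dependency
    return bool(raw or war or waw)


def bundle_in_order(instrs, width=2):
    """Greedily pack instructions, in program order, into bundles."""
    bundles, current = [], []
    for ins in instrs:
        fits = len(current) < width   # stand-in for a resource DFA query
        independent = not any(conflicts(prev, ins) for prev in current)
        if current and not (fits and independent):
            bundles.append(current)   # close the bundle and start a new one
            current = []
        current.append(ins)
    if current:
        bundles.append(current)
    return bundles


# Same code before and after a (hypothetical) register allocation that
# maps both virtual registers v1 and v2 onto the physical register r0.
before_ra = [
    instr("add1", defs=["v1"], uses=["a", "b"]),
    instr("st1", uses=["v1"]),
    instr("add2", defs=["v2"], uses=["c", "d"]),
    instr("st2", uses=["v2"]),
]
after_ra = [
    instr("add1", defs=["r0"], uses=["a", "b"]),
    instr("st1", uses=["r0"]),
    instr("add2", defs=["r0"], uses=["c", "d"]),
    instr("st2", uses=["r0"]),
]

names_before = [[i.name for i in b] for b in bundle_in_order(before_ra)]
names_after = [[i.name for i in b] for b in bundle_in_order(after_ra)]
```

With virtual registers, st1 and add2 share a bundle and the sequence takes
three bundles. Once both values live in r0, add2 has an anti dependency on
st1, and the same code needs four bundles.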

> 
> > Re-bundling is also a distinct task that might need to be addressed in
> > this context.

[Larin, Sergei] I think the above explanation covers this.

> 
> Rebundling is certainly useful.  I'm not sure what you mean by "in this
> context."
> 
> >   - We do need to have at least a distant plan for global scheduling.
> 
> Yes, definitely!
> 
> > BB scope is nice and manageable, but I can easily get several percent
> > of "missed" performance by simply implementing a pull-up pass at the
> > very end of code generation... meaning multiple opportunities were
> > lost earlier.  Current way to express and maintain state needed for
> > global scheduling remains to be improved.
> 
> We could also use a global scheduler in the medium-term though it's not
> absolutely critical.  A nice clean infrastructure to support global
> scheduling with multiple heuristics, etc. would be very valuable.  It
> would also be a lot of work.  :)

[Larin, Sergei] I will need to start doing this very soon to meet my
performance goals, but extensive discussion and generic support might take
some time to crystallize... but it is indeed critically needed.

> 
> >   - SW pipelining and scheduler interaction with it. When (not if:) we
> > will have a robust SW pipeliner it will likely take place before the
> > first scheduling pass, and we do not want to "undo" some decision made
> > there.
> 
> Right.  We (the LLVM community) will need ways to mark regions "don't
> schedule."
> 
>                               -Dave