[LLVMdev] design question on inlining through statepoints and patchpoints

Philip Reames listmail at philipreames.com
Fri Jun 26 11:15:58 PDT 2015


Chandler,

Thanks for asking about the bigger picture.  It's definitely good to 
summarize that somewhere more visible than a patch review.

In my mind, there are two intersecting but mostly orthogonal reasons why 
we want to support the ability to inline through statepoints (and 
someday possibly patchpoints).  They are: a) supporting early safepoint 
insertion and b) implementation flexibility.  I'll go into each below.


Today, we really only support the insertion of safepoints (both polls 
and call safepoints) after inlining has been done.  This is the model 
that PlaceSafepoints and RewriteStatepointsForGC support in tree and 
that we've been using with some success.  I believe that's also the 
model that WebKit has used with patchpoints, though in a different 
form.  Their high-level optimization is done in an entirely different 
IR, whereas ours is done over LLVM IR.

There have been a few things that have come up which make me think 
it's valuable to support the insertion of at least call safepoints 
earlier in the optimizer.  First, it makes reasoning about escaped 
stack-allocated objects (which must be updated in place) more 
straightforward for the frontend.  Second, in talking to other frontend 
authors, we've found that most expect to use an early insertion model.  
Supporting both from a functional standpoint, with late insertion as an 
"optimization", seems to provide a gentler introduction path.  Third, it 
makes reasoning about deoptimization a lot cleaner.

I don't want to get into the weeds of deoptimization right now - it 
would utterly derail the conversation and we're not making any proposals 
for upstream in this area at the moment - but let me summarize the 
problem for context.  Essentially, we need to be able to track a chain 
of abstract frame states through inlining and serialize them into the 
stackmap via a statepoint.  Today, we handle tracking those abstract 
frames using an independent mechanism I'm going to skip over.  This has 
worked, but has frankly been a major pain point.  Having the ability to 
inline through statepoints opens up some possible designs in this space, 
but we're not proposing anything concrete at this moment.
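To give the problem some shape, here is a minimal toy sketch of what tracking and serializing a chain of abstract frames could look like.  Everything here (the names, the fields, the string encoding) is invented for illustration and is not the mechanism we actually use:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// One abstract frame: just enough state for a runtime to reconstruct an
// interpreter frame at a deoptimization point.  Purely illustrative.
struct FrameState {
  std::string Function;     // logical (source-level) function
  unsigned BytecodeIndex;   // abstract resume position within it
  const FrameState *Caller; // next frame outward, if this one was inlined
};

// When f is inlined into g, each abstract frame that used to bottom out in
// f's physical frame must instead chain to the abstract frame of the call
// site in g.
FrameState inlineFrame(FrameState InlineeFrame, const FrameState *CallSiteFrame) {
  InlineeFrame.Caller = CallSiteFrame;
  return InlineeFrame;
}

// Emitting the stackmap flattens the chain, outermost frame first, into
// whatever serialized form the runtime consumes on deoptimization.
std::vector<std::string> serialize(const FrameState &Leaf) {
  std::vector<std::string> Frames;
  for (const FrameState *F = &Leaf; F; F = F->Caller)
    Frames.push_back(F->Function + "@" + std::to_string(F->BytecodeIndex));
  std::reverse(Frames.begin(), Frames.end());
  return Frames;
}
```

The point of the sketch is only the shape of the problem: inlining rewrites the caller links, and the statepoint's stackmap entry must end up carrying the whole flattened chain.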


The more immediate benefit from adding support for inlining through 
statepoints is that it gives a lot more implementation flexibility. 
Today, we have a hard requirement that *all* inlining be done before 
statepoints are inserted.  This influences design choices in sometimes 
odd ways.

As one in-tree example, consider the insertion of safepoint polls done 
by PlaceSafepoints.  We currently need to force inline newly inserted 
calls rather than just inserting them and worrying about the 
profitability later.  In the future, we might want to support not 
inlining some of these poll sites.  (Consider a really cold loop - you 
might prefer the cost of the extra call over the code size increase for 
the test and slow path.)  Today, we'd *have* to make that choice within 
PlaceSafepoints.  With support for inlining through statepoints, we 
could defer that to a distinct pass and thus factor the code more 
cleanly.  In fact, we could even separate some of the existing 
optimizations for placement into a separate pass which removes provably 
redundant safepoints rather than combining that into the insertion logic 
itself.  I'm not necessarily saying we will, but the ability to inline 
through statepoints gives us the option of considering choices like 
these, which we otherwise can't.
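To make the cold-loop example concrete, here is a toy sketch of the kind of per-site decision a separate pass could make.  The enum, function, and threshold are all invented; a real pass would presumably consult profile or block-frequency information:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical lowering choices for one safepoint poll site.
enum class PollLowering {
  InlineTestAndSlowPath, // emit the flag test inline; call only on the slow path
  KeepOutOfLineCall      // cold enough that a plain call wins on code size
};

// Toy heuristic: a poll in a block that runs much less often than the
// function entry stays an out-of-line call.  With inlining through
// statepoints, this decision could live in its own pass rather than being
// forced at insertion time inside PlaceSafepoints.
PollLowering choosePollLowering(uint64_t PollBlockFreq, uint64_t EntryFreq) {
  // Invented threshold: "cold" means under 1/8th of the entry frequency.
  if (PollBlockFreq * 8 < EntryFreq)
    return PollLowering::KeepOutOfLineCall;
  return PollLowering::InlineTestAndSlowPath;
}
```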

Another (out-of-tree) example comes from our optimization pipeline.  We 
use a set of known functions to represent high-level concepts.  Today, 
we have a hard requirement that all of these known functions (well, all 
that can contain safepoints) be inlined before safepoint insertion.  
Part of our work at safepoint insertion is pruning deoptimization state 
which turns out not to be required.  This often opens up optimization 
possibilities that are only visible while we can still identify 
high-level semantic constructs.  Being able to swap the safepoint 
insertion and lowering phases (the latter of which does inlining 
internally) would be very helpful from an optimization perspective.


Philip

p.s. Just to be clear, the flexibility parts only require the 
InlineFunction changes, not full support from the inliner.  Full 
inliner support is only needed for early call safepoint insertion.

p.p.s. Longer term, it really feels to me that statepoints/patchpoints 
are moving in the direction of being an arbitrary wrapped call.  There's 
some underlying call (or invoke) with additional layers of semantics 
wrapped around that.  We seem to be settling on a relatively fixed set 
of composable semantics, so long term these may become IR extensions.  No 
concrete proposals here yet though.
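As a toy model of that "wrapped call" view (a sketch only, not a proposed API), one can picture the semantic call plus optional, composable layers of metadata, from which the underlying call can always be recovered:

```cpp
#include <cassert>
#include <string>
#include <vector>

// A call with optional semantic layers wrapped around it.  The field names
// and the string-based representation are invented for illustration.
struct WrappedCall {
  std::string Callee;                    // the underlying semantic call target
  std::vector<std::string> CallArgs;     // arguments to that call
  std::vector<std::string> GCLiveValues; // layer: GC pointers needing relocation
  std::vector<std::string> DeoptState;   // layer: abstract state for deoptimization
};

// An inliner (or an InlineSite-style abstraction) cares only about the
// semantic call; the layers are what must be transferred around the
// inlined body.
const std::string &semanticCallee(const WrappedCall &WC) { return WC.Callee; }

// A conservative first step could refuse to inline when any layer carries
// extra live state, mirroring the "no extra live values" restriction
// mentioned in the thread below.
bool hasExtraLiveState(const WrappedCall &WC) {
  return !WC.GCLiveValues.empty() || !WC.DeoptState.empty();
}
```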



On 06/26/2015 12:59 AM, Chandler Carruth wrote:
> I'm digging into the patches, and it has at least raised one 
> high-level question for me that I wanted to ask here so that the 
> response would be reasonably widely seen.
>
> Essentially, what are the particular motivations for inlining through 
> statepoints? How important are they for the actual users we're 
> building of the GC infrastructure? (Philip's email starts to give some 
> hints here, but I'm not really clear exactly how important this is... 
> for example, I don't understand what the deoptimization thing is all 
> about. This is likely just my ignorance of common parlance and 
> problems in GC land, but I think it would be useful to break it down 
> so that we have a reference for what all is being discussed.)
>
> I'm not really doubting the importance mind you, I'd just like to 
> *understand* it better (and confirm my suspicion that this is really a 
> "must have" feature rather than a "nice to have" feature).
>
> On Tue, Jun 23, 2015 at 2:24 PM Sanjoy Das 
> <sanjoy at playingwithpointers.com> wrote:
>
>     patches here (reverse chronological order):
>
>     http://reviews.llvm.org/D10633
>     http://reviews.llvm.org/D10632
>     http://reviews.llvm.org/D10631
>
>
>     On Wed, Jun 17, 2015 at 4:33 PM, Philip Reames
>     <listmail at philipreames.com> wrote:
>     > The long term plan is a) evolving, and b) dependent on the
>     specific use
>     > case.  :)
>     >
>     > It would definitely be nice if we could support both early and late
>     > safepoint insertion.  I see no reason that LLVM as a project
>     should pick one
>     > or the other since the infrastructure required is largely
>     overlapping.
>     > (Obviously, I'm going to be mostly working on the parts that I
>     need, but
>     > others are always welcome to extend in other directions.)
>     >
>     > One of the challenges we've run into is that supporting
>     deoptimization
>     > points (which in practice are safepoints) require a lot of the same
>     > infrastructure as early safepoint insertion.  It's likely that
>     we'll end
>     > with a scheme which inserts safepoint polls quite early (but
>     with restricted
>     > semantics and optimization impact) and then converts them to
>     explicit GC
>     > safepoints (with full invalidation semantics) quite late.  We
>     already have
>     > this distinction in tree in the form of PlaceSafepoints and
>     > RewriteStatepointsForGC.  I suspect we'll move further in this
>     direction.
>     >
>     > I suspect that for languages without deoptimization, you'll want
>     to insert
>     > safepoint polls quite late.  Whether you do the same for
>     safepoints-at-calls
>     > is debatable.  I used to think that you should do that quite
>     late, but I'm
>     > no longer sure that's always the right answer.
>     >
>     > Philip
>     >
>     >
>     >
>     > On 06/17/2015 04:13 PM, Swaroop Sridhar wrote:
>     >>
>     >> With respect to phase ordering, is the long term plan to run the
>     >> statepoint placement/transformation phases late (after all
>     optimizations)?
>     >> If so, will we need to support inlining post statepoint
>     transformation?
>     >>
>     >> Thanks,
>     >> Swaroop.
>     >>
>     >> -----Original Message-----
>     >> From: Sanjoy Das [mailto:sanjoy at playingwithpointers.com]
>     >> Sent: Tuesday, June 16, 2015 7:20 PM
>     >> To: LLVM Developers Mailing List; Andrew Trick; Swaroop
>     Sridhar; Chandler
>     >> Carruth; Nick Lewycky
>     >> Subject: design question on inlining through statepoints and
>     patchpoints
>     >>
>     >> I've been looking at inlining invokes / calls done through
>     statepoints and
>     >> I want to have a design discussion before I sink too much time into
>     >> something I'll have to throw away.  I'm not actively working on
>     adding
>     >> inlining support to patchpoints, but I suspect these issues are
>     applicable
>     >> towards teaching LLVM to inline through patchpoints as well.
>     >>
>     >>
>     >> There are two distinct problems to solve before LLVM can inline
>     through
>     >> statepoints:
>     >>
>     >> # Managing data flow for the extra metadata args.
>     >>
>     >> LLVM needs some logic to "transfer" the extra live values
>     attached to a
>     >> statepoint/patchpoint into the body of the inlinee somehow. 
>     How this is
>     >> handled depends on the semantics of the live values (something
>     the frontend
>     >> knows).  There needs to be a clean way for the frontend to
>     communicate this
>     >> information to LLVM, or we need to devise a convention that is
>     sensible for
>     >> the kinds of frontends we wish to support.  Initially I plan to
>     sidestep
>     >> this problem by only inlining through statepoints that have *no*
>     extra live
>     >> values / gc pointers.
>     >>
>     >>
>     >> # Managing the call graph
>     >>
>     >> This is the problem we need to solve first. Currently LLVM
>     views a
>     >> statepoint or patchpoint call as
>     >>
>     >>    1. A call to an intrinsic.  This does not add an edge to the call
>     >>       graph (not even to the dedicated external node).
>     >>
>     >>    2. An escaping use of the callee.
>     >>
>     >> IIUC, (2) is (conservatively) imprecise and (1) is incorrect. 
>     (1) makes
>     >> LLVM believe that a function that calls @f via a statepoint
>     does not call @f
>     >> at all.  (2) makes LLVM believe that @f is visible externally,
>     even if it
>     >> has internal linkage.
>     >>
>     >> Given this starting point, I can think of three ways to model
>     statepoint's
>     >> (and patchpoint's) control flow semantics within a call
>     >> graph:
>     >>
>     >>    1. Model calls to statepoint, patchpoint and stackmap
>     intrinsics as
>     >>       calling the external node.  Teach the inliner pass to
>     >>       "devirtualize" calls through statepoints when possible,
>     except
>     >>       that the "devirtualization" is only a facade (i.e. we don't
>     >>       mutate the IR to change the statepoint to a direct
>     call).  We add
>     >>       some abstraction to the inlining utility functions to inline
>     >>       through something more general than a CallSite.
>     >>
>     >>    2. Introduce a new abstraction InlineSite (bikeshedding on
>     the name
>     >>       is welcome).  InlineSite sits on top of a CallSite and
>     knows how
>     >>       to extract the semantic call out of a statepoint or a
>     patchpoint
>     >>       (similar to the llvm::Statepoint class).  The inliner and the
>     >>       call graph analysis work on top of this InlineSite
>     abstraction
>     >>       instead of the CallSite abstraction.
>     >>
>     >>    3. Change all the places that matter (CallGraph,
>     CallGraphSCCPass
>     >>       etc.) from
>     >>
>     >>         if (CallSite CS = ...)
>     >>
>     >>       to
>     >>
>     >>         if (Statepoint SP = ...)
>     >>            ...
>     >>         else if (CallSite CS = ...)
>     >>
>     >>       or something equivalent to this.
>     >>
>     >> Personally, I'd prefer going with (1) if it is viable, and (2)
>     if not.
>     >>
>     >> What do you think?
>     >>
>     >> -- Sanjoy
>     >>
>     >> _______________________________________________
>     >> LLVM Developers mailing list
>     >> LLVMdev at cs.uiuc.edu
>     http://llvm.cs.uiuc.edu
>     >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>     >
>     >
>     _______________________________________________
>     LLVM Developers mailing list
>     LLVMdev at cs.uiuc.edu
>     http://llvm.cs.uiuc.edu
>     http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
