[llvm-dev] Reviews needed for LazyCallGraph patches (and coroutines)
Brian Gesiak via llvm-dev
llvm-dev at lists.llvm.org
Tue Jan 21 15:17:24 PST 2020
Hey all,
Thanks to reviews by Wenlei He and JunMa (junparser), I have a stack
of patches that implement the C++20 coroutine transformations in the
new LLVM pass manager. The stack builds and optimizes coroutines in
major open source coroutine libraries like
https://github.com/lewissbaker/cppcoro. I also use the patches in my
employer's downstream fork of LLVM/Clang, and that fork successfully
compiles large C++ codebases that make use of C++17 and coroutines.
The coroutine patches exist as a stack of 6 Phabricator revisions [1].
Crucially, they make use of 3 patches related to LazyCallGraph, the
call graph representation used by the new pass manager:
1. https://reviews.llvm.org/D72025 (by Johannes Doerfert)
2. https://reviews.llvm.org/D70927 (ditto)
3. https://reviews.llvm.org/D72226 (this one by me)
I only have moderate experience using the new pass manager, so I'd
greatly appreciate a review of these patches. My coroutine patches
rely on the interfaces they add to the LazyCallGraph abstraction.
Could someone chime in with whether those interfaces could be merged
to trunk?
- Brian Gesiak
[1] Sequentially, https://reviews.llvm.org/D71898 through
https://reviews.llvm.org/D71903. Reviews are welcome here as well!
On Thu, Dec 26, 2019 at 9:29 AM Brian Gesiak <modocache at gmail.com> wrote:
>
> Hello all,
>
> It's been a month since my previous email on the topic, and since then
> I've done some initial work on porting the coroutines passes to the
> new pass manager. In total there are 6 patches -- that's a lot to
> review, so allow me to introduce the changes being made in each of
> them.
>
> # What's finished
>
> In these first 6 patches, I focused on lowering coroutine intrinsics
> correctly. With the patches applied, Clang is able to successfully use
> the new pass manager to build and test a major open source C++20
> coroutines library, https://github.com/lewissbaker/cppcoro.
>
> 1. https://reviews.llvm.org/D71898
> New pass manager implementation of the coro-early function pass, with
> LLVM regression tests for this pass updated to test both the new and
> legacy implementations.
>
> 2. https://reviews.llvm.org/D71899
> Same thing, but for the coro-split CGSCC pass. This patch adds support
> to the new pass manager for only the C++20 switch-based coroutines ABI.
> I'd like to implement support for the Swift returned-continuation ABI
> in a future patch.
>
> 3. https://reviews.llvm.org/D71900
> 4. https://reviews.llvm.org/D71901
> Same thing, but for the coro-elide and coro-cleanup function passes.
>
> 5. https://reviews.llvm.org/D71902
> The first 4 patches allow users to run coroutine passes by invoking,
> for example, 'opt -passes=coro-early'. However, most of LLVM's tests
> for coroutines use 'opt -enable-coroutines', which adds all 4
> coroutines passes, in the correct order, to the pass pipeline. This
> 5th patch adds a similar feature but in a way that can be used by the
> new pass manager: a pipeline parser that understands 'opt
> -passes=coroutines'.
>
> 6. https://reviews.llvm.org/D71903
> Finally, this patch modifies Clang to run the new coroutine passes
> when the experimental pass manager is being used with coroutines
> enabled (either via '-fcoroutines-ts' or '-std=c++2a'). With all 6
> patches applied, the cppcoro library builds and tests successfully
> with a Clang built with 'ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=On'.
>
> # What's yet to be done: Swift "returned-continuation" coroutines &
> the "devirtualization trigger"
>
> Two things are missing from these initial 6 patches: the first is, as
> I mentioned above, support for returned-continuation coroutines, which
> are used by Swift. I think this will be fairly straightforward for me
> (or others, if interested) to add in an upcoming patch.
>
> The second missing feature has to do with how CGSCC passes are re-run
> by the pass manager infrastructure.
>
> The legacy coro-split and coro-elide passes work in tandem: the
> coro-split pass first introduces an indirect call to a dummy
> "devirtualization trigger" function, and then the coro-elide pass
> devirtualizes it. The legacy CGSCC pass manager checks for
> devirtualizations after each pass and, if any occur, it re-runs the
> CGSCC pass pipeline. In other words: coro-split is run, a check is
> made for devirtualization, then coro-elide is run, and another check
> is made.
>
> The new pass manager allows for a series of CGSCC passes to be wrapped
> in a DevirtSCCRepeatedPass, but this only checks for devirtualizations
> after all passes are run. In the case of coro-split and coro-elide,
> the indirect function call is added and then devirtualized within a
> single pass of the repeater, so DevirtSCCRepeatedPass never sees the
> devirtualization and thus doesn't perform a repeat iteration. In other
> words: a check is made, coro-split is run and it adds an indirect
> call, coro-elide is run and it devirtualizes that call, and then
> another check is made. From the repeater pass's point of view, nothing
> has changed.
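
To make the difference between the two checking strategies concrete,
here is a toy model in plain C++. It is only a sketch: 'ToySCC',
'toyCoroSplit', 'toyCoroElide', and the two driver functions are
hypothetical names invented for this illustration, not LLVM APIs, and
the "devirtualization check" is reduced to comparing a count of
indirect calls before and after.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy stand-in for an SCC: we only track how many indirect and direct
// call sites it contains. This is NOT how LLVM represents an SCC.
struct ToySCC {
  int IndirectCalls = 0;
  int DirectCalls = 0;
};

using ToyPass = std::function<void(ToySCC &)>;

// Hypothetical model of coro-split: introduces one indirect call.
void toyCoroSplit(ToySCC &C) { ++C.IndirectCalls; }

// Hypothetical model of coro-elide: turns one indirect call into a
// direct call.
void toyCoroElide(ToySCC &C) {
  assert(C.IndirectCalls > 0 && "nothing to devirtualize");
  --C.IndirectCalls;
  ++C.DirectCalls;
}

// Simplified check: a devirtualization is "seen" when the number of
// indirect calls dropped between the two snapshots.
bool sawDevirtualization(const ToySCC &Before, const ToySCC &After) {
  return After.IndirectCalls < Before.IndirectCalls;
}

// Legacy-style strategy: take a snapshot and check after every pass.
int detectionsCheckingAfterEachPass(const std::vector<ToyPass> &Pipeline) {
  ToySCC C;
  int Detections = 0;
  for (const ToyPass &P : Pipeline) {
    ToySCC Before = C;
    P(C);
    if (sawDevirtualization(Before, C))
      ++Detections;
  }
  return Detections;
}

// Repeater-style strategy: one snapshot before the whole pipeline and
// one check after it.
bool detectionCheckingAfterWholePipeline(const std::vector<ToyPass> &Pipeline) {
  ToySCC C;
  ToySCC Before = C;
  for (const ToyPass &P : Pipeline)
    P(C);
  return sawDevirtualization(Before, C);
}
```

Running both drivers over the pipeline {toyCoroSplit, toyCoroElide}, the
per-pass check sees the drop in indirect calls after toyCoroElide, while
the whole-pipeline check compares two snapshots with the same number of
indirect calls and reports nothing.
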
>
> This is something I'd like to tweak in future patches (my thinking is
> to add a member to CGSCCUpdateResult to allow passes to manually
> inform the pass manager about devirtualizations, but I'm very open to
> alternative ideas). But for now, I simply have Clang manually schedule
> two iterations of the coro-split pass, rather than have it rely on the
> repeater detecting a devirtualization and automatically scheduling
> another coro-split iteration. As a result, the new pass manager
> implementation of coroutines produces correct program behavior, but
> misses some of the optimization patterns that are tested for in the
> regression tests in 'llvm/test/Transforms/Coroutines'.
>
> I'd greatly appreciate code review on my patches! Or, please reply
> here with any questions/comments.
>
> - Brian Gesiak
>
> On Mon, Nov 25, 2019 at 8:39 PM Brian Gesiak <modocache at gmail.com> wrote:
> >
> > Hi all!
> >
> > I'm working on porting the LLVM passes for C++20 coroutines over to
> > the new pass manager infrastructure. Of the 4 passes, 3 are function
> > passes, and so porting them is straightforward and easy (my thanks to
> > those involved -- the design is great!). However, I'm struggling with
> > 'coro-split', which is an SCC pass. Specifically, I'd like advice on
> > how to appropriately update the new pass manager's preferred
> > representation of the SCC: 'llvm::LazyCallGraph'.
> >
> > Before I ask my specific questions about 'LazyCallGraph', it may help
> > to explain my understanding of the 'coro-split' pass and how it
> > modifies the call graph:
> >
> > The coro-split pass "clones" coroutine functions. For example, for a
> > coroutine function 'foo', it creates declarations for 3 new functions
> > ('foo.resume', 'foo.destroy', and 'foo.cleanup') using the static
> > member function 'llvm::Function::Create'. It then uses
> > 'llvm::CloneFunctionInto' to copy the 'foo' function's attributes,
> > arguments, basic blocks, and instructions, into each of the 3 new
> > functions. Finally, the coro-split pass replaces the entry basic
> > blocks of the function to read a value from a per-coroutine store of
> > state called the coroutine frame, and uses that value to
> > determine which part of the cloned coroutine function should be
> > executed upon "resumption" of the coroutine.
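
As a rough illustration of the "read a value from the frame, then
branch on it" idea described above, here is a hand-written toy in
plain C++. It is a sketch under my own assumptions: 'ToyFrame',
'ResumeIndex', and 'toyResume' are hypothetical names, and the real
frame layout and the code emitted by coro-split differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a coroutine frame: it remembers which
// suspend point the coroutine stopped at, plus a log of what has run.
struct ToyFrame {
  int ResumeIndex = 0;
  std::string Log;
};

// Plays the role of a resume function such as 'foo.resume': its entry
// reads the index stored in the frame and jumps to the matching part
// of the original function body.
void toyResume(ToyFrame &Frame) {
  assert(Frame.ResumeIndex >= 0 && "corrupt toy frame");
  switch (Frame.ResumeIndex) {
  case 0:
    Frame.Log += "A";      // code between the start and suspend point 1
    Frame.ResumeIndex = 1; // next resumption continues after point 1
    return;
  case 1:
    Frame.Log += "B";      // code between suspend points 1 and 2
    Frame.ResumeIndex = 2;
    return;
  default:
    Frame.Log += "done";   // past the final suspend point
    return;
  }
}
```

Calling 'toyResume' twice on a fresh 'ToyFrame' runs part "A" and then
part "B", because each call picks up where the stored index says the
previous one left off.
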
> >
> > Of course, once the coro-split SCC pass has done all this, it must
> > update LLVM's representation of the call graph. It does so using ~40
> > lines of code that you may read here:
> > https://github.com/llvm/llvm-project/blob/890c6ef1fb/llvm/lib/Transforms/Coroutines/Coroutines.cpp#L186-L224
> >
> > To explain my understanding of the code in the above link: the
> > coro-split pass completely re-initializes the 'CallGraphSCC' object,
> > using the member function
> > 'CallGraphSCC::initialize(ArrayRef<CallGraphNode*>)'. The array of
> > nodes it uses to re-initialize the SCC includes nodes for each of the
> > 3 new functions, which it adds to the call graph using
> > 'CallGraph::getOrInsertFunction'. For each of the new nodes, and for
> > the node representing the original function, the coro-split pass
> > iterates over each of the instructions in the function, and uses
> > 'CallGraphNode::addCalledFunction' to add edges to the call graph.
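
The shape of that update, as I understand it, can be sketched with a
toy call graph in plain C++. Everything here is hypothetical
('ToyFunction', 'ToyCallGraph', 'rebuildToyCallGraph'); it only mirrors
the "get or insert a node for each function, then scan its calls and
add an edge for each" structure, not the real 'CallGraph' classes.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a function: its name, plus the name of the
// callee of each direct call instruction it contains.
struct ToyFunction {
  std::string Name;
  std::vector<std::string> DirectCallees;
};

// Toy call graph: maps a function name to the set of functions it calls.
using ToyCallGraph = std::map<std::string, std::set<std::string>>;

// Rebuilds the edges from scratch for every function handed in, which
// here includes both the original function and its new clones.
ToyCallGraph rebuildToyCallGraph(const std::vector<ToyFunction> &Functions) {
  ToyCallGraph Graph;
  for (const ToyFunction &F : Functions) {
    assert(!F.Name.empty() && "toy functions must be named");
    // operator[] gets the existing node or inserts an empty one.
    std::set<std::string> &Edges = Graph[F.Name];
    for (const std::string &Callee : F.DirectCallees)
      Edges.insert(Callee); // one edge per distinct callee
  }
  return Graph;
}
```

Passing in 'foo' together with 'foo.resume', 'foo.destroy', and
'foo.cleanup' yields a graph with four nodes, each carrying the edges
found by scanning that function's calls.
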
> >
> > The difficulty I'm having here in porting coro-split to the new pass
> > manager is that its SCC passes use 'LazyCallGraph' and
> > 'LazyCallGraph::SCC'. These classes' documentation explains that they
> > are designed with the constraint that optimization passes shall not
> > remove or add edges in ways that invalidate a bottom-up traversal of
> > the SCC DAG. Unfortunately, I understand the coro-split pass to be
> > doing exactly those things.
> >
> > As a result, if I attempt to mimic the coro-split pass's logic by
> > inserting functions into the call graph using 'LazyCallGraph::get',
> > and then adding call edges with
> > 'LazyCallGraph::RefSCC::insertTrivialRefEdge' and
> > 'LazyCallGraph::RefSCC::switchInternalEdgeToCall', I'm met with an
> > assertion: llvm/lib/Analysis/CGSCCPassManager.cpp:463: [...]:
> > Assertion `E && "No function transformations should introduce *new* "
> > "call edges! Any new calls should be modeled as " "promoted existing
> > ref edges!"' failed.
> >
> > Does anyone have any suggestions on how I can better work within the
> > constraints of the 'LazyCallGraph'?
> >
> > In case it helps, you can see the code that currently hits this
> > assertion here:
> > https://github.com/modocache/llvm-project/commit/02c10528e9. Of
> > particular interest may be the functions 'buildLazyCallGraphNode' and
> > 'updateCallGraphAfterSplit'.
> >
> > One idea a colleague of mine suggested was to have an earlier
> > coroutine function pass insert declarations of 'foo.resume',
> > 'foo.destroy', and 'foo.cleanup', which we could then promote to call
> > edges. However, they also found this FIXME in CGSCCPassManager.cpp:
> > "We should really handle adding new calls. While it will make
> > downstream usage more complex, there is no fundamental limitation and
> > it will allow passes within the CGSCC to be a bit more flexible in
> > what transforms they can do. Until then, we verify that new calls
> > haven't been introduced."
> >
> > As a result, I'm now unsure whether I ought to modify coro-split's
> > implementation for the new pass manager, or modify 'CGSCCPassManager'
> > to allow for the insertion of new calls. Any and all advice would be
> > greatly appreciated!
> >
> > - Brian Gesiak