[llvm-dev] Asking for advice: how to best place the CoroSplit	pass
    Arthur Eubanks via llvm-dev 
    llvm-dev at lists.llvm.org
       
    Thu Jan 28 14:15:39 PST 2021
    
    
  
SCCs containing a coroutine function and a non-coroutine function should be
fairly rare right? That means you have mutually recursive functions, which
shouldn't happen a lot in typical code. Are you seeing this come up a lot?
I'm not familiar with how the legacy CGSCC pass manager schedules passes on
newly outlined SCCs, but I did notice that for the new PM, CoroSplit
doesn't enqueue any potentially split SCCs, meaning that split functions
don't have the pipeline run on them (I think... I didn't actually run and
check). I think ideally, the pipeline would be
  inliner -> function simplification -> corosplit (without skipping once)
and in corosplit you'd enqueue newly created SCCs, and also the current
SCC. The CGSCC order is so all child SCCs of the current SCC are fully
optimized, and that may not be the case if we don't rerun the pipeline on
the original SCC after running it on the newly created one. So I think
running the pipeline twice is necessary if you really want to follow the
CGSCC vision. Running the inliner and other optimizations on the original
coroutine function could provide new results now that the function itself
has been split. This would also allow the removal of the postSplitCleanup()
hack.
If you think that running the CGSCC pipeline after the split wouldn't
really further improve the coroutine function, an alternative is to use the
same pipeline above but not enqueue the newly created SCCs and the current
one, and do some ad-hoc simplification on them like postSplitCleanup(), as
opposed to running the whole CGSCC pipeline. This is fairly similar to the
current state, just changing the CoroSplit position in the pipeline.
On Thu, Jan 28, 2021 at 11:17 AM Xun Li <lxfind at gmail.com> wrote:
> Hi,
>
> The current way of scheduling CoroSplit pass is not ideal, and in some
> cases can lead to significant slowdown of the compiler.
> I would like to ask for advice/feedback on what might be the best way
> to structure the passes so that CoroSplit pass is put in the right
> place. First of all, let me explain the current situation:
>
> CoroSplit pass is a pass that splits a coroutine function into
> multiple functions, in order to implement the semantics of coroutines.
> Right now, CoroSplit pass is added in the InlinerPipeline, right after
> the Inliner pass (PassBuilder::buildInlinerPipeline). So it looks like
> this:
> - ModuleInlinerWrapperPass
>     - A few global analysis
>     - CGSCCPipeline
>         - Inliner
>         - CoroSplitPass
>         - FunctionSimplificationPipeline
>
> It's important to note that when the first time a coroutine function
> goes through CoroSplitPass, it will not be split. Instead, it will
> just add its belonging SCC back to the pipeline and hence repeat the
> entire CGSCC pipeline. The goal for doing so is to allow coroutines to
> be optimized twice, first time before splitting, so that it can have
> smaller frame size, and second time after splitting, so that the
> splitted functions can get cleaned up.
> Furthermore, CoroSplitPass needs to be in CGSCC pipeline in order to
> enable coroutine elision (an important coroutine optimization to
> reduce dynamic memory allocation), that is, we want to walk through
> SCCs postorder so that functions can inline fully processed coroutines
> and does CoroElision.
> A side-effect of this structure, however, is that all the
> non-coroutines that are in the same SCC of any coroutine will run
> through Inliner twice. This can lead to unexpected slowdown if an
> aggressive inlining threshold is used and generates unexpectedly large
> functions. There is also not much benefit for coroutines to run
> through Inliner pass twice either.
>
> So the ideal goal looks like this:
> For coroutine functions, run through:
>   Inliner -> function simplification passes -> CoroSplit -> function
> simplification passes
> For non-coroutine functions, just run through as usual:
>   Inliner -> function simplification passes.
>
> The challenge is of course when they are in the same SCC it's hard to
> separate them like this. Any thoughts on how we might be able to
> achieve that?
>
> --
> Xun
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210128/6155056a/attachment.html>
    
    
More information about the llvm-dev
mailing list