<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Mar 2, 2019 at 12:58 AM Fedor Sergeev <<a href="mailto:fedor.sergeev@azul.com">fedor.sergeev@azul.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<br>
<br>
<div class="gmail-m_-4863800615731490931moz-cite-prefix">On 3/2/19 2:38 AM, Hiroshi Yamauchi
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div style="font-family:arial,helvetica,sans-serif">
<div style="font-family:Arial,Helvetica,sans-serif"><span style="font-family:arial,helvetica,sans-serif">Here's a sketch of the
proposed approach for just one pass<span class="gmail_default"> (but imagine
more)</span></span></div>
<div style="font-family:Arial,Helvetica,sans-serif"><span style="font-family:arial,helvetica,sans-serif"><br>
</span></div>
<div style="font-family:Arial,Helvetica,sans-serif"><span style="font-family:arial,helvetica,sans-serif"><a href="https://reviews.llvm.org/D58845" target="_blank">https://reviews.llvm.org/D58845</a></span></div>
<div style="font-family:Arial,Helvetica,sans-serif"><br>
</div>
</div>
</div>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Fri, Mar
1, 2019 at 12:54 PM Fedor Sergeev via
llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> On 2/28/19 12:47 AM,
Hiroshi Yamauchi via llvm-dev wrote:<br>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">Hi
all,</span>
<div><br>
</div>
<div>To implement more
profile-guided optimizations, we’d
like to use ProfileSummaryInfo
(PSI) and BlockFrequencyInfo (BFI)
from more passes of various types<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">,
under the new pass manager.</span></div>
<div><br>
</div>
<div>
<div style="font-family:arial,helvetica,sans-serif">The
following is what we came up
with. Would appreciate feedback.
Thanks.</div>
<div style="font-family:arial,helvetica,sans-serif"><br>
</div>
Issue<br>
<br>
It’s not obvious (to me) how to
best do this, given that we cannot
request an outer-scope analysis
result from an inner-scope pass
through analysis managers [1] and
that we might unnecessarily
running some analyses unless we
conditionally build pass pipelines
for PGO cases.<br>
</div>
</div>
</div>
</blockquote>
Indeed, this is an intentional restriction
in new pass manager, which is more or less
a reflection of a fundamental property of
outer-inner IRUnit relationship<br>
and transformations/analyses run on those
units. The main intent for having those
inner IRUnits (e.g. Loops) is to run local
transformations and save compile time<br>
on being local to a particular small piece
of IR. Loop Pass manager allows you to run
a whole pipeline of different
transformations still locally, amplifying
the save.<br>
As soon as you run function-level analysis
from within the loop pipeline you
essentially break this pipelining.<br>
Say, as you run your loop transformation
it modifies the loop (and the function)
and potentially invalidates the analysis,<br>
so you have to rerun your analysis again
and again. Hence instead of saving on
compile time it ends up increasing it.<br>
</div>
</blockquote>
<div><br>
</div>
<div>
<div style="font-family:arial,helvetica,sans-serif">Exactly.</div>
</div>
<div style="font-family:arial,helvetica,sans-serif">
<div><br>
</div>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> <br>
I have hit this issue somewhat recently
with dependency of loop passes on
BranchProbabilityInfo.<br>
(some loop passes, like IRCE can use it
for profitability analysis). </div>
</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> The only solution
that appears to be reasonable there is to
teach all the loops passes that need to be
pipelined<br>
to preserve BPI (or any other
module/function-level analyses) similar to
how they preserve DominatorTree and<br>
other "LoopStandard" analyses.<br>
</div>
</blockquote>
<div><br>
</div>
<div>
<div style="font-family:arial,helvetica,sans-serif">Is
this implemented - do the loop passes
preserve BPI?</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
Nope, not implemented right now.<br>
One of the problems is that even loop canonicalization passes run at
the start of loop pass manager dont preserve it<br>
(and at least LoopSimplifyCFG does change control flow).<br>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div class="gmail_quote">
<div style="font-family:arial,helvetica,sans-serif"><br>
</div>
<div>
<div style="font-family:arial,helvetica,sans-serif"><font face="arial, helvetica, sans-serif">In
buildFunctionSimplificationPipeline
(where LoopFullUnrollPass is added as in
the sketch), </font>LateLoopOptimizationsEPCallbacks
and LoopOptimizerEndEPCallbacks seem to
allow some arbitrary loop passes to be
inserted into the pipelines (via flags)?</div>
</div>
<div><br>
</div>
<div>
<div><span style="font-family:arial,helvetica,sans-serif">I
wonder <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">how
hard </span>it'd be to teach all the <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">relevant
</span>loop passes to preserve BFI<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">
(or BPI)</span><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"></span><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"></span><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"></span><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"></span><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">..</span></span></div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
Well, each time you restructure control flow around the loops you
will have to update those extra analyses,<br>
pretty much the same way as DT is being updated through
DomTreeUpdater.<br>
The trick is to design a proper update interface (and then implement
it ;) ).<br>
And I have not spent enough time on this issue to get a good idea of
what that interface would be.<br></div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Hm, sounds non-trivial :) noting BFI depends on BPI.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF">
<br>
regards,<br>
Fedor.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> <br>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div>It seems that for different
types of passes to be able to get
PSI and BFI, we’d need to ensure
PSI is cached for a non-module
pass, and PSI, BFI and the
ModuleAnalysisManager proxy are
cached for a loop pass in the pass
pipelines. This may mean
potentially needing to insert
BFI/PSI in front of many passes
[2]. It seems not obvious how to
conditionally insert BFI for PGO
pipelines because there isn’t
always a good flag to detect PGO
cases [3] or we tend to build pass
pipelines before examining the
code (or without propagating
enough info down) [4].<br>
<br>
Proposed approach<br>
<br>
- Cache PSI right after the
profile summary in the IR is
written in the pass pipeline [5].
This would avoid the need to
insert RequiredAnalysisPass for
PSI before each non-module pass
that needs it. PSI can be
technically invalidated but
unlikely. If it does, we insert
another RequiredAnalysisPass<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> <span style="font-family:Arial,Helvetica,sans-serif">[6].</span></span><br>
<br>
- Conditionally insert
RequireAnalysisPass for BFI, if
PGO, right before each loop pass
that needs it. This doesn't seem
avoidable because BFI can be
invalidated whenever the CFG
changes. We detect PGO based on
the command line flags and<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">/or</span>
whether the module has the profile
summary info (we may need to pass
the module to more functions.)<br>
<br>
- Add a new proxy
ModuleAnalysisManagerLoopProxy for
a loop pass to be able to get to
the ModuleAnalysisManager in one
step and PSI through it.<br>
<br>
Alternative approaches<br>
<br>
Dropping BFI and use PSI only<br>
We could consider not using BFI
and solely relying on PSI and
function-level profiles only (as
opposed to block-level), but
profile precision would suffer.<br>
<br>
Computing BFI in-place<br>
We could consider computing BFI
“in-place” by directly running BFI
outside of the pass manager [7].
This would let us avoid using the
analysis manager constraints but
it would still involve running an
outer-scope analysis from an
inner-scope pass and potentially
cause problems in terms of pass
pipelining and concurrency.
Moreover, a potential downside of
running analyses in-place is that
it won’t take advantage of cached
analysis results provided by the
pass manager.<br>
<br>
Adding inner-scope versions of PSI
and BFI<br>
We could consider adding a
function-level and loop-level PSI
and loop-level BFI, which
internally act like their
outer-scope versions but provide
inner-scope results only. This
way, we could always call
getResult for PSI and BFI.
However, this would still involve
running an outer-scope analysis
from an inner-scope pass.<br>
<br>
Caching the FAM and the MAM
proxies<br>
We could consider caching the
FunctionalAnalysisManager and the
ModuleAnalysisManager proxies once
early on instead of adding a new
proxy. But it seems to not likely
work well because the analysis
cache key type includes the
function or the module and some
pass may add a new function for
which the proxy wouldn’t be
cached. We’d need to write and
insert a pass in select locations
to just fill the cache. Adding the
new proxy would take care of these
with a three-line change.<br>
<br>
Conditional BFI<br>
We could consider adding a
conditional BFI analysis that is a
wrapper around BFI and computes
BFI only if profiles are available
(either checking the module has
profile summary or depend on the
PSI.) With this, we wouldn’t need
to conditionally build pass
pipelines and may work for the new
pass manager. But a similar
wouldn’t work for the old pass
manager because we cannot
conditionally depend on an
analysis under it.<br>
</div>
</div>
</div>
</blockquote>
There is LazyBlockFrequencyInfo.<br>
Not sure how well it fits this idea.<br>
</div>
</blockquote>
<div><br>
</div>
<div>
<div><font face="arial, helvetica, sans-serif">Good
point. LazyBlockFrequencyInfo seems
usable with the old pass manager (save
unnecessary BFI/BPI) and would work for
function passes. I think t</font><span style="font-family:arial,helvetica,sans-serif">he </span><span style="font-family:arial,helvetica,sans-serif">restriction still applies
- </span><span style="font-family:arial,helvetica,sans-serif">a
loop pass cannot still request
(outer-scope) BFI, lazy or not, new or
old (pass manager). Another assumption
is that </span><span style="font-family:arial,helvetica,sans-serif">it'd
be cheap and safe to unconditionally
depend on PSI or check the module's
profile summary.</span></div>
</div>
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> <br>
regards,<br>
Fedor.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div><br>
<br>
[1] We cannot call
AnalysisManager::getResult for an
outer scope but only
getCachedResult. Probably because
of potential pipelining or
concurrency issues.<br>
[2] For example, potentially
breaking up multiple pipelined
loop passes and insert
RequireAnalysisPass<BlockFrequencyAnalysis>
in front of each of them.<br>
[3] For example,
-fprofile-instr-use and
-fprofile-sample-use aren’t
present in ThinLTO post link
builds.<br>
[4] For example, we could check
whether the module has the profile
summary metadata annotated when
building pass pipelines but we
don’t always pass the module down
to the place where we build pass
pipelines.<br>
[5] By inserting
RequireAnalysisPass<ProfileSummaryInfo>
after the PGOInstrumentationUse
and the SampleProfileLoaderPass
passes (and around the
PGOIndirectCallPromotion pass for
the Thin LTO post link pipeline.)<br>
[6] For example, the
context-sensitive PGO<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">.</span><br>
[7] Directly calling its
constructor along with the
dependent analyses results, eg.
the jump threading pass.</div>
</div>
</div>
<br>
<fieldset class="gmail-m_-4863800615731490931m_3084483561612773275gmail-m_5869029522365402437mimeAttachmentHeader"></fieldset>
<pre class="gmail-m_-4863800615731490931m_3084483561612773275gmail-m_5869029522365402437moz-quote-pre">_______________________________________________
LLVM Developers mailing list
<a class="gmail-m_-4863800615731490931m_3084483561612773275gmail-m_5869029522365402437moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>
<a class="gmail-m_-4863800615731490931m_3084483561612773275gmail-m_5869029522365402437moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
</blockquote>
<br>
</div>
_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
</blockquote>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote></div></div>