<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<br>
<div class="moz-cite-prefix">On 3/4/19 10:49 PM, Hiroshi Yamauchi
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAASM7NKbDicFKisqqw+CAQoisXtT3sR-f4CtmuPdh+G=5Xn9dg@mail.gmail.com">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr">
<div class="gmail_default"
style="font-family:arial,helvetica,sans-serif"><br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Mar 4, 2019 at 10:55
AM Hiroshi Yamauchi <<a href="mailto:yamauchi@google.com"
moz-do-not-send="true">yamauchi@google.com</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div dir="ltr">
<div style="font-family:arial,helvetica,sans-serif"><br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Sat, Mar 2, 2019 at
12:58 AM Fedor Sergeev <<a
href="mailto:fedor.sergeev@azul.com" target="_blank"
moz-do-not-send="true">fedor.sergeev@azul.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px
0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> <br>
<br>
<div
class="gmail-m_-7727846899844856475gmail-m_-4863800615731490931moz-cite-prefix">On
3/2/19 2:38 AM, Hiroshi Yamauchi wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div
style="font-family:arial,helvetica,sans-serif">
<div
style="font-family:Arial,Helvetica,sans-serif"><span
style="font-family:arial,helvetica,sans-serif">Here's a sketch of the
proposed approach for
just one pass<span
class="gmail_default">
(but imagine more)</span></span></div>
<div
style="font-family:Arial,Helvetica,sans-serif"><span
style="font-family:arial,helvetica,sans-serif"><br>
</span></div>
<div
style="font-family:Arial,Helvetica,sans-serif"><span
style="font-family:arial,helvetica,sans-serif"><a
href="https://reviews.llvm.org/D58845"
target="_blank"
moz-do-not-send="true">https://reviews.llvm.org/D58845</a></span></div>
<div
style="font-family:Arial,Helvetica,sans-serif"><br>
</div>
</div>
</div>
<div class="gmail_quote">
<div dir="ltr"
class="gmail_attr">On Fri,
Mar 1, 2019 at 12:54 PM
Fedor Sergeev via llvm-dev
<<a
href="mailto:llvm-dev@lists.llvm.org"
target="_blank"
moz-do-not-send="true">llvm-dev@lists.llvm.org</a>>
wrote:<br>
</div>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> On
2/28/19 12:47 AM, Hiroshi
Yamauchi via llvm-dev
wrote:<br>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr"><span
class="gmail_default"
style="font-family:arial,helvetica,sans-serif">Hi all,</span>
<div><br>
</div>
<div>To implement
more
profile-guided
optimizations,
we’d like to use
ProfileSummaryInfo
(PSI) and
BlockFrequencyInfo
(BFI) from more
passes of various
types<span
class="gmail_default"
style="font-family:arial,helvetica,sans-serif">, under the new pass
manager.</span></div>
<div><br>
</div>
<div>
<div
style="font-family:arial,helvetica,sans-serif">The
following is
what we came up
with. Would
appreciate
feedback.
Thanks.</div>
<div
style="font-family:arial,helvetica,sans-serif"><br>
</div>
Issue<br>
<br>
It’s not obvious
(to me) how to
best do this,
given that we
cannot request an
outer-scope
analysis result
from an
inner-scope pass
through analysis
managers [1] and
that we might
unnecessarily
running some
analyses unless we
conditionally
build pass
pipelines for PGO
cases.<br>
</div>
</div>
</div>
</blockquote>
Indeed, this is an
intentional restriction in
new pass manager, which is
more or less a reflection
of a fundamental property
of outer-inner IRUnit
relationship<br>
and
transformations/analyses
run on those units. The
main intent for having
those inner IRUnits (e.g.
Loops) is to run local
transformations and save
compile time<br>
on being local to a
particular small piece of
IR. Loop Pass manager
allows you to run a whole
pipeline of different
transformations still
locally, amplifying the
save.<br>
As soon as you run
function-level analysis
from within the loop
pipeline you essentially
break this pipelining.<br>
Say, as you run your loop
transformation it modifies
the loop (and the
function) and potentially
invalidates the analysis,<br>
so you have to rerun your
analysis again and again.
Hence instead of saving on
compile time it ends up
increasing it.<br>
</div>
</blockquote>
<div><br>
</div>
<div>
<div
style="font-family:arial,helvetica,sans-serif">Exactly.</div>
</div>
<div
style="font-family:arial,helvetica,sans-serif">
<div><br>
</div>
</div>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> <br>
I have hit this issue
somewhat recently with
dependency of loop passes
on BranchProbabilityInfo.<br>
(some loop passes, like
IRCE can use it for
profitability analysis). </div>
</blockquote>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> The
only solution that appears
to be reasonable there is
to teach all the loops
passes that need to be
pipelined<br>
to preserve BPI (or any
other
module/function-level
analyses) similar to how
they preserve
DominatorTree and<br>
other "LoopStandard"
analyses.<br>
</div>
</blockquote>
<div><br>
</div>
<div>
<div
style="font-family:arial,helvetica,sans-serif">Is
this implemented - do the
loop passes preserve BPI?</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
Nope, not implemented right now.<br>
One of the problems is that even loop
canonicalization passes run at the start of loop
pass manager dont preserve it<br>
(and at least LoopSimplifyCFG does change control
flow).<br>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div class="gmail_quote">
<div
style="font-family:arial,helvetica,sans-serif"><br>
</div>
<div>
<div
style="font-family:arial,helvetica,sans-serif"><font
face="arial, helvetica,
sans-serif">In
buildFunctionSimplificationPipeline
(where LoopFullUnrollPass is added as in the sketch), </font>LateLoopOptimizationsEPCallbacks
and LoopOptimizerEndEPCallbacks seem to allow some arbitrary loop passes
to be inserted into the
pipelines (via flags)?</div>
</div>
<div><br>
</div>
<div>
<div><span
style="font-family:arial,helvetica,sans-serif">I
wonder <span
class="gmail_default"
style="font-family:arial,helvetica,sans-serif">how hard </span>it'd be
to teach all the <span
class="gmail_default"
style="font-family:arial,helvetica,sans-serif">relevant </span>loop
passes to preserve BFI<span
class="gmail_default"
style="font-family:arial,helvetica,sans-serif"> (or BPI)</span><span
class="gmail_default"
style="font-family:arial,helvetica,sans-serif"></span><span
class="gmail_default"
style="font-family:arial,helvetica,sans-serif"></span><span
class="gmail_default"
style="font-family:arial,helvetica,sans-serif"></span><span
class="gmail_default"
style="font-family:arial,helvetica,sans-serif"></span><span
class="gmail_default"
style="font-family:arial,helvetica,sans-serif">..</span></span></div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
Well, each time you restructure control flow around
the loops you will have to update those extra
analyses,<br>
pretty much the same way as DT is being updated
through DomTreeUpdater.<br>
The trick is to design a proper update interface
(and then implement it ;) ).<br>
And I have not spent enough time on this issue to
get a good idea of what that interface would be.<br>
</div>
</blockquote>
<div><br>
</div>
<div>
<div style="font-family:arial,helvetica,sans-serif">Hm,
sounds non-trivial :) noting BFI depends on BPI.</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>
<div class="gmail_default"
style="font-family:arial,helvetica,sans-serif">To step
back, it looks like:</div>
<div class="gmail_default"
style="font-family:arial,helvetica,sans-serif"><br>
</div>
<div class="gmail_default"
style="font-family:arial,helvetica,sans-serif">want to use
profiles from more passes -> need to get BFI (from loop
passes) -> need all the loop passes to preserve BFI.</div>
</div>
<div class="gmail_default"
style="font-family:arial,helvetica,sans-serif"><br>
</div>
<div class="gmail_default"
style="font-family:arial,helvetica,sans-serif">I wonder if
there's no way around this.</div>
</div>
</div>
</blockquote>
Indeed. I believe this is a general consensus here.<br>
<br>
regards,<br>
Fedor.<br>
<br>
<blockquote type="cite"
cite="mid:CAASM7NKbDicFKisqqw+CAQoisXtT3sR-f4CtmuPdh+G=5Xn9dg@mail.gmail.com">
<div dir="ltr">
<div class="gmail_quote">
<div class="gmail_default"
style="font-family:arial,helvetica,sans-serif"><br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div class="gmail_quote">
<div>
<div style="font-family:arial,helvetica,sans-serif"><br>
</div>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px
0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> <br>
regards,<br>
Fedor.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div class="gmail_quote">
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> <br>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div>It seems that
for different
types of passes to
be able to get PSI
and BFI, we’d need
to ensure PSI is
cached for a
non-module pass,
and PSI, BFI and
the
ModuleAnalysisManager
proxy are cached
for a loop pass in
the pass
pipelines. This
may mean
potentially
needing to insert
BFI/PSI in front
of many passes
[2]. It seems not
obvious how to
conditionally
insert BFI for PGO
pipelines because
there isn’t always
a good flag to
detect PGO cases
[3] or we tend to
build pass
pipelines before
examining the code
(or without
propagating enough
info down) [4].<br>
<br>
Proposed approach<br>
<br>
- Cache PSI right
after the profile
summary in the IR
is written in the
pass pipeline [5].
This would avoid
the need to insert
RequiredAnalysisPass for PSI before each non-module pass that needs it.
PSI can be
technically
invalidated but
unlikely. If it
does, we insert
another
RequiredAnalysisPass<span
class="gmail_default" style="font-family:arial,helvetica,sans-serif"> <span
style="font-family:Arial,Helvetica,sans-serif">[6].</span></span><br>
<br>
- Conditionally
insert
RequireAnalysisPass
for BFI, if PGO,
right before each
loop pass that
needs it. This
doesn't seem
avoidable because
BFI can be
invalidated
whenever the CFG
changes. We detect
PGO based on the
command line flags
and<span
class="gmail_default"
style="font-family:arial,helvetica,sans-serif">/or</span> whether the
module has the
profile summary
info (we may need
to pass the module
to more
functions.)<br>
<br>
- Add a new proxy
ModuleAnalysisManagerLoopProxy for a loop pass to be able to get to the
ModuleAnalysisManager in one step and PSI through it.<br>
<br>
Alternative
approaches<br>
<br>
Dropping BFI and
use PSI only<br>
We could consider
not using BFI and
solely relying on
PSI and
function-level
profiles only (as
opposed to
block-level), but
profile precision
would suffer.<br>
<br>
Computing BFI
in-place<br>
We could consider
computing BFI
“in-place” by
directly running
BFI outside of the
pass manager [7].
This would let us
avoid using the
analysis manager
constraints but it
would still
involve running an
outer-scope
analysis from an
inner-scope pass
and potentially
cause problems in
terms of pass
pipelining and
concurrency.
Moreover, a
potential downside
of running
analyses in-place
is that it won’t
take advantage of
cached analysis
results provided
by the pass
manager.<br>
<br>
Adding inner-scope
versions of PSI
and BFI<br>
We could consider
adding a
function-level and
loop-level PSI and
loop-level BFI,
which internally
act like their
outer-scope
versions but
provide
inner-scope
results only. This
way, we could
always call
getResult for PSI
and BFI. However,
this would still
involve running an
outer-scope
analysis from an
inner-scope pass.<br>
<br>
Caching the FAM
and the MAM
proxies<br>
We could consider
caching the
FunctionalAnalysisManager
and the
ModuleAnalysisManager
proxies once early
on instead of
adding a new
proxy. But it
seems to not
likely work well
because the
analysis cache key
type includes the
function or the
module and some
pass may add a new
function for which
the proxy wouldn’t
be cached. We’d
need to write and
insert a pass in
select locations
to just fill the
cache. Adding the
new proxy would
take care of these
with a three-line
change.<br>
<br>
Conditional BFI<br>
We could consider
adding a
conditional BFI
analysis that is a
wrapper around BFI
and computes BFI
only if profiles
are available
(either checking
the module has
profile summary or
depend on the
PSI.) With this,
we wouldn’t need
to conditionally
build pass
pipelines and may
work for the new
pass manager. But
a similar wouldn’t
work for the old
pass manager
because we cannot
conditionally
depend on an
analysis under it.<br>
</div>
</div>
</div>
</blockquote>
There is
LazyBlockFrequencyInfo.<br>
Not sure how well it fits
this idea.<br>
</div>
</blockquote>
<div><br>
</div>
<div>
<div><font face="arial,
helvetica, sans-serif">Good
point.
LazyBlockFrequencyInfo
seems usable with the
old pass manager (save
unnecessary BFI/BPI) and
would work for function
passes. I think t</font><span
style="font-family:arial,helvetica,sans-serif">he </span><span
style="font-family:arial,helvetica,sans-serif">restriction
still applies - </span><span
style="font-family:arial,helvetica,sans-serif">a loop pass cannot still
request (outer-scope)
BFI, lazy or not, new or
old (pass manager).
Another assumption is
that </span><span
style="font-family:arial,helvetica,sans-serif">it'd
be cheap and safe to
unconditionally depend
on PSI or check the
module's profile
summary.</span></div>
</div>
<div><br>
</div>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> <br>
regards,<br>
Fedor.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div><br>
<br>
[1] We cannot call
AnalysisManager::getResult for an outer scope but only getCachedResult.
Probably because
of potential
pipelining or
concurrency
issues.<br>
[2] For example,
potentially
breaking up
multiple pipelined
loop passes and
insert
RequireAnalysisPass<BlockFrequencyAnalysis>
in front of each
of them.<br>
[3] For example,
-fprofile-instr-use
and
-fprofile-sample-use
aren’t present in
ThinLTO post link
builds.<br>
[4] For example,
we could check
whether the module
has the profile
summary metadata
annotated when
building pass
pipelines but we
don’t always pass
the module down to
the place where we
build pass
pipelines.<br>
[5] By inserting
RequireAnalysisPass<ProfileSummaryInfo>
after the
PGOInstrumentationUse
and the
SampleProfileLoaderPass
passes (and around
the
PGOIndirectCallPromotion
pass for the Thin
LTO post link
pipeline.)<br>
[6] For example,
the
context-sensitive
PGO<span
class="gmail_default"
style="font-family:arial,helvetica,sans-serif">.</span><br>
[7] Directly
calling its
constructor along
with the dependent
analyses results,
eg. the jump
threading pass.</div>
</div>
</div>
<br>
<fieldset
class="gmail-m_-7727846899844856475gmail-m_-4863800615731490931m_3084483561612773275gmail-m_5869029522365402437mimeAttachmentHeader"></fieldset>
<pre class="gmail-m_-7727846899844856475gmail-m_-4863800615731490931m_3084483561612773275gmail-m_5869029522365402437moz-quote-pre">_______________________________________________
LLVM Developers mailing list
<a class="gmail-m_-7727846899844856475gmail-m_-4863800615731490931m_3084483561612773275gmail-m_5869029522365402437moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" target="_blank" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>
<a class="gmail-m_-7727846899844856475gmail-m_-4863800615731490931m_3084483561612773275gmail-m_5869029522365402437moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
</blockquote>
<br>
</div>
_______________________________________________<br>
LLVM Developers mailing list<br>
<a
href="mailto:llvm-dev@lists.llvm.org"
target="_blank"
moz-do-not-send="true">llvm-dev@lists.llvm.org</a><br>
<a
href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"
rel="noreferrer"
target="_blank"
moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
</blockquote>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
<br>
</body>
</html>