<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Mar 13, 2019 at 2:37 PM Fedor Sergeev <<a href="mailto:fedor.sergeev@azul.com">fedor.sergeev@azul.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF">
    Overall seems fine to me.<br>
    <br>
    <div class="gmail-m_6258983735616516850moz-cite-prefix">On 3/11/19 8:12 PM, Hiroshi Yamauchi
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div dir="ltr">
          <div dir="ltr">
            <div dir="ltr">
              <div dir="ltr">
                <div dir="ltr">
                  <div dir="ltr">
                    <div style="font-family:arial,helvetica,sans-serif">Here's
                      a revised approach based on the discussion:</div>
                    <div><font face="arial,
                        helvetica, sans-serif"><br>
                      </font></div>
                    <div><font face="arial,
                        helvetica, sans-serif">- Cache PSI right after
                        the profile summary in the IR is written in the
                        pass pipeline. This would avoid the need to
                        insert RequireAnalysisPass for PSI before each
                        non-module pass that needs it. PSI can
                        technically be invalidated, but that is unlikely because PSI is
                        immutable. If it is, we can insert another
                        RequireAnalysisPass.<br>
                      </font></div>
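                    <div><font face="arial, helvetica, sans-serif">A toy model of the caching behaviour described in the bullet above may help. It is only a sketch: ToyAnalysisManager, ToyPSIResult and ToyDomTreeResult are hypothetical stand-ins, not LLVM classes, and the string keys are an assumption made to keep the example self-contained. It shows that a result whose invalidate hook returns false stays cached once it has been computed.</font></div>

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a cached analysis result (not an LLVM type).
struct ToyResult {
  virtual ~ToyResult() = default;
  // Returns true if the cached result must be dropped after a transformation.
  virtual bool invalidate() const = 0;
};

// Models an immutable analysis such as PSI: it never asks to be dropped.
struct ToyPSIResult : ToyResult {
  bool invalidate() const override { return false; }
};

// Models an analysis that a transformation can invalidate.
struct ToyDomTreeResult : ToyResult {
  bool invalidate() const override { return true; }
};

// Hypothetical, heavily simplified analysis manager keyed by analysis name.
class ToyAnalysisManager {
  std::map<std::string, std::unique_ptr<ToyResult>> Cache;

public:
  // Plays the role of running a "require analysis" pass: compute and cache.
  void cache(const std::string &Name, std::unique_ptr<ToyResult> R) {
    Cache[Name] = std::move(R);
  }

  // Cached-only lookup; returns nullptr if the analysis was never run.
  const ToyResult *getCachedResult(const std::string &Name) const {
    auto It = Cache.find(Name);
    return It == Cache.end() ? nullptr : It->second.get();
  }

  // Drops every cached result whose invalidate hook returns true.
  void invalidateAll() {
    for (auto It = Cache.begin(); It != Cache.end();) {
      if (It->second->invalidate())
        It = Cache.erase(It);
      else
        ++It;
    }
  }
};
```

                    <div><font face="arial, helvetica, sans-serif">In this model, caching "psi" once right after the profile summary is written is enough for every later cached-only lookup to succeed, even across invalidateAll().</font></div>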
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <font face="arial, helvetica, sans-serif">ProfileSummaryInfo::invalidate
      always returns false, so it does not need any extra handling<br>
      (as soon as it finds its way into ModuleAnalysisManager).<br></font></div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Right, as long as there happens to be a pass that always runs PSI before any pass that expects it to be cached, it'd be fine. This is just to be extra safe: PSI is explicitly run and cached right after the pass that writes the profile summary to the IR, so that there's no window where PSI may not be cached.</div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF"><blockquote type="cite"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">
                    <div><font face="arial,
                        helvetica, sans-serif">- If PGO, conditionally
                        request BFI from the passes that need it. For
                        (pipelined) loop passes, we need to i</font><span style="font-family:arial,helvetica,sans-serif">nsert
                        a pass that computes BFI conditionally (if PGO)
                        in front of them and make them preserve BFI
                        through. This is to avoid pipeline interruptions
                        and </span><font face="arial, helvetica,
                        sans-serif">potential </font><span style="font-family:arial,helvetica,sans-serif">invalidation/recomputation
                        of BFI between the loop passes. We detect PGO
                        based on whether PSI has profile summary info.
                        (For the old pass manager, implement a similar
                        approach by using LazyBlockFrequencyInfo.)</span></div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    There is already an optional analysis in LoopStandardAnalysisResults
    - MemorySSA.<br>
    We can do the same for BFI/BPI.<br>
    And, yes - preserving those through loop passes is a cornerstone of
    this approach.<br>
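    <div>A minimal sketch of the "optional analysis" idea above, under stated assumptions: ToyPSI, ToyBFI, ToyLoopStandardResults and both functions are hypothetical stand-ins and not LLVM's API. The optional analysis is modelled as a nullable pointer that is filled in only when a profile summary exists, and the loop pass falls back gracefully when it is null.</div>

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for PSI: only records whether a profile summary exists.
struct ToyPSI {
  bool HasSummary = false;
  bool hasProfileSummary() const { return HasSummary; }
};

// Hypothetical stand-in for BFI.
struct ToyBFI {
  unsigned EntryFreq = 0;
};

// Models a bundle of analyses handed to every loop pass; the optional one
// is a nullable pointer (assumption made for this sketch).
struct ToyLoopStandardResults {
  ToyBFI *BFI = nullptr; // set only for PGO builds
};

// Computes BFI only when PSI reports a profile summary (the "if PGO" check).
std::unique_ptr<ToyBFI> computeBFIIfPGO(const ToyPSI &PSI) {
  if (!PSI.hasProfileSummary())
    return nullptr;
  auto BFI = std::make_unique<ToyBFI>();
  BFI->EntryFreq = 100; // placeholder value for the sketch
  return BFI;
}

// A loop pass that uses BFI when present and otherwise reports that it
// had to fall back to non-profile heuristics.
bool toyLoopPassUsesProfile(const ToyLoopStandardResults &AR) {
  return AR.BFI != nullptr && AR.BFI->EntryFreq > 0;
}
```

    <div>In this model, a non-PGO build never pays for computing BFI, and the loop pass only needs a null check to stay correct in both configurations.</div>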
    <blockquote type="cite">
      <div dir="ltr">
        <div dir="ltr">
          <div dir="ltr">
            <div dir="ltr">
              <div dir="ltr">
                <div dir="ltr">
                  <div dir="ltr">
                    <div><font face="arial,
                        helvetica, sans-serif"><br>
                      </font></div>
                    <div><font face="arial,
                        helvetica, sans-serif">- Add a new proxy
                        ModuleAnalysisManagerLoopProxy for a loop pass
                        to be able to get to the ModuleAnalysisManager
                        in one step and PSI through it.</font></div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <font face="arial, helvetica, sans-serif">This is just a
      compile-time optimization that saves one indirection through
      FunctionAnalysisManager.<br>
      I'm not even sure it is worth the effort, and it is definitely not
      crucial for the overall idea.<br></font></div></blockquote><div><br></div><div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">This should probably be clarified to something like:</div><br></div><div><div class="gmail_default"><div class="gmail_default"><font face="arial, helvetica, sans-serif">- Add a new proxy ModuleAnalysisManagerLoopProxy for a loop pass to be able to get to the ModuleAnalysisManager and PSI, because they may not always be reachable through the (const) FunctionAnalysisManager, </font><span style="font-family:arial,helvetica,sans-serif">unless ModuleAnalysisManagerFunctionProxy is already cached.</span></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"><br></font></div></div></div></div><div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Since the FunctionAnalysisManager we can get from LoopAnalysisManager is a const ref, we cannot call getResult on it, so we cannot </font><span style="font-family:arial,helvetica,sans-serif">always </span><span style="font-family:arial,helvetica,sans-serif">get ModuleAnalysisManager and</span><span style="font-family:arial,helvetica,sans-serif"> PSI (see below). 
This actually happens in my experiment.</span></div></div><div><br></div><div><font face="monospace, monospace">SomeLoopPass::run(Loop &L, LoopAnalysisManager &LAM, …) {</font></div><div><font face="monospace, monospace">  Function &F = *L.getHeader()->getParent();</font></div><div><font face="monospace, monospace">  auto &FAM = LAM.getResult<FunctionAnalysisManagerLoopProxy>(L, AR).getManager();</font></div><div><font face="monospace, monospace">  auto *MAMProxy = FAM.getCachedResult<ModuleAnalysisManagerFunctionProxy>(F); <b>// Can be null</b></font></div><div><font face="monospace, monospace">  if (MAMProxy) {</font></div><div><font face="monospace, monospace">    auto &MAM = MAMProxy->getManager();</font></div><div><font face="monospace, monospace">    auto *PSI = MAM.getCachedResult<ProfileSummaryAnalysis>(*F.getParent());</font></div><div><font face="monospace, monospace">  } else {</font></div><div><font face="monospace, monospace">    <b>// <span class="gmail_default">Can't get MAM and </span>PSI.</b></font></div><div><font face="monospace, monospace">  }</font></div><div><font face="monospace, monospace">  ...</font></div><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">-></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div><font face="monospace, monospace">SomeLoopPass::run(Loop &L, LoopAnalysisManager &LAM, …) {</font></div><div><font face="monospace, monospace">  Function &F = *L.getHeader()->getParent();</font></div><div><font face="monospace, monospace">  <span class="gmail_default">auto &MAM = LAM.getResult<ModuleAnalysisManagerLoopProxy>(L, AR).getManager();  </span><b>// <span class="gmail_default">Not</span> null</b></font></div><div><font face="monospace, monospace">  auto *PSI = MAM.getCachedResult<ProfileSummaryAnalysis>(*F.getParent());</font></div><div class="gmail_default"><font face="monospace, monospace">  ...</font></div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><div><div class="gmail_default"><div 
class="gmail_default"><font face="arial, helvetica, sans-serif">AFAICT, adding ModuleAnalysisManagerLoopProxy seems to be as simple as:</font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"><br></font></div><div style=""><div style=""><font face="monospace, monospace">/// A proxy from a \c ModuleAnalysisManager to a \c Loop.</font></div><div style=""><font face="monospace, monospace">typedef OuterAnalysisManagerProxy<ModuleAnalysisManager, Loop,</font></div><div style=""><font face="monospace, monospace">                                  LoopStandardAnalysisResults &></font></div><div style=""><font face="monospace, monospace">    ModuleAnalysisManagerLoopProxy;</font></div><div style="font-family:arial,helvetica,sans-serif"><br></div></div></div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF"><font face="arial, helvetica, sans-serif">
      <br>
      regards,<br>
        Fedor.<br>
    </font>
    <blockquote type="cite">
      <div dir="ltr">
        <div dir="ltr">
          <div dir="ltr">
            <div dir="ltr">
              <div dir="ltr">
                <div dir="ltr">
                  <div dir="ltr">
                    <div><font face="arial,
                        helvetica, sans-serif"><br>
                      </font></div>
                    <div><font face="arial,
                        helvetica, sans-serif"><br>
                      </font></div>
                    <div><font face="arial,
                        helvetica, sans-serif"><br>
                      </font></div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Mon, Mar 4, 2019 at 2:05 PM
          Fedor Sergeev <<a href="mailto:fedor.sergeev@azul.com" target="_blank">fedor.sergeev@azul.com</a>> wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div bgcolor="#FFFFFF"> <br>
            <br>
            <div class="gmail-m_6258983735616516850gmail-m_-1959058298805901267moz-cite-prefix">On
              3/4/19 10:49 PM, Hiroshi Yamauchi wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">
                <div dir="ltr">
                  <div style="font-family:arial,helvetica,sans-serif"><br>
                  </div>
                </div>
                <br>
                <div class="gmail_quote">
                  <div dir="ltr" class="gmail_attr">On Mon, Mar 4, 2019
                    at 10:55 AM Hiroshi Yamauchi <<a href="mailto:yamauchi@google.com" target="_blank">yamauchi@google.com</a>>
                    wrote:<br>
                  </div>
                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                    <div dir="ltr">
                      <div dir="ltr">
                        <div style="font-family:arial,helvetica,sans-serif"><br>
                        </div>
                      </div>
                      <br>
                      <div class="gmail_quote">
                        <div dir="ltr" class="gmail_attr">On Sat, Mar 2,
                          2019 at 12:58 AM Fedor Sergeev <<a href="mailto:fedor.sergeev@azul.com" target="_blank">fedor.sergeev@azul.com</a>>
                          wrote:<br>
                        </div>
                        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                          <div bgcolor="#FFFFFF"> <br>
                            <br>
                            <div class="gmail-m_6258983735616516850gmail-m_-1959058298805901267gmail-m_-7727846899844856475gmail-m_-4863800615731490931moz-cite-prefix">On
                              3/2/19 2:38 AM, Hiroshi Yamauchi wrote:<br>
                            </div>
                            <blockquote type="cite">
                              <div dir="ltr">
                                <div dir="ltr">
                                  <div dir="ltr">
                                    <div dir="ltr">
                                      <div dir="ltr">
                                        <div dir="ltr">
                                          <div dir="ltr">
                                            <div dir="ltr">
                                              <div dir="ltr">
                                                <div dir="ltr">
                                                  <div style="font-family:arial,helvetica,sans-serif">
                                                    <div style="font-family:Arial,Helvetica,sans-serif"><span style="font-family:arial,helvetica,sans-serif">Here's a sketch of the
                                                        proposed
                                                        approach for
                                                        just one pass<span class="gmail_default"> (but imagine more)</span></span></div>
                                                    <div style="font-family:Arial,Helvetica,sans-serif"><span style="font-family:arial,helvetica,sans-serif"><br>
                                                      </span></div>
                                                    <div style="font-family:Arial,Helvetica,sans-serif"><span style="font-family:arial,helvetica,sans-serif"><a href="https://reviews.llvm.org/D58845" target="_blank">https://reviews.llvm.org/D58845</a></span></div>
                                                    <div style="font-family:Arial,Helvetica,sans-serif"><br>
                                                    </div>
                                                  </div>
                                                </div>
                                                <div class="gmail_quote">
                                                  <div dir="ltr" class="gmail_attr">On
                                                    Fri, Mar 1, 2019 at
                                                    12:54 PM Fedor
                                                    Sergeev via llvm-dev
                                                    <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>
                                                    wrote:<br>
                                                  </div>
                                                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                                    <div bgcolor="#FFFFFF">
                                                      On 2/28/19 12:47
                                                      AM, Hiroshi
                                                      Yamauchi via
                                                      llvm-dev wrote:<br>
                                                      <blockquote type="cite">
                                                        <div dir="ltr">
                                                          <div dir="ltr"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">Hi
                                                          all,</span>
                                                          <div><br>
                                                          </div>
                                                          <div>To
                                                          implement more
                                                          profile-guided
                                                          optimizations,
                                                          we’d like to
                                                          use
                                                          ProfileSummaryInfo
                                                          (PSI) and
                                                          BlockFrequencyInfo
                                                          (BFI) from
                                                          more passes of
                                                          various types<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">,
                                                          under the new
                                                          pass manager.</span></div>
                                                          <div><br>
                                                          </div>
                                                          <div>
                                                          <div style="font-family:arial,helvetica,sans-serif">The
                                                          following is
                                                          what we came
                                                          up with. Would
                                                          appreciate
                                                          feedback.
                                                          Thanks.</div>
                                                          <div style="font-family:arial,helvetica,sans-serif"><br>
                                                          </div>
                                                          Issue<br>
                                                          <br>
                                                          It’s not
                                                          obvious (to
                                                          me) how to
                                                          best do this,
                                                          given that we
                                                          cannot request
                                                          an outer-scope
                                                          analysis
                                                          result from an
                                                          inner-scope
                                                          pass through
                                                          analysis
                                                          managers [1]
                                                          and that we
                                                          might be
                                                          unnecessarily
                                                          running some
                                                          analyses
                                                          unless we
                                                          conditionally
                                                          build pass
                                                          pipelines for
                                                          PGO cases.<br>
                                                          </div>
                                                          </div>
                                                        </div>
                                                      </blockquote>
                                                      Indeed, this is an
                                                      intentional
                                                      restriction in the new
                                                      pass manager,
                                                      which is more or
                                                      less a reflection
                                                      of a fundamental
                                                      property of the
                                                      outer-inner IRUnit
                                                      relationship<br>
                                                      and
                                                      transformations/analyses
                                                      run on those
                                                      units. The main
                                                      intent for having
                                                      those inner
                                                      IRUnits (e.g.
                                                      Loops) is to run
                                                      local
                                                      transformations
                                                      and save compile
                                                      time<br>
                                                      by being local to
                                                      a particular small
                                                      piece of IR. Loop
                                                      Pass manager
                                                      allows you to run
                                                      a whole pipeline
                                                      of different
                                                      transformations
                                                      still locally,
                                                      amplifying the
                                                      savings.<br>
                                                      As soon as you run a
                                                      function-level
                                                      analysis from
                                                      within the loop
                                                      pipeline you
                                                      essentially break
                                                      this pipelining.<br>
                                                      Say, as you run
                                                      your loop
                                                      transformation it
                                                      modifies the loop
                                                      (and the function)
                                                      and potentially
                                                      invalidates the
                                                      analysis,<br>
                                                      so you have to
                                                      rerun your
                                                      analysis again and
                                                      again. Hence
                                                      instead of saving
                                                      on compile time it
                                                      ends up increasing
                                                      it.<br>
                                                    </div>
                                                  </blockquote>
                                                  <div><br>
                                                  </div>
                                                  <div>
                                                    <div style="font-family:arial,helvetica,sans-serif">Exactly.</div>
                                                  </div>
                                                  <div style="font-family:arial,helvetica,sans-serif">
                                                    <div><br>
                                                    </div>
                                                  </div>
                                                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                                    <div bgcolor="#FFFFFF">
                                                      <br>
                                                      I have hit this
                                                      issue somewhat
                                                      recently with
                                                      dependency of loop
                                                      passes on
                                                      BranchProbabilityInfo.<br>
                                                      (some loop passes,
                                                      like IRCE, can use
                                                      it for
                                                      profitability
                                                      analysis). </div>
                                                  </blockquote>
                                                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                                    <div bgcolor="#FFFFFF">
                                                      The only solution
                                                      that appears to be
                                                      reasonable there
                                                      is to teach all
                                                      the loop passes
                                                      that need to be
                                                      pipelined<br>
                                                      to preserve BPI
                                                      (or any other
                                                      module/function-level
                                                      analyses) similar
                                                      to how they
                                                      preserve
                                                      DominatorTree and<br>
                                                      other
                                                      "LoopStandard"
                                                      analyses.<br>
                                                    </div>
                                                  </blockquote>
                                                  <div><br>
                                                  </div>
                                                  <div>
                                                    <div style="font-family:arial,helvetica,sans-serif">Is
                                                      this implemented -
                                                      do the loop passes
                                                      preserve BPI?</div>
                                                  </div>
                                                </div>
                                              </div>
                                            </div>
                                          </div>
                                        </div>
                                      </div>
                                    </div>
                                  </div>
                                </div>
                              </div>
                            </blockquote>
                            Nope, not implemented right now.<br>
                            One of the problems is that even loop
                            canonicalization passes run at the start of the
                            loop pass manager don't preserve it<br>
                            (and at least LoopSimplifyCFG does change
                            control flow).<br>
                            <blockquote type="cite">
                              <div dir="ltr">
                                <div dir="ltr">
                                  <div dir="ltr">
                                    <div dir="ltr">
                                      <div dir="ltr">
                                        <div dir="ltr">
                                          <div dir="ltr">
                                            <div dir="ltr">
                                              <div dir="ltr">
                                                <div class="gmail_quote">
                                                  <div style="font-family:arial,helvetica,sans-serif"><br>
                                                  </div>
                                                  <div>
                                                    <div style="font-family:arial,helvetica,sans-serif"><font face="arial,
                                                        helvetica,

                                                        sans-serif">In
                                                        buildFunctionSimplificationPipeline
(where LoopFullUnrollPass is added as in the sketch), </font>LateLoopOptimizationsEPCallbacks
and LoopOptimizerEndEPCallbacks seem to allow some arbitrary loop passes
                                                      to be inserted
                                                      into the pipelines
                                                      (via flags)?</div>
                                                  </div>
                                                  <div><br>
                                                  </div>
                                                  <div>
                                                    <div><span style="font-family:arial,helvetica,sans-serif">I
                                                        wonder <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">how hard </span>it'd be
                                                        to teach all the
                                                        <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">relevant </span>loop
                                                        passes to
                                                        preserve BFI<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">
                                                          (or BPI)</span><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">...</span></span></div>
                                                  </div>
                                                </div>
                                              </div>
                                            </div>
                                          </div>
                                        </div>
                                      </div>
                                    </div>
                                  </div>
                                </div>
                              </div>
                            </blockquote>
                            Well, each time you restructure control flow
                            around the loops you will have to update
                            those extra analyses,<br>
                            pretty much the same way as DT is being
                            updated through DomTreeUpdater.<br>
                            The trick is to design a proper update
                            interface (and then implement it ;) ).<br>
                            And I have not spent enough time on this
                            issue to get a good idea of what that
                            interface would be.<br>
                          </div>
                        </blockquote>
                        <div><br>
                        </div>
                        <div>
                          <div style="font-family:arial,helvetica,sans-serif">Hm,
                            sounds non-trivial :) noting BFI depends on
                            BPI.</div>
                        </div>
                      </div>
                    </div>
                  </blockquote>
                  <div><br>
                  </div>
                  <div>
                    <div style="font-family:arial,helvetica,sans-serif">To
                      step back, it looks like:</div>
                    <div style="font-family:arial,helvetica,sans-serif"><br>
                    </div>
                    <div style="font-family:arial,helvetica,sans-serif">want
                      to use profiles from more passes -> need to get
                      BFI (from loop passes) -> need all the loop
                      passes to preserve BFI.</div>
                  </div>
                  <div style="font-family:arial,helvetica,sans-serif"><br>
                  </div>
                  <div style="font-family:arial,helvetica,sans-serif">I
                    wonder if there's no way around this.</div>
                </div>
              </div>
            </blockquote>
            Indeed. I believe this is a general consensus here.<br>
            <br>
            regards,<br>
              Fedor.<br>
            <br>
            <blockquote type="cite">
              <div dir="ltr">
                <div class="gmail_quote">
                  <div style="font-family:arial,helvetica,sans-serif"><br>
                  </div>
                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                    <div dir="ltr">
                      <div class="gmail_quote">
                        <div>
                          <div style="font-family:arial,helvetica,sans-serif"><br>
                          </div>
                        </div>
                        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                          <div bgcolor="#FFFFFF"> <br>
                            regards,<br>
                              Fedor.<br>
                            <br>
                            <blockquote type="cite">
                              <div dir="ltr">
                                <div dir="ltr">
                                  <div dir="ltr">
                                    <div dir="ltr">
                                      <div dir="ltr">
                                        <div dir="ltr">
                                          <div dir="ltr">
                                            <div dir="ltr">
                                              <div dir="ltr">
                                                <div class="gmail_quote">
                                                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                                    <div bgcolor="#FFFFFF">
                                                      <br>
                                                      <blockquote type="cite">
                                                        <div dir="ltr">
                                                          <div dir="ltr">
                                                          <div>It seems
                                                          that for
                                                          different
                                                          types of
                                                          passes to be
                                                          able to get
                                                          PSI and BFI,
                                                          we’d need to
                                                          ensure PSI is
                                                          cached for a
                                                          non-module
                                                          pass, and PSI,
                                                          BFI and the
                                                          ModuleAnalysisManager
                                                          proxy are
                                                          cached for a
                                                          loop pass in
                                                          the pass
                                                          pipelines.
                                                          This may mean
                                                          potentially
                                                          needing to
                                                          insert BFI/PSI
                                                          in front of
                                                          many passes
                                                          [2]. It is
                                                          not obvious
                                                          how to
                                                          conditionally
                                                          insert BFI for
                                                          PGO pipelines
                                                          because there
                                                          isn’t always a
                                                          good flag to
                                                          detect PGO
                                                          cases [3] or
                                                          we tend to
                                                          build pass
                                                          pipelines
                                                          before
                                                          examining the
                                                          code (or
                                                          without
                                                          propagating
                                                          enough info
                                                          down) [4].<br>
                                                          <br>
                                                          Proposed
                                                          approach<br>
                                                          <br>
                                                          - Cache PSI
                                                          right after
                                                          the profile
                                                          summary in the
                                                          IR is written
                                                          in the pass
                                                          pipeline [5].
                                                          This would
                                                          avoid the need
                                                          to insert
RequireAnalysisPass for PSI before each non-module pass that needs it.
                                                          PSI can
                                                          technically be
                                                          invalidated,
                                                          but this is unlikely.
                                                          If it is, we
                                                          insert another
RequireAnalysisPass<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> <span style="font-family:Arial,Helvetica,sans-serif">[6].</span></span><br>
                                                          <br>
                                                          -
                                                          Conditionally
                                                          insert
                                                          RequireAnalysisPass
                                                          for BFI, if
                                                          PGO, right
                                                          before each
                                                          loop pass that
                                                          needs it. This
                                                          doesn't seem
                                                          avoidable
                                                          because BFI
                                                          can be
                                                          invalidated
                                                          whenever the
                                                          CFG changes.
                                                          We detect PGO
                                                          based on the
                                                          command line
                                                          flags and<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">/or</span>
                                                          whether the
                                                          module has the
                                                          profile
                                                          summary info
                                                          (we may need
                                                          to pass the
                                                          module to more
                                                          functions).<br>
                                                          <br>
                                                          - Add a new
                                                          proxy
ModuleAnalysisManagerLoopProxy for a loop pass to be able to get to the
ModuleAnalysisManager in one step and PSI through it.<br>
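                                                          <br>
                                                          A minimal sketch of the caching constraint behind this approach, using a toy model rather than LLVM's real classes. ToySummary, ToyModuleAnalysisManager and runInnerPass are hypothetical names, and the direct getResult call merely stands in for what a RequireAnalysisPass would do at module scope. It is meant to show why an inner-scope pass sees nothing until the outer-scope cache has been populated:<br>

```cpp
#include <optional>

// Hypothetical stand-in for a module-level analysis result such as PSI.
struct ToySummary {
  bool HasProfile;
};

// Toy module-level analysis manager: getResult computes and caches,
// getCachedResult only looks up and never computes.
class ToyModuleAnalysisManager {
  std::optional<ToySummary> Cache;
  int NumComputations = 0;

public:
  // Outer-scope entry point: compute on a miss, then serve from the cache.
  const ToySummary &getResult(bool ModuleHasProfile) {
    if (!Cache) {
      Cache = ToySummary{ModuleHasProfile};
      ++NumComputations;
    }
    return *Cache;
  }

  // Lookup only; returns nullptr when nothing has been cached.
  const ToySummary *getCachedResult() const {
    return Cache ? &*Cache : nullptr;
  }

  // Models invalidation of the cached result.
  void invalidate() { Cache.reset(); }

  int numComputations() const { return NumComputations; }
};

// An inner-scope (function or loop) pass in this model may only query the
// cache. Returns true if the pass could make use of profile data.
inline bool runInnerPass(const ToyModuleAnalysisManager &MAM) {
  const ToySummary *PSI = MAM.getCachedResult();
  return PSI != nullptr && PSI->HasProfile;
}
```

                                                          In this model, runInnerPass returns false until getResult has run once at module scope, and returns false again after invalidate(), which is the case where another RequireAnalysisPass would be needed.<br>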
                                                          <br>
                                                          Alternative
                                                          approaches<br>
                                                          <br>
                                                          Dropping BFI
                                                          and using PSI
                                                          only<br>
                                                          We could
                                                          consider not
                                                          using BFI and
                                                          solely relying
                                                          on PSI and
                                                          function-level
                                                          profiles only
                                                          (as opposed to
                                                          block-level),
                                                          but profile
                                                          precision
                                                          would suffer.<br>
                                                          <br>
                                                          Computing BFI
                                                          in-place<br>
                                                          We could
                                                          consider
                                                          computing BFI
                                                          “in-place” by
                                                          directly
                                                          running BFI
                                                          outside of the
                                                          pass manager
                                                          [7]. This
                                                          would let us
                                                          avoid
                                                          the analysis
                                                          manager
                                                          constraints,
                                                          but it would
                                                          still involve
                                                          running an
                                                          outer-scope
                                                          analysis from
                                                          an inner-scope
                                                          pass and
                                                          potentially
                                                          cause problems
                                                          in terms of
                                                          pass
                                                          pipelining and
                                                          concurrency.
                                                          Moreover, a
                                                          potential
                                                          downside of
                                                          running
                                                          analyses
                                                          in-place is
                                                          that it won’t
                                                          take advantage
                                                          of cached
                                                          analysis
                                                          results
                                                          provided by
                                                          the pass
                                                          manager.<br>
                                                          <br>
                                                          Adding
                                                          inner-scope
                                                          versions of
                                                          PSI and BFI<br>
                                                          We could
                                                          consider
                                                          adding a
                                                          function-level
                                                          and loop-level
                                                          PSI and
                                                          loop-level
                                                          BFI, which
                                                          internally act
                                                          like their
                                                          outer-scope
                                                          versions but
                                                          provide
                                                          inner-scope
                                                          results only.
                                                          This way, we
                                                          could always
                                                          call getResult
                                                          for PSI and
                                                          BFI. However,
                                                          this would
                                                          still involve
                                                          running an
                                                          outer-scope
                                                          analysis from
                                                          an inner-scope
                                                          pass.<br>
                                                          <br>
                                                          Caching the
                                                          FAM and the
                                                          MAM proxies<br>
                                                          We could
                                                          consider
                                                          caching the
                                                          FunctionAnalysisManager
                                                          and the
                                                          ModuleAnalysisManager
                                                          proxies once
                                                          early on
                                                          instead of
                                                          adding a new
                                                          proxy. But it
                                                          seems unlikely
                                                          to work
                                                          well because
                                                          the analysis
                                                          cache key type
                                                          includes the
                                                          function or
                                                          the module and
                                                          some pass may
                                                          add a new
                                                          function for
                                                          which the
                                                          proxy wouldn’t
                                                          be cached.
                                                          We’d need to
                                                          write and
                                                          insert a pass
                                                          in select
                                                          locations to
                                                          just fill the
                                                          cache. Adding
                                                          the new proxy
                                                          would take
                                                          care of these
                                                          with a
                                                          three-line
                                                          change.<br>
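                                                          <br>
                                                          To illustrate the cache-key point above, here is a small sketch with a toy cache keyed by analysis name and IR unit name. ToyAnalysisCache and the string keys are hypothetical and only loosely mirror how the real analysis managers key their results; the point is that an entry cached for one function says nothing about a function created later:<br>

```cpp
#include <map>
#include <string>
#include <utility>

// Toy cache keyed by (analysis name, IR unit name), loosely mirroring the
// observation that the analysis cache key includes the function or module.
class ToyAnalysisCache {
  std::map<std::pair<std::string, std::string>, int> Results;

public:
  // Record a result for one analysis on one IR unit.
  void cache(const std::string &Analysis, const std::string &Unit, int Value) {
    Results[std::make_pair(Analysis, Unit)] = Value;
  }

  // Lookup only; returns nullptr on a miss.
  const int *getCached(const std::string &Analysis,
                       const std::string &Unit) const {
    auto It = Results.find(std::make_pair(Analysis, Unit));
    return It == Results.end() ? nullptr : &It->second;
  }
};
```

                                                          If a proxy is cached for a function named "foo" and a later pass creates a new function, a lookup for the new function misses, so an early one-time fill of the cache would not cover it.<br>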
                                                          <br>
                                                          Conditional
                                                          BFI<br>
                                                          We could
                                                          consider
                                                          adding a
                                                          conditional
                                                          BFI analysis
                                                          that is a
                                                          wrapper around
                                                          BFI and
                                                          computes BFI
                                                          only if
                                                          profiles are
                                                          available
                                                          (either by
                                                          checking whether the
                                                          module has a
                                                          profile
                                                          summary or by
                                                          depending on
                                                          PSI). With
                                                          this, we
                                                          wouldn’t need
                                                          to
                                                          conditionally
                                                          build pass
                                                          pipelines, and it
                                                          may work for
                                                          the new pass
                                                          manager. But a
                                                          similar approach
                                                          wouldn’t work
                                                          for the old
                                                          pass manager
                                                          because we
                                                          cannot
                                                          conditionally
                                                          depend on an
                                                          analysis under
                                                          it.<br>
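                                                          <br>
                                                          A rough sketch of what such a conditional wrapper could look like, again as a toy model and not LLVM code. ToyFunction, ToyBFI and computeConditionalBFI are hypothetical, and the "BFI result" is reduced to the hottest block count purely for illustration:<br>

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a function with per-block profile counts.
struct ToyFunction {
  bool ModuleHasProfileSummary;
  std::vector<std::uint64_t> BlockCounts;
};

// Hypothetical stand-in for a BFI result; here just the hottest block count.
struct ToyBFI {
  std::uint64_t MaxBlockCount;
};

// Toy "conditional BFI": performs the (pretend-expensive) computation only
// when the module carries a profile summary, otherwise reports no result.
inline std::optional<ToyBFI> computeConditionalBFI(const ToyFunction &F) {
  if (!F.ModuleHasProfileSummary || F.BlockCounts.empty())
    return std::nullopt;
  auto It = std::max_element(F.BlockCounts.begin(), F.BlockCounts.end());
  return ToyBFI{*It};
}
```

                                                          A pass using this shape would always ask for the wrapper and simply skip its profile-guided logic when the result is empty, so the pipeline itself would not need to be built conditionally.<br>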
                                                          </div>
                                                          </div>
                                                        </div>
                                                      </blockquote>
                                                      There is
                                                      LazyBlockFrequencyInfo.<br>
                                                      Not sure how well
                                                      it fits this idea.<br>
                                                    </div>
                                                  </blockquote>
                                                  <div><br>
                                                  </div>
                                                  <div>
                                                    <div><font face="arial,

                                                        helvetica,
                                                        sans-serif">Good
                                                        point.
                                                        LazyBlockFrequencyInfo
                                                        seems usable
                                                        with the old
                                                        pass manager
                                                        (saving
                                                        unnecessary
                                                        BFI/BPI computation) and
                                                        would work for
                                                        function passes.
                                                        I think t</font><span style="font-family:arial,helvetica,sans-serif">he </span><span style="font-family:arial,helvetica,sans-serif">restriction
                                                        still applies - </span><span style="font-family:arial,helvetica,sans-serif">a loop pass still cannot
                                                        request
                                                        (outer-scope)
                                                        BFI, lazy or
                                                        not, new or old
                                                        (pass manager).
                                                        Another
                                                        assumption is
                                                        that </span><span style="font-family:arial,helvetica,sans-serif">it'd be cheap and safe to
                                                        unconditionally
                                                        depend on PSI or
                                                        check the
                                                        module's profile
                                                        summary.</span></div>
                                                  </div>
                                                  <div><br>
                                                  </div>
                                                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                                    <div bgcolor="#FFFFFF">
                                                      <br>
                                                      regards,<br>
                                                        Fedor.<br>
                                                      <br>
                                                      <blockquote type="cite">
                                                        <div dir="ltr">
                                                          <div dir="ltr">
                                                          <div><br>
                                                          <br>
                                                          [1] We cannot
                                                          call
AnalysisManager::getResult for an outer scope but only getCachedResult.
                                                          Probably
                                                          because of
                                                          potential
                                                          pipelining or
                                                          concurrency
                                                          issues.<br>
                                                          [2] For
                                                          example,
                                                          potentially
                                                          breaking up
                                                          multiple
                                                          pipelined loop
                                                          passes and
                                                          inserting
                                                          RequireAnalysisPass&lt;BlockFrequencyAnalysis&gt;
                                                          in front of
                                                          each of them.<br>
                                                          [3] For
                                                          example,
                                                          -fprofile-instr-use
                                                          and
                                                          -fprofile-sample-use
                                                          aren’t present
                                                          in ThinLTO
                                                          post link
                                                          builds.<br>
                                                          [4] For
                                                          example, we
                                                          could check
                                                          whether the
                                                          module has the
                                                          profile
                                                          summary
                                                          metadata
                                                          annotated when
                                                          building pass
                                                          pipelines but
                                                          we don’t
                                                          always pass
                                                          the module
                                                          down to the
                                                          place where we
                                                          build pass
                                                          pipelines.<br>
                                                          [5] By
                                                          inserting
                                                          RequireAnalysisPass&lt;ProfileSummaryInfo&gt;
                                                          after the
                                                          PGOInstrumentationUse
                                                          and the
                                                          SampleProfileLoaderPass
                                                          passes (and
                                                          around the
                                                          PGOIndirectCallPromotion
                                                          pass for the
                                                          ThinLTO post
                                                          link
                                                          pipeline).<br>
                                                          [6] For
                                                          example, the
                                                          context-sensitive
                                                          PGO<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">.</span><br>
                                                          [7] Directly
                                                          calling its
                                                          constructor
                                                          along with the
                                                          dependent
                                                          analysis
                                                          results, e.g.
                                                          the jump
                                                          threading
                                                          pass.</div>
                                                          </div>
                                                        </div>
                                                        <br>
                                                        <pre class="gmail-m_6258983735616516850gmail-m_-1959058298805901267gmail-m_-7727846899844856475gmail-m_-4863800615731490931m_3084483561612773275gmail-m_5869029522365402437moz-quote-pre">_______________________________________________
LLVM Developers mailing list
<a class="gmail-m_6258983735616516850gmail-m_-1959058298805901267gmail-m_-7727846899844856475gmail-m_-4863800615731490931m_3084483561612773275gmail-m_5869029522365402437moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>
<a class="gmail-m_6258983735616516850gmail-m_-1959058298805901267gmail-m_-7727846899844856475gmail-m_-4863800615731490931m_3084483561612773275gmail-m_5869029522365402437moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
                                                      </blockquote>
                                                      <br>
                                                    </div>
                                                  </blockquote>
                                                </div>
                                              </div>
                                            </div>
                                          </div>
                                        </div>
                                      </div>
                                    </div>
                                  </div>
                                </div>
                              </div>
                            </blockquote>
                            <br>
                          </div>
                        </blockquote>
                      </div>
                    </div>
                  </blockquote>
                </div>
              </div>
            </blockquote>
            <br>
          </div>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </div>

</blockquote></div></div></div></div></div></div></div></div></div></div></div></div>