[PATCH] Propagation of profile samples through the CFG.

Fri Jan 10 12:15:32 PST 2014

On Fri, Jan 10, 2014 at 11:59 AM, Diego Novillo <dnovillo at google.com> wrote:

> On Thu, Jan 9, 2014 at 7:35 AM, Chandler Carruth <chandlerc at gmail.com>
> wrote:
>
> > It was not clear to me at first that this is an inversion flag. I'd like
> a better name, but I don't have any particular ideas. I wonder if it would
> be more clear by passing in the domtree pointer to check?
>
> Done. I changed the parameter to pass a DominatorTreeBase pointer. The
> caller passes DT->DT or PDT->DT accordingly.
>
>
> > ================
> > Comment at: lib/Transforms/Scalar/SampleProfile.cpp:563-565
> > @@ +562,5 @@
> > +    BasicBlock *BB2 = *I;
> > +    bool IsDomParent =
> > +        (CheckDT) ? DT->dominates(BB2, BB1) : PDT->dominates(BB2, BB1);
> > +    bool IsInSameLoop = LI->getLoopFor(BB1) == LI->getLoopFor(BB2);
> > +    if (BB1 != BB2 && VisitedBlocks.insert(BB2) && IsDomParent &&
> > ----------------
> > I suspect there is a more direct way to compute this, but I don't want
> to hold up the review thinking about it -- maybe a FIXME?
> >
> > Specifically, I think that we might be able to phrase this purely in
> terms of LoopInfo and DomTree.
>
> Sorry, I'm not following. This is expressed in terms of loop info and
> dominance. What do you want to change?
>

I'm hoping to avoid needing post-dominance information as we currently
don't compute it in the standard optimization pipeline. But again, this is
(IMO) really for a subsequent patch.

>
> > ================
> > Comment at: lib/Transforms/Scalar/SampleProfile.cpp:578-581
> > @@ +577,6 @@
> > +      // members of its equivalence set.
> > +      uint32_t &bb1_weight = BlockWeights[BB1];
> > +      uint32_t &bb2_weight = BlockWeights[BB2];
> > +      if (bb2_weight > bb1_weight)
> > +        bb1_weight = bb2_weight;
> > +    }
> > ----------------
> > bb1_weight = std::max(bb1_weight, bb2_weight);
>
> D'oh! Done.
>
> > Comment at: lib/Transforms/Scalar/SampleProfile.cpp:620
> > @@ +619,3 @@
> > +    // class by making BB2's equivalence class be BB1.
> > +    SmallVector<BasicBlock *, 8> DominatedBBs;
> > +    DT->getDescendants(BB1, DominatedBBs);
> > ----------------
> > You should probably put this outside the loop.
>
> That would be wrong. BB1 changes with every iteration of the loop.
>

Sorry, I meant just the SmallVector so that we clear rather than reallocate.

>
> >
> > ================
> > Comment at: lib/Transforms/Scalar/SampleProfile.cpp:624-625
> > @@ +623,4 @@
> > +
> > +    // Repeat the same logic for all the blocks post-dominated by BB1.
> > +    // We are looking for every basic block BB2 such that:
> > +    //
> > ----------------
> > Why do we need to do this?
> >
> > If we visit every block as BB1, and then visit every block BB1 dominates
> as BB2, we will have considered all pairs of blocks where A dominates B,
> and tested for B post-dominating A. We don't need to consider pairs of
> blocks where B post-dominates A because we will reject the ones where A
> does not dominate B.
>
> Note that we do not necessarily visit every block as BB1. We only
> visit the blocks for which we have equivalence class information. This
> means that initially, only those blocks with samples in them have been
> assigned into an equivalence class.
>
> So, if in a if-then-else diamond, you only have samples inside the
> 'then' basic block, you need to look up from the 'then' to be able to
> propagate into the 'if'. Using just dominance properties for 'then'
> will not allow you to propagate its weight.
>

If you have an if-then-else diamond, you cannot propagate up into the if
block because the then block doesn't post-dominate the if block?

I still see the issue of the continue when you have already found an
equivalence class for the basic block... However due to the symmetry, I
can't yet think of a case where this is necessary.

It's fine to leave it here for now though. I can mull on it later.

>
>
> >
> > ================
> > Comment at: lib/Transforms/Scalar/SampleProfile.cpp:578
> > @@ +577,3 @@
> > +      // members of its equivalence set.
> > +      uint32_t &bb1_weight = BlockWeights[BB1];
> > +      uint32_t &bb2_weight = BlockWeights[BB2];
> > ----------------
> > Chandler Carruth wrote:
> >> bb1_weight = std::max(bb1_weight, bb2_weight);
> > Naming convention here.
>
> OK, though I'm pretty confused. Your line_iterator patch uses this
> convention. And, in fact, I've seen _ used as separator in several
> places. Do we have different conventions for different parts of the
> compiler? The coding guidelines don't seem to say anything about
> having different conventions.
>
> >
> > ================
> > Comment at: lib/Transforms/Scalar/SampleProfile.cpp:940
> > @@ +939,3 @@
> > +///
> > +///    - If all the edges are known except one, and the weight of the
> > +///      block is already known, the weight of the unknown edge will
> > ----------------
> > Incoming edges or outgoing edges? Or both?
>
> Incoming or outgoing. Clarified.
>
> > ================
> > Comment at: lib/Transforms/Scalar/SampleProfile.cpp:937-938
> > @@ +936,4 @@
> > +///
> > +///    - If B has a single predecessor/successor, then the weight
> > +///      of that edge is the weight of the block.
> > +///
> > ----------------
> > How do you handle contradictions?
>
> From user annotations? I would give a warning, if I could. Right now,
> it's a silent override.
>
> > ================
> > Comment at: lib/Transforms/Scalar/SampleProfile.cpp:951-953
> > @@ +950,5 @@
> > +///
> > +/// Since this propagation is not guaranteed to finalize for every CFG,
> we
> > +/// only allow it to proceed for a limited number of iterations
> (controlled
> > +/// by -sample-profile-max-propagate-iterations).
> > +///
> > ----------------
> > This is a really frustrating property. Can it be easily avoided?
>
> Not without changing the algorithm itself.
>
> > The first of the above steps is obviously deterministic.
> >
> > For the second and third, I feel like the following is guaranteed to
> make forward progress and terminate:
> > 0) Add all blocks to a list to process.
> > 1) Take the first block out of the list
> > 2) if we can add a weight to an edge connected to this block (including
> a self referential edge) do so and add the other block the edge connects to
> to the list (if not already in the list)
> > 3) repeat #2 until we cannot add a weight to any edge connected to this
> block
> > 4) if the list is not empty, go to #1
> > We only add a block to the list if we add a weight to an edge, so we can
> at most visit V+E blocks.
>
> Yes, I thought about a similar scheme (in fact, mimicking standard
> value propagation). But this changes the algorithm. Which is something
> I was trying to avoid at first.
>
>
> Diego.
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140110/97bb0a22/attachment.html>